Stream Tweets in real-time with v2 of the Twitter API

Tony Vu - Sep 22 '20 - Dev Community

Introduction

The Twitter API allows you to stream public Tweets from the platform in real-time so that you can display them and basic metrics about them.

In this tutorial, you will learn how to:

  • Set an objective for your work

  • Plan and prepare to process the needed data

  • Connect and authenticate to the appropriate API endpoint

  • Handle errors and disconnections

  • Display Tweets and basic metrics about them

Prerequisites

  • An approved Twitter developer account with an App created in the developer portal

  • The Bearer Token for that App, which you will use to authenticate your requests

  • cURL or a programming language of your choice for making HTTP requests

Steps to consider

Step 1: Set an objective for your work

First, define what you want to accomplish and decide on the data you need to meet that objective.

  • Staying informed on a topic of interest: For example, you would like to stay current on updates, news, and events about Twitter’s API.
  • Detecting current trends: You have a hard time detecting current trends and would like to extract signals about what people are discussing in the world.

Step 2: Plan and prepare to process the needed data

Staying informed on a topic of interest

For the example of staying informed on a topic of interest, you will need to decide on what type of data you need and how to go about getting it in real time.

Continuing with the example of keeping current with updates to the Twitter API, you will need to make sure you get Tweets from the @TwitterDev and @TwitterAPI accounts the moment they are posted. You will also only want Tweets that contain links so that you can read up on any further context provided with a Tweet.

To get this kind of data in real time, you can use the filtered stream endpoints. The filtered stream endpoints require you to define filtering criteria so that they know which Tweets to send to you.

After defining your filtering criteria, you will need to apply this to the filtered stream endpoints.

Filtering criteria are applied to the filtered stream endpoints in the form of rules. Rules allow you to narrow down to only the Tweets you are looking for by using a set of operators.

While creating a filter, it is often helpful to think about what type of data you don’t want to receive and work backward from there. You will need to make sure you have the right keywords and adjust your rule to prevent unwanted Tweets from entering your stream. For examples of how you can build more complex rules, see our documentation on building a rule.
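For instance (a hypothetical rule, not the one built in this tutorial), negation operators such as -is:retweet let you exclude the Tweets you don’t want, in this case Retweets:

cats has:images -is:retweet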

Based on the filtering criteria just defined, you can create a rule containing the following operators.

  • from:

  • has:links

The “from:” operator matches any Tweet from a specific account and the “has:links” operator matches Tweets containing links in the Tweet body. Together, these operators form the following rule, which instructs the filtered stream endpoints to filter for Tweets from the accounts @TwitterDev or @TwitterAPI that contain links.

(from:twitterdev OR from:twitterapi) has:links

To add this rule, issue a POST request to the filtered stream rules endpoint with a JSON payload containing an array of rules. The example below shows how you can use cURL to do so. To authenticate, replace $BEARER_TOKEN (including the dollar sign) with the Bearer Token from your App in the developer portal.

curl -X POST 'https://api.twitter.com/2/tweets/search/stream/rules' \
  -H "Content-type: application/json" \
  -H "Authorization: Bearer $BEARER_TOKEN" -d \
  '{
    "add": [
      {"value": "(from:twitterdev OR from:twitterapi) has:links"}
    ]
  }'

You can also POST to the rules endpoint using one of our code samples.
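If you prefer working in a programming language, the snippet below is a minimal sketch of the same request using Python and the requests library. It assumes your Bearer Token is available in a BEARER_TOKEN environment variable.

import os
import requests

RULES_URL = "https://api.twitter.com/2/tweets/search/stream/rules"

def add_rules():
    headers = {"Authorization": f"Bearer {os.environ['BEARER_TOKEN']}"}
    payload = {
        "add": [
            {"value": "(from:twitterdev OR from:twitterapi) has:links"}
        ]
    }
    # requests sets the Content-type: application/json header when json= is used
    response = requests.post(RULES_URL, headers=headers, json=payload)
    response.raise_for_status()
    print(response.json())

if __name__ == "__main__":
    add_rules()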

Detecting current trends

For the example of detecting current trends, you will need to consider what kind of data you need and how much data you need in order to perform your analysis. For example, you may need to read and analyze a broad but relatively manageable set of Tweets, and this data should be current or relatively current. While the text of a Tweet will likely be important for your analysis, there are other data elements you will need to consider. Within the text of a Tweet itself, you will also need to consider whether you need hashtags, mentions, or certain keywords.

Beyond its text, a Tweet contains several other fields of data, so you will want to decide which fields would best aid your trend detection needs. Based on the fields you decide on, you can then, for example, perform some basic frequency analysis on keywords, hashtags, mentions, or Tweet annotations made available in a Tweet payload. By default, you will only get back the id and text of each Tweet. If you would like additional data returned about each Tweet, you can add additional fields and expansions to your request URL.
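For example, a request URL like the following (one possible combination of parameters, shown here for the sampled stream endpoint) asks for each Tweet’s creation time, its entities such as hashtags and mentions, and its public metrics, and expands the author ID into a full user object:

https://api.twitter.com/2/tweets/sample/stream?tweet.fields=created_at,entities,public_metrics&expansions=author_id&user.fields=username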

After formulating your data requirements, you will need to figure out how to go about getting this data. To get the data for this analysis, the 1% random sample of public Tweets provided by the sampled stream endpoint can meet this need since it provides a small subset of data relative to the total amount of public Tweets. Additionally, the data is sent to you in real time as it happens, which will meet the requirement of the data being current.

Step 3: Connect and authenticate to the appropriate endpoint

Staying informed on a topic of interest

Once you have defined your data and set your filtering criteria using rules, you can connect to the filtered stream endpoint to start receiving data. The example below shows how you can use cURL to do so. To authenticate, replace $BEARER_TOKEN with the Bearer Token from your App in the developer portal.

curl -X GET -H "Authorization: Bearer $BEARER_TOKEN" "https://api.twitter.com/2/tweets/search/stream"

You can also connect to the filtered stream endpoint using one of our code samples.
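As a rough sketch of what such a code sample might look like, the Python snippet below connects to the filtered stream endpoint and prints each matching Tweet as it arrives. It assumes the requests library and a BEARER_TOKEN environment variable, and it is a minimal illustration rather than production-ready code.

import json
import os
import requests

STREAM_URL = "https://api.twitter.com/2/tweets/search/stream"

def stream_tweets():
    headers = {"Authorization": f"Bearer {os.environ['BEARER_TOKEN']}"}
    # stream=True keeps the HTTP connection open so Tweets arrive as they are posted
    with requests.get(STREAM_URL, headers=headers, stream=True) as response:
        response.raise_for_status()
        for line in response.iter_lines():
            if not line:
                continue  # the endpoint sends keep-alive newlines; skip them
            tweet = json.loads(line)
            # matching_rules tells you which of your rules matched this Tweet
            print(tweet["data"]["id"], tweet["data"]["text"], tweet["matching_rules"])

if __name__ == "__main__":
    stream_tweets()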

Detecting current trends

To connect to the sampled stream endpoint, issue a GET request to the endpoint. The example below shows how you can use cURL to do so. To authenticate, replace $BEARER_TOKEN with the Bearer Token from your App in the developer portal.

curl -X GET -H "Authorization: Bearer $BEARER_TOKEN" "https://api.twitter.com/2/tweets/sample/stream"

You can also connect to the sampled stream endpoint using one of our code samples.

Step 4: Handle errors and disconnections

At any time while you are connected to either the filtered stream endpoint or the sampled stream endpoint, you may be disconnected, either voluntarily or involuntarily. Voluntary disconnections occur when you terminate the connection yourself, whether because your code actively closes the connection or because your network settings terminate it. Involuntary disconnections occur when either streaming endpoint actively disconnects your App from the stream, typically for one of the following reasons.

  • Full Buffer: Your App is not reading the data fast enough, or a network bottleneck is slowing data flow.

  • Too many connections: Your App established too many simultaneous connections to the data stream. When this occurs, the filtered stream endpoint will wait 1 minute, and then disconnect the most recently established connection if the limit is still being exceeded.

  • Server maintenance: The Twitter team deployed a change or update to the system servers.

You should build the proper logic into your code to automatically reconnect in the event any of these involuntary disconnections occur. Additionally, if you only need Tweets for a limited time, you may want to consider building the proper timing into your code to disconnect from the streaming endpoint after your desired period of time has passed.
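One way to build that reconnection logic, sketched here on top of the hypothetical stream_tweets() function from the earlier Python example, is to wrap the connection in a loop that retries with exponential backoff:

import time
import requests

def stream_with_reconnect():
    backoff = 1
    while True:
        try:
            stream_tweets()              # from the earlier sketch; returns or raises on disconnect
            backoff = 1                  # the connection was healthy, so reset the backoff
        except requests.exceptions.RequestException as error:
            print(f"Disconnected ({error}); reconnecting in {backoff} seconds")
            time.sleep(backoff)
            backoff = min(backoff * 2, 60)   # cap the wait so recovery stays reasonably fast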

To avoid full buffer errors due to the high volume of Tweets that you will receive, your code should not do any real processing work as it reads the stream. Read the stream and then hand the activity to another thread or process to do your processing asynchronously. This processing can include performing frequency analysis or any type of heavy processing work.
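A minimal sketch of that pattern in Python, assuming Tweets arrive as raw JSON lines from the stream reader above, pushes each line onto a queue and lets a worker thread do the decoding and analysis:

import json
import queue
import threading
from collections import Counter

raw_lines = queue.Queue()
hashtag_counts = Counter()

def worker():
    while True:
        line = raw_lines.get()
        tweet = json.loads(line)
        # example of heavier work kept off the reading thread: hashtag frequency analysis
        # (counting hashtags assumes the request asked for tweet.fields=entities)
        for tag in tweet.get("data", {}).get("entities", {}).get("hashtags", []):
            hashtag_counts[tag["tag"].lower()] += 1
        raw_lines.task_done()

threading.Thread(target=worker, daemon=True).start()

# Inside the streaming read loop, replace any processing with a single enqueue:
# raw_lines.put(line)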

For an example of how to build in reconnection logic to your application, check out the sample apps below that use the filtered stream endpoints.

Step 5: Display Tweets and basic metrics about them

Once you are reliably receiving your data, you will then need to think about what you would like to do with it. One common option is to display some of the Tweets on a web page or dashboard, along with some basic metrics to describe what’s happening over time.

You should keep in mind that id and text will always be populated since they are the basic building blocks of a Tweet, but other fields may not always be present. For instance, not all Tweets will contain image attachments. You will want to consider which fields contain the data you are looking for, which ones you would like to display, and which may not always be present depending on the nature of the Tweet. Lastly, you will also want to reference and comply with Twitter’s Display Requirements when displaying Tweets. Using embedded Tweets can help meet these requirements.

If you would like to keep a running total of the number of Tweets read over time and display this in a chart, keep in mind that the volume of Tweets can fluctuate based on how much people are Tweeting at a given time. You may want to create a visualization to display these fluctuations over time. Additionally, as you are ingesting and displaying these Tweets, you may want to temporarily save them to a data store and display only the most recent ones, or paginate them, given the number of Tweets that can come in at once and your limited screen real estate. You can also consider displaying Tweets based on their organic performance metrics, such as the ones receiving the most Retweets, replies, or likes.
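As one illustration (a minimal sketch, assuming Tweets are already arriving as dictionaries from your stream reader), you could keep only the most recent Tweets in memory and bucket a running total by minute:

from collections import Counter, deque
from datetime import datetime, timezone

recent_tweets = deque(maxlen=100)   # keep only the newest 100 Tweets for display
tweets_per_minute = Counter()       # running totals, bucketed by minute

def track(tweet):
    recent_tweets.appendleft(tweet)
    minute = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M")
    tweets_per_minute[minute] += 1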

Finally, consider that Tweets can be related to one another either because someone retweets a Tweet or replies to a Tweet. If helpful towards your objective, you may want to display Tweets in a way to indicate this relationship, if any.

Next steps
