twitter-jsonl-tools

Simple set of Python tools for handling Twitter data in the JSONL (JSON Lines text) format, also called newline-delimited JSON. See http://jsonlines.org for more details on the format.

Dependencies

Tested with Python 3.6, and requiring the following packages, which are available via PIP:

prettytable >= 0.7.2

For faster JSON decoding, the scripts optionally support the use of the UltraJSON package.

Basic Usage: Tweets

The tools used to process tweets expect one or more JSONL files as inputs, where each line contains a JSON-formatted tweet as retrieved from the Twitter API.

To get basic summary statistics for a JSONL file containing tweets:

python jsonl-tweet-stats.py sample/sample-tweets-500.jsonl

To get list of the most frequently-tweeting users for a JSONL file containing tweets:

python jsonl-tweet-authors.py sample/sample-tweets-500.jsonl

To get list of the most frequently-mentioned users for a JSONL file containing tweets:

python jsonl-tweet-mentions.py sample/sample-tweets-500.jsonl

To get list of the most frequently-appearing hashtags for a JSONL file containing tweets:

python jsonl-tweet-hashtags.py sample/sample-tweets-500.jsonl

To export all tweets to a simple CSV (comma-separated) format:

python jsonl-tweet-export.py sample/sample-tweets-500.jsonl -o sample/sample-tweets.csv

To generate a tab-separated CSV file containing pairwise hashtag cooccurrence frequencies for one or more JSONL files:

python jsonl-hashtag-cooccur.py sample/sample-tweets-500.jsonl -o hashtag-cooccurrences.csv

Basic Usage: Users

The tools used to process user data expect one or more JSONL files as inputs, where each line contains a JSON-formatted user profile data as retrieved from the Twitter API.

To export all user metadata to a simple CSV (comma-separated) format:

python jsonl-user-export.py sample/sample-users-50.jsonl -o sample/sample-users.csv

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

twitter-jsonl-tools

Dependencies

Basic Usage: Tweets

Basic Usage: Users

Files

README.md

Latest commit

History

README.md

File metadata and controls

twitter-jsonl-tools

Dependencies

Basic Usage: Tweets

Basic Usage: Users