📊 data-streams 📊

Publicly available real-time data sets on Kafka, Redpanda, RabbitMQ & Apache Pulsar

💬 About

This project is a starting point for analyzing real-time streaming data. We have prepared a few datasets that can be streamed via Kafka, Redpanda, RabbitMQ, and Apache Pulsar. Right now, you can clone or fork the repo and start the services locally; we plan to add publicly available clusters that you can connect to directly.

📂 Datasets

Currently available datasets:

github,
art-blocks,
movielens and
amazon-books.

⏩ How to start the streams?

From the repository root folder, run:

python3 start.py --platforms <PLATFORMS> --dataset <DATASET>

The argument <PLATFORMS> can be:

kafka,
redpanda,
rabbitmq and/or
pulsar.

The argument <DATASET> can be:

github,
art-blocks,
movielens or
amazon-books.

The script starts the chosen streaming platforms in Docker containers, and you will see messages from the chosen dataset being consumed.
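To make the accepted values concrete, here is a minimal sketch of an argument parser that mirrors the two documented flags. It is not the code in start.py; the `build_parser` function is hypothetical, and the assumption that several platforms are passed as space-separated values is not confirmed by this README.

```python
import argparse

# Allowed values, as listed in this README.
PLATFORMS = ["kafka", "redpanda", "rabbitmq", "pulsar"]
DATASETS = ["github", "art-blocks", "movielens", "amazon-books"]


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not start.py itself)."""
    parser = argparse.ArgumentParser(description="Validate data-streams arguments.")
    # Assumption: multiple platforms are given as space-separated values.
    parser.add_argument("--platforms", nargs="+", choices=PLATFORMS, required=True)
    # Exactly one dataset per run.
    parser.add_argument("--dataset", choices=DATASETS, required=True)
    return parser


# Parse an example command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--platforms", "kafka", "redpanda", "--dataset", "github"]
)
print(args.platforms, args.dataset)
```

With `choices` set, argparse rejects any platform or dataset name outside the lists above and exits with a usage message.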

You can then connect with Memgraph and stream the data into the database by running:

docker-compose up <DATASET>-memgraph

For example, if you choose Kafka as a streaming platform and art-blocks for your dataset, you should run:

python3 start.py --platforms kafka --dataset art-blocks

If you are a Windows user and the command above doesn't work, try replacing python3 with python.

Next, in a new terminal window, run:

docker-compose up art-blocks-memgraph
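If you want to script the two steps, the commands above can be assembled programmatically. The following is a rough sketch only: the helper names `python_command`, `stream_command`, and `memgraph_command` are hypothetical, and passing several platforms as separate space-separated values is an assumption. It only builds the command lines and does not start anything.

```python
import shutil
from typing import List, Sequence

# Dataset names, as listed in this README.
DATASETS = ("github", "art-blocks", "movielens", "amazon-books")


def python_command() -> str:
    """Use python3 when it is on PATH, otherwise fall back to python."""
    return "python3" if shutil.which("python3") else "python"


def stream_command(platforms: Sequence[str], dataset: str) -> List[str]:
    """Build the start.py invocation for the given platforms and dataset."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset}")
    # Assumption: multiple platforms are passed as space-separated values.
    return [python_command(), "start.py", "--platforms", *platforms,
            "--dataset", dataset]


def memgraph_command(dataset: str) -> List[str]:
    """Build the docker-compose invocation for the <DATASET>-memgraph service."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset}")
    return ["docker-compose", "up", f"{dataset}-memgraph"]


print(" ".join(stream_command(["kafka"], "art-blocks")))
print(" ".join(memgraph_command("art-blocks")))
```

Each returned list can be handed to `subprocess.run(..., check=True)` from the repository root, with the second command run in a separate terminal or process once the streams are up.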

📜 References

There's no documentation yet, but it's coming soon! Throw us a star to keep up with upcoming changes.
