- Discover new topics
- Register new workers (nodes)
- Ask workers about load, memory usage, and network usage
- Use the node info to decide where to send a new job
- Supervise workers
- Save configuration describing how to partition the data coming from Kafka. For example, if a message has a date field, messages can be partitioned by hour or by a directory layout such as /date/client/product
- REST API to configure a topic (e.g. MaxMessageInFlight, partitioning strategy)
- Dynamically change configuration: stop a worker and restart it with the new configuration from its current offset
- Should notify when the cluster is under heavy load
- Should launch new instances if needed
- Should shut down instances when possible
- Should be able to register a worker for a specific topic (for example, if a topic is already being sent to S3, we should be able to register another node for the same topic so that worker can save it to Cassandra)
- The master should be able to recover from a crash
- Where to save the configuration? A SQLite database on every master, duplicated?
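The hour/directory partitioning mentioned above could be sketched as follows. This is a minimal illustration, not the project's actual code: the function name and message field names (`date`, `client`, `product`) are hypothetical, with only the /date/client/product layout taken from the example.

```python
from datetime import datetime

def partition_path(message: dict) -> str:
    """Build a directory-style partition path like /date/client/product.

    Field names here are hypothetical; only the layout comes from the
    design notes. The date field gives an hour-level partition.
    """
    ts = datetime.fromisoformat(message["date"])
    date_part = ts.strftime("%Y-%m-%d/%H")  # partition by hour
    return f"/{date_part}/{message['client']}/{message['product']}"

# Example:
# partition_path({"date": "2015-03-01T14:30:00",
#                 "client": "acme", "product": "widget"})
# -> "/2015-03-01/14/acme/widget"
```

The master would store a per-topic rule like this and hand it to whichever worker consumes the topic.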
- Should be able to receive the messages from a topic and save them to S3 (future: Cassandra, Elasticsearch)
- Should receive from the master how to partition the data and how to commit
- Should be able to tell the master that the lag is getting higher, so the master can try to set up a new node
- Order files by some attribute
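The lag check a worker might run before alerting the master could look like this. A minimal sketch: the function name and the threshold are assumptions, not the project's actual API.

```python
def should_alert_master(latest_offset: int, committed_offset: int,
                        max_lag: int = 10_000) -> bool:
    """Return True when consumer lag exceeds a threshold, signalling
    the master to consider spinning up a new node.

    Lag is the distance between the newest offset in the partition and
    the worker's last committed offset. The 10,000 default is arbitrary.
    """
    lag = latest_offset - committed_offset
    return lag > max_lag
```

In practice the worker would run this periodically per partition and send the master a message (rather than a boolean) so the master can weigh lag across the whole cluster.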
- Secor
- Bifrost
- Camus
Version 0.1 -> master finds topics, registers workers, asks workers to process a topic, saves files of size X, compresses them, and sends them to S3
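The version 0.1 flow (buffer messages up to size X, compress, ship to S3) could be sketched like this. The uploader is stubbed out as a callback and every name is hypothetical; a real implementation would pass in an S3 client.

```python
import gzip
from typing import Callable, List

class FileBuffer:
    """Accumulate messages until a byte threshold, then gzip the batch
    and hand the compressed blob to an upload callback (e.g. S3)."""

    def __init__(self, max_bytes: int, upload: Callable[[bytes], None]):
        self.max_bytes = max_bytes
        self.upload = upload
        self.chunks: List[bytes] = []
        self.size = 0

    def add(self, message: bytes) -> None:
        self.chunks.append(message)
        self.size += len(message)
        if self.size >= self.max_bytes:  # reached size X: flush
            self.flush()

    def flush(self) -> None:
        if not self.chunks:
            return
        blob = gzip.compress(b"\n".join(self.chunks))
        self.upload(blob)
        self.chunks, self.size = [], 0
```

Committing the Kafka offset only after a successful flush would keep the pipeline at-least-once: a crash between flushes replays the uncommitted messages.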
- start Master
- start Worker
- start Singleton
In SBT, just run docker:publishLocal to build a local Docker image.
To launch the first node, which will be the seed node:
$ docker run -i -t --rm --name seed kuhnen/processor:0.1
To add a member to the cluster:
$ docker run --rm --name c1 --link seed:seed -i -t kuhnen/processor:0.1