End-to-End, Real-time, Advanced Analytics Big Data Reference Pipeline using Spark, Spark SQL, Spark ML, GraphX, Spark Streaming, Kafka, NiFi, Cassandra, ElasticSearch, Redis, Tachyon, HDFS, Zeppelin, iPython/Jupyter Notebook, Tableau, Twitter Algebird. See https://github.com/fluxcapacitor/pipeline/wiki for Setup Instructions.

MakoXiao/pipeline

 
 

Docker-based, End-to-End, Big Data Reference Pipeline!

Real-time, Advanced Analytics, Machine Learning, Streaming, Graph Processing, Text/NLP Analytics

Follow the wiki sidebar to set up the environment: https://github.com/fluxcapacitor/pipeline/wiki

Apache Zeppelin Notebooks

Jupyter/iPython Notebooks

Apache NiFi Flows

Tableau Integration

Beeline Command-line Hive Client

Log Visualization with Kibana & Logstash

Spark, Spark Streaming, and Spark SQL Admin UIs

Spark Admin UI (screenshots)

Ganglia System and JVM Metrics Monitoring UIs

Ganglia Metrics UI (screenshots)

Architecture Overview

Big Data Pipeline Overview

Tools Overview

Apache Spark, Redis, Apache Cassandra, Apache Kafka, NiFi, ElasticSearch, Logstash, Kibana, Apache Zeppelin, Ganglia, Hadoop HDFS, iPython Notebook, Docker, Tachyon

Languages

  • Jupyter Notebook 83.6%
  • Python 5.8%
  • Shell 2.9%
  • Scala 2.1%
  • C++ 1.5%
  • Java 1.5%
  • Other 2.6%