A few projects related to data engineering, including data modeling, cloud infrastructure setup, data warehousing, and data lake development.
Updated Aug 26, 2022 - Python
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
One framework to develop, deploy and operate data workflows with Python and SQL.
Data pipeline performing ETL to AWS Redshift using Spark, orchestrated with Apache Airflow
Project demonstrating how to automate Prefect 2.0 deployments to AWS ECS Fargate
Code examples showing flow deployment to various types of infrastructure
Classwork projects and homework completed for the Udacity Data Engineering Nanodegree
Let your pipe lines flow thru the Python code in xonsh.
Deploy a Prefect flow to serverless AWS Lambda function
Data Engineering Project with Hadoop HDFS and Kafka
Apache Spark Guide
An end-to-end real-time stock market data pipeline with Python, AWS EC2, Apache Kafka, and Cassandra. Data is processed on AWS EC2 with Apache Kafka and stored in a local Cassandra database.
Analysis of 311 Service Requests for the City of NYC (from 2010 to 2023). Tech: Prefect Cloud, dbt Core, BigQuery, Compute Engine, Cloud Run, Artifact Registry, Terraform, Docker
💜🌈📊 A Data Engineering Project that implements an ETL data pipeline using Dagster, Apache Spark, Streamlit, MinIO, Metabase, dbt, Polars, and Docker. Data comes from Kaggle and the YouTube API 🌺
Learning from multiple companies in Silicon Valley: Netflix, Facebook, Google, and startups
ETL pipeline combined with supervised learning and grid search to classify text messages sent during a disaster event
Challenge to job: Data Scientist
An end-to-end Twitter Data Pipeline that extracts data from Twitter and loads it into AWS S3.