Learn to design data models, build data warehouses and data lakes, automate data pipelines, and work with massive datasets.
- Create user-friendly relational and NoSQL data models
- Create scalable and efficient data warehouses
- Work efficiently with massive datasets
- Build and interact with a cloud-based data lake
- Automate and monitor data pipelines
- Develop proficiency in Spark, Airflow, and AWS tools
Learn to create relational and NoSQL data models to fit the diverse needs of data consumers. Use ETL to build databases in PostgreSQL and Apache Cassandra (a minimal ETL sketch follows the lesson list below).
- Introduction to Data Modeling
- Relational Data Models
- NoSQL Data Models
- Data Modeling with Postgres
- Data Modeling with Apache Cassandra
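A minimal sketch of the relational side of this ETL workflow, assuming a local PostgreSQL instance; the connection string, `songs` table, and sample record are illustrative assumptions, not the course's actual project schema. The Cassandra flow is analogous through the `cassandra-driver` package.

```python
# Hypothetical ETL sketch: create a star-schema dimension table and load one row.
import psycopg2

# Connection details are placeholders for a local PostgreSQL instance.
conn = psycopg2.connect("host=127.0.0.1 dbname=example_db user=student password=student")
cur = conn.cursor()

# Create a simple dimension table (star-schema style).
cur.execute("""
    CREATE TABLE IF NOT EXISTS songs (
        song_id  TEXT PRIMARY KEY,
        title    TEXT NOT NULL,
        artist   TEXT,
        year     INT,
        duration NUMERIC
    )
""")

# Transform step: in practice you would parse JSON/CSV source files here.
record = ("SOABC123", "Example Song", "Example Artist", 2020, 215.5)

# Load step: a parameterized insert avoids SQL injection.
cur.execute(
    "INSERT INTO songs (song_id, title, artist, year, duration) "
    "VALUES (%s, %s, %s, %s, %s) ON CONFLICT (song_id) DO NOTHING",
    record,
)

conn.commit()
cur.close()
conn.close()
```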
Learn to create cloud-based data warehouses. Sharpen your data warehousing skills, deepen your understanding of data infrastructure, and be introduced to data engineering on the cloud using Amazon Web Services (AWS); a short loading sketch follows the lesson list below.
- Introduction to Data Warehouses
- Introduction to the Cloud with AWS
- Implementing Data Warehouses on AWS
- Build a Cloud Data Warehouse
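A common pattern for implementing a warehouse on AWS is bulk-loading S3 files into Amazon Redshift with a `COPY` command. The sketch below assumes this pattern; the cluster endpoint, credentials, bucket path, and IAM role ARN are placeholders to replace with your own.

```python
# Hypothetical sketch: stage S3 log data into a Redshift table via COPY.
import psycopg2

# Redshift speaks the PostgreSQL wire protocol; all values below are placeholders.
conn = psycopg2.connect(
    host="examplecluster.abc123.us-west-2.redshift.amazonaws.com",
    port=5439,
    dbname="dev",
    user="awsuser",
    password="example-password",
)
cur = conn.cursor()

cur.execute("""
    CREATE TABLE IF NOT EXISTS staging_events (
        user_id INT,
        song    VARCHAR,
        ts      BIGINT
    )
""")

# COPY loads files from S3 in parallel across the cluster's slices,
# which is far faster than row-by-row INSERTs.
cur.execute("""
    COPY staging_events
    FROM 's3://example-bucket/log_data/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/myRedshiftRole'
    FORMAT AS JSON 'auto'
""")

conn.commit()
conn.close()
```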
Learn more about the big data ecosystem and how to use Spark to work with massive datasets. Learn how to store big data in a data lake and query it with Spark (see the PySpark sketch after the lesson list below).
- The Power of Spark
- Data Wrangling with Spark
- Debugging and Optimization
- Introduction to Data Lakes
- Build a Data Lake
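The core data-lake workflow in Spark is reading raw files from object storage, reshaping them, and writing them back as columnar, partitioned Parquet. A minimal sketch, assuming hypothetical S3 bucket paths and column names:

```python
# Hypothetical PySpark sketch: JSON in, partitioned Parquet out.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("data_lake_sketch").getOrCreate()

# Extract: Spark infers the schema from the JSON files.
songs = spark.read.json("s3a://example-bucket/raw/song_data/*.json")

# Transform: keep only the columns downstream consumers need.
songs_table = songs.select("song_id", "title", "artist_id", "year", "duration")

# Load: columnar Parquet, partitioned so queries can prune by year.
songs_table.write.mode("overwrite") \
    .partitionBy("year") \
    .parquet("s3a://example-bucket/lake/songs/")

spark.stop()
```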
Learn to schedule, automate, and monitor data pipelines using Apache Airflow. Learn to run data quality checks, track data lineage, and work with data pipelines in production (a minimal DAG sketch follows the lesson list below).
- Data Pipelines
- Data Quality
- Production Data Pipelines
- Data Pipelines with Airflow
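A minimal Airflow 2.x sketch of a scheduled data quality check: a daily DAG that fails its run when a target table comes up empty. The connection id (`warehouse`) and table name (`songs`) are illustrative assumptions.

```python
# Hypothetical DAG: fail loudly when a load produced zero rows.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.postgres.hooks.postgres import PostgresHook


def check_row_count(table: str) -> None:
    """Raise (failing the task, so Airflow can alert) if the table is empty."""
    hook = PostgresHook(postgres_conn_id="warehouse")
    count = hook.get_first(f"SELECT COUNT(*) FROM {table}")[0]
    if count == 0:
        raise ValueError(f"Data quality check failed: {table} is empty")


with DAG(
    dag_id="quality_check_sketch",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    quality_check = PythonOperator(
        task_id="check_songs_not_empty",
        python_callable=check_row_count,
        op_kwargs={"table": "songs"},
    )
```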
Combine all the skills learned throughout the program to build your own data engineering portfolio project.
- Data Engineer Capstone
Thanks to Amreesh for this learning experience.