Capstone project work for RecyThing, Data Engineer division

RECYTHNG/recything-de

Project Name

ETL Pipelines and Data Visualization for RecyThing user activity and environmental data

About Project

The RecyThing project optimizes and visualizes user activity and environmental data through an ETL (Extract, Transform, Load) process. Data is extracted from the source databases, transformed into a consistent, analyzable format, and loaded into a centralized database so that it stays accurate and accessible. The processed data then feeds in-depth visualizations of user behavior and environmental data that support sustainable practices and informed decision-making.
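The extract, transform, and load stages described above can be sketched in a few lines of pandas. This is a minimal illustration and not the project's actual pipeline: the sample rows, the column names, the `user_activity` table name, and the in-memory `warehouse` dictionary are all assumptions that stand in for the real source database and load target.

```python
import pandas as pd


def extract() -> pd.DataFrame:
    """Stand-in for a database query; the rows are made-up sample data."""
    return pd.DataFrame(
        {
            "User ID": [1, 2, 2, 3],
            "Activity": ["report", "Recycle ", "Recycle ", None],
            "Created At": ["2024-05-01", "2024-05-02", "2024-05-02", "2024-05-03"],
        }
    )


def transform(raw: pd.DataFrame) -> pd.DataFrame:
    """Bring the raw rows into a consistent, analyzable shape."""
    df = raw.copy()
    # Use snake_case column names.
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    # Trim whitespace and lowercase the free-text activity values.
    df["activity"] = df["activity"].str.strip().str.lower()
    # Parse the date strings into real timestamps.
    df["created_at"] = pd.to_datetime(df["created_at"])
    # Drop rows without an activity, then drop exact duplicates.
    df = df.dropna(subset=["activity"]).drop_duplicates()
    return df.reset_index(drop=True)


def load(df: pd.DataFrame, target: dict, table: str) -> int:
    """Store the frame in an in-memory stand-in for the central database."""
    target[table] = df
    return len(df)


warehouse = {}
rows_loaded = load(transform(extract()), warehouse, "user_activity")
print(f"Loaded {rows_loaded} rows into user_activity")
```

Keeping each stage as its own function makes it possible to test the transform step on small sample frames before pointing the pipeline at a real database.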

Tech Stack

Tools:

  • Visual Studio Code
  • Jupyter Notebook
  • Python
  • GitHub
  • Google Cloud Storage
  • BigQuery
  • Looker Studio
  • draw.io
  • Figma
  • Apache Airflow

Libraries (Python imports used by the pipeline):

  • import os
  • import pandas as pd
  • from dotenv import load_dotenv
  • import mysql.connector
  • from google.cloud import storage
  • from google.cloud import bigquery
  • from google.cloud.exceptions import NotFound
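Before running the pipeline, it can help to confirm that the third-party modules in the list above are importable. The sketch below uses only the standard library. The helper name `missing_modules` is hypothetical, and the package names are the PyPI distributions that normally provide these modules, not a list taken from this repository.

```python
from importlib.util import find_spec

# Module names from the import list above, mapped to the PyPI
# distribution that normally provides each one.
MODULE_TO_PACKAGE = {
    "pandas": "pandas",
    "dotenv": "python-dotenv",
    "mysql.connector": "mysql-connector-python",
    "google.cloud.storage": "google-cloud-storage",
    "google.cloud.bigquery": "google-cloud-bigquery",
}


def missing_modules(modules):
    """Return the module names that cannot be found in this environment."""
    missing = []
    for name in modules:
        try:
            found = find_spec(name) is not None
        except (ImportError, ValueError):
            # A dotted name whose parent package is absent raises
            # ModuleNotFoundError instead of returning None.
            found = False
        if not found:
            missing.append(name)
    return missing


for name in missing_modules(MODULE_TO_PACKAGE):
    print(f"{name} not found; try: pip install {MODULE_TO_PACKAGE[name]}")
```

The script prints nothing when every module is available, so it can be run as a quick pre-flight check after installing dependencies.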

Architecture Diagram

[Image: Architecture Diagram]

Schema Data Warehouse

[Image: Schema Data Warehouse]

Dashboard Visualization

[Images: Dashboard Visualization Data Engineer]

Setup

  • Clone the project repository from GitHub.
  • Enter the project directory and install all required dependencies.
  • Edit the configuration file (a .env file) to match your local settings, including the database configuration and the load target.
  • Run the ETL pipelines.
  • Monitor the ETL process while it runs and handle any errors or issues that arise.
  • Verify that the extracted, transformed, and loaded data matches expectations by running the appropriate tests.
  • Optimize or tune further to improve performance or reliability (for example, automation with Airflow).
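The configuration step can be made safer by checking the settings before the pipeline starts. The sketch below is one possible approach, not the repository's own code: the variable names (`DB_HOST`, `GCS_BUCKET`, and so on) and the helper `read_settings` are hypothetical, since this README does not list the project's actual .env keys.

```python
import os
from typing import Mapping

# Hypothetical setting names; replace them with the keys your .env file uses.
REQUIRED_KEYS = (
    "DB_HOST",
    "DB_USER",
    "DB_PASSWORD",
    "DB_NAME",
    "GCS_BUCKET",
    "BQ_DATASET",
)


def read_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Return the required settings, failing early if any is missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError(f"Missing settings in .env: {', '.join(missing)}")
    return {key: env[key] for key in REQUIRED_KEYS}


# Example with an explicit mapping. In the real pipeline you would call
# dotenv.load_dotenv() first so that os.environ is populated from .env,
# then call read_settings() with no arguments.
sample_env = {
    "DB_HOST": "localhost",
    "DB_USER": "etl_user",
    "DB_PASSWORD": "change-me",
    "DB_NAME": "recything",
    "GCS_BUCKET": "example-bucket",
    "BQ_DATASET": "example_dataset",
}
settings = read_settings(sample_env)
print(sorted(settings))
```

Failing at startup with the names of the missing keys is usually easier to debug than a connection error partway through an extract or load.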
