harisonmg/data-science-with-python

Data Science with Python

Objectives

By the end of the course you should:

  • Be able to perform basic data analysis using Python
  • Have a basic understanding of the machine learning process
  • Know how to share reproducible results from data analysis
  • Be familiar with important data topics, such as data versioning

Prerequisites

  • An appetite for learning
  • Commitment to pursue your learning goals

Course outline

Python fundamentals

  • Data types
  • Operators
  • Variables
  • Introduction to functions
  • Reading and writing files
  • Error handling
  • Imports
  • Iteration
  • Flow control
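
As a rough taste of how several of these topics fit together, here is a minimal sketch that uses only the standard library. The function names (`mean`, `read_numbers`), the file name `scores.txt`, and the sample values are illustrative assumptions, not course material.

```python
import tempfile
from pathlib import Path


def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:  # flow control: guard against empty input
        raise ValueError("values must not be empty")
    return sum(values) / len(values)


def read_numbers(path):
    """Read one number per line, skipping lines that are not numeric."""
    numbers = []
    with open(path, encoding="utf-8") as f:
        for line in f:  # iteration over the lines of the file
            try:
                numbers.append(float(line))
            except ValueError:  # error handling: ignore non-numeric lines
                continue
    return numbers


# Write a small file to a temporary directory, then read it back.
# "scores.txt" and its contents are hypothetical sample data.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "scores.txt"
    path.write_text("3\n4.5\nn/a\n6\n", encoding="utf-8")
    scores = read_numbers(path)

print(scores)        # [3.0, 4.5, 6.0]
print(mean(scores))  # 4.5
```

The `n/a` line is skipped because `float` raises `ValueError` for it, which is one simple way to tolerate messy input.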

Data visualisation

  • Review of graphical EDA techniques
    • Histogram
    • Box plot
    • Scatterplot
    • Line plot
    • Bar plot
  • Reading data from flat files with pandas
  • Automated EDA with pandas-profiling (now ydata-profiling)
  • Creating static plots with seaborn and matplotlib

Data analysis

  • Aggregating data
  • Merging data
  • Data cleaning
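
The three topics above can be sketched together in a few lines of pandas. The tables, column names, and values below are made up for illustration, and dropping rows with missing values is only one of several possible cleaning strategies.

```python
import pandas as pd

# Hypothetical data: a table of orders and a lookup table of customers.
orders = pd.DataFrame(
    {
        "customer_id": [1, 1, 2, 3, 3],
        "amount": [10.0, None, 5.0, 7.5, 2.5],
    }
)
customers = pd.DataFrame(
    {"customer_id": [1, 2, 3], "city": ["Nairobi", "Mombasa", "Nairobi"]}
)

# Data cleaning: drop orders whose amount is missing.
clean = orders.dropna(subset=["amount"])

# Merging: attach each order's city with a left join on customer_id.
merged = clean.merge(customers, on="customer_id", how="left")

# Aggregating: total order amount per city.
totals = merged.groupby("city")["amount"].sum()
print(totals)
```

A left join keeps every cleaned order even if its customer were missing from the lookup table, which tends to make unmatched keys visible rather than silently dropping them.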

Introduction to machine learning

  • The machine learning landscape
  • Supervised learning with tabular data
    • Regression analysis
    • Classification

Miscellaneous

  • Communicating results with Quarto
  • Tracking machine learning experiments with MLflow
  • Version control with Git and GitHub
  • Data versioning with DVC

Capstone project