During this workshop, we will dive into Advanced Pandas and explore:
- Date & time (converting to datetime, error handling, data analysis)
- Loading data (schemas, encoding, performance)
- Group By: split-apply-combine operations
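As a small taste of the three topics above, here is a minimal sketch. The sample data, the column names (`city`, `date`, `visitors`) and the variable names are made up for illustration and are not taken from the workshop notebooks.

```python
import io

import pandas as pd

# Hypothetical sample data, not taken from the workshop notebooks.
raw = io.StringIO(
    "city,date,visitors\n"
    "Amsterdam,2019-11-01,120\n"
    "Amsterdam,2019-11-02,80\n"
    "Utrecht,2019-11-01,50\n"
    "Utrecht,not a date,70\n"
)

# Loading data: declare the column types up front instead of relying on inference.
df = pd.read_csv(raw, dtype={"city": str, "date": str, "visitors": "int64"})

# Date & time: values that do not match the format become NaT instead of raising.
df["date"] = pd.to_datetime(df["date"], format="%Y-%m-%d", errors="coerce")

# Group By: split the rows by city, apply a sum, combine into one Series.
visitors_per_city = df.groupby("city")["visitors"].sum()

print(df)
print(visitors_per_city)
```

Using `errors="coerce"` keeps the bad row in the table with a missing date, so it still counts towards the per-city totals. Whether that is the right choice depends on the analysis.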
Requirements:
- Python 3.5+
- pandas 0.25.3
- jupyter lab 1.2.3
- pyenv (optional)
Using a virtual environment:
- Download and install pyenv
- Clone the repository
- pyenv install 3.6.8 (or any compatible Python version)
- pyenv shell 3.6.8 (select which Python version to use)
- python -m venv venv (create a virtual environment)
- source venv/bin/activate (activate the virtual env)
- pip install pandas==0.25.3 jupyterlab==1.2.3
- Start JupyterLab (jupyter lab) and navigate to the workshop folder
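Once the steps above are done, a quick check like the following can confirm that the environment matches the listed requirements. This is a minimal sketch: the helper names `installed_version` and `check_environment` are hypothetical, and it assumes both packages expose a `__version__` attribute.

```python
import importlib
import sys

# Versions listed in the requirements of this workshop.
EXPECTED = {"pandas": "0.25.3", "jupyterlab": "1.2.3"}


def installed_version(module_name):
    """Return the module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


def check_environment(expected=EXPECTED, min_python=(3, 5)):
    """Return a list of human-readable problems; an empty list means all good."""
    problems = []
    if sys.version_info < min_python:
        problems.append(
            "Python {}.{}+ is required".format(min_python[0], min_python[1])
        )
    for name, wanted in expected.items():
        found = installed_version(name)
        if found is None:
            problems.append("{} is not installed".format(name))
        elif found != wanted:
            problems.append("{} is {}, expected {}".format(name, found, wanted))
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Run it with the virtual environment activated. No output means the versions match. The sketch uses `str.format` rather than f-strings so that it also runs on Python 3.5.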
Alternative using the online notebook service Google Colab:
- Requires a Google account
- Go to https://colab.research.google.com/
- Click on File
- Click Open a notebook
- Under the GitHub tab, enter this URL: https://github.com/pyladiesams/Pandas-advanced-nov2019
- Press Enter; this will sync and show all available notebooks
- Choose workshop/X.ipynb
- Cloning the repo and installing the requirements are done through the notebook. The first time, it will take a while to spin up a machine.
Important: these steps need to be performed for every notebook, since Colab spins up a fresh machine every time. If you want to save your work, you can either save it to your Google Drive or right-click and download it to your local machine.
This workshop was set up by @pyladiesams, Cheryl Zandvliet (https://github.com/verycherry) and Cindy Cressot (https://github.com/cindy-cressot)