Email: rrd6@rice.edu
- (Required) Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition, by Aurélien Géron
- (Recommended Supplemental Reading) Machine Learning with Python for Everyone, Addison Wesley Data & Analytics Series, 2020 Pearson Education, by Mark E. Fenner
This data science course covers algorithms from supervised learning, unsupervised learning, and, if time permits, reinforcement learning. We will implement many of the machine learning algorithms from scratch in Python, but will also make use of Scikit-Learn, Keras, and TensorFlow. Topics include, but are not limited to:
Data Science Practices:
- Python Programming
- Jupyter Notebooks
- Visual Studio Code
- Version Control with Git and GitHub
- Data Visualization
Supervised Learning:
- Model Building and Error Analysis
- Linear Regression
- Gradient Descent
- Logistic Regression
- Neural Nets
- Support Vector Machines
- k-Nearest Neighbors
- Decision/Regression Trees
- Ensemble Learning
Unsupervised Learning:
- k-Means Clustering
- Principal Component Analysis
Reinforcement Learning:
- Tabular versus Deep Learning Methods
This course relies heavily on programming, and Python is the language of choice for many data scientists. As such, the course will be taught in Python, version 3.6 or higher.
Each student will build a GitHub repository throughout the semester, and this repository will be the sole basis for grading in this course. The student's repository will have subdirectories containing datasets, descriptive README.md files, and Jupyter notebooks illustrating implementations of the algorithms covered in this course. At the end of the semester the instructor will view each student's GitHub repository and assign course grades according to the following criteria:
- [70 points] Each algorithm covered in the course is successfully implemented on a dataset not used by the instructor during lecture.
- [10 points] Performance and error analysis is conducted on each algorithm.
- [10 points] Jupyter notebooks are clear and clean, with code written in a professional, “Pythonic” manner.
- [10 points] README files are included in each subdirectory, clearly explaining the algorithm being implemented as well as the data being used.
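Before the end-of-semester review, it may help to check the repository layout against the criteria above. The following is a minimal sketch, assuming one top-level subdirectory per algorithm; the function name `check_repository` and the rule of requiring at least one `.ipynb` file next to each `README.md` are illustrative assumptions, not official grading tooling.

```python
from pathlib import Path


def check_repository(root):
    """Return a dict mapping each subdirectory name to a list of missing items.

    Assumption (hypothetical layout): every non-hidden top-level directory
    holds one algorithm and should contain a README.md and at least one
    Jupyter notebook.
    """
    problems = {}
    for subdir in sorted(Path(root).iterdir()):
        # Skip plain files and hidden directories such as .git
        if not subdir.is_dir() or subdir.name.startswith("."):
            continue
        missing = []
        if not (subdir / "README.md").is_file():
            missing.append("README.md")
        if not any(subdir.glob("*.ipynb")):
            missing.append("Jupyter notebook (*.ipynb)")
        if missing:
            problems[subdir.name] = missing
    return problems
```

Calling `check_repository(".")` from the repository root returns an empty dict when every subdirectory has both items. The check covers only the presence of files; it says nothing about the quality of the notebooks or README files.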
Course grades will be assigned in the typical manner: 100-90 points (A), 89-80 points (B), 79-70 points (C), 69-60 points (D), 59-0 points (F).
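The grading scale can be sketched as a small Python function. This is an illustration only: the name `letter_grade` is hypothetical, and placing fractional totals in the lower band (for example, 89.5 maps to B) is an assumption, since the scale above lists whole-point ranges.

```python
def letter_grade(points):
    """Map a point total (0 to 100) to a letter grade using the cutoffs above."""
    if not 0 <= points <= 100:
        raise ValueError("points must be between 0 and 100")
    # Cutoffs from the grading scale, checked from highest to lowest
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if points >= cutoff:
            return letter
    return "F"
```

The four criteria sum to 70 + 10 + 10 + 10 = 100 points, so a repository that earns full marks on implementation and README files but half marks on the other two criteria would total 90 points.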