This is a collection of ML projects I have worked on. Topics include:
- Regression: Linear Regression, Lasso, Ridge
- Time Series: ARIMA, SARIMA (Box-Jenkins Approach), Exponential Smoothing (Holt-Winters Approach), VAR, VARX (Vector Autoregression Approach), etc.
- Recommender System: collaborative filtering and matrix factorization
- NLP: embeddings, TF-IDF, etc.
- XGBoost and Random Forest implementations
- Neural Network: Image classification
Use Random Forest with feature engineering to predict click-through rate (CTR) on the Avazu data. The final model reaches a log loss of about 0.4.
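For readers who have not met the metric, here is a minimal sketch of how binary log loss is computed. It is not the project's evaluation code; the labels and probabilities below are made-up values for illustration only.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy. Probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical labels (1 = click) and predicted click probabilities.
clicks = [0, 0, 1, 0, 1]
preds = [0.1, 0.2, 0.7, 0.3, 0.4]

score = log_loss(clicks, preds)
# Baseline: always predicting 0.5 gives a log loss of ln(2), about 0.693.
baseline = log_loss(clicks, [0.5] * len(clicks))
print(f"model: {score:.4f}  baseline: {baseline:.4f}")
```

Lower is better, and a model that always outputs 0.5 scores ln(2), so a log loss around 0.4 is better than that uninformed constant guess.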
Use AdaBoost and XGBoost to predict whether an email is spam, with 97% accuracy.
PyTorch: implement simple 2-layer and 3-layer neural networks on the MNIST dataset to predict handwritten digits, with an accuracy of 98.29%.
Use GloVe embeddings on an IMDB dataset of 50,000 movie reviews to predict whether a review is positive or negative from its content. XGBoost achieves 86.7% accuracy.
Use linear regression to predict house prices in Ames, Iowa. Compare OLS, Ridge, Lasso and Elastic Net regression models and generate a business report.
Replicate the recommendation system of a blog-based website: recommend 5 articles based on what the user is currently reading, using word2vec on the data.
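One common way to turn word vectors into article recommendations is to average each article's word vectors and rank the other articles by cosine similarity. The sketch below assumes that approach, which may differ from what the notebook does. The 2-dimensional vectors, the article titles, and the helper names (`mean_vector`, `cosine`, `recommend`) are all hypothetical stand-ins for a trained word2vec model and real data.

```python
import math

def mean_vector(tokens, word_vecs):
    """Average the vectors of the tokens that have an embedding."""
    dim = len(next(iter(word_vecs.values())))
    vecs = [word_vecs[t] for t in tokens if t in word_vecs]
    if not vecs:
        return [0.0] * dim
    return [sum(v[d] for v in vecs) / len(vecs) for d in range(dim)]

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def recommend(current, articles, word_vecs, n=5):
    """Return up to n article titles most similar to the current one."""
    target = mean_vector(articles[current], word_vecs)
    scored = [
        (cosine(target, mean_vector(tokens, word_vecs)), title)
        for title, tokens in articles.items()
        if title != current
    ]
    scored.sort(reverse=True)
    return [title for _, title in scored[:n]]

# Toy 2-d "embeddings" (hypothetical; real word2vec vectors have 100+ dims).
word_vecs = {
    "python": [1.0, 0.1], "pandas": [0.9, 0.2], "model": [0.8, 0.3],
    "recipe": [0.1, 1.0], "pasta": [0.0, 0.9],
}
articles = {
    "intro-to-pandas": ["python", "pandas"],
    "training-a-model": ["python", "model"],
    "weeknight-pasta": ["recipe", "pasta"],
}
top = recommend("intro-to-pandas", articles, word_vecs, n=2)
print(top)
```

Averaging throws away word order, but it is cheap and usually good enough for "more like this" suggestions.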
Build a movie rating recommendation system from scratch using collaborative filtering with matrix factorization.
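To give a flavour of the idea, here is a minimal pure-Python sketch of matrix factorization trained with stochastic gradient descent. It is an illustration under my own assumptions (toy ratings, hypothetical function names and hyperparameters), not the code in the notebook.

```python
import math
import random

def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item latent factors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def rmse(ratings, P, Q):
    """Root mean squared error over the observed (user, item, rating) triples."""
    se = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(se / len(ratings))

def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.02,
              epochs=2000, seed=0):
    """Learn user factors P and item factors Q by SGD on squared error
    with L2 regularization. All hyperparameters are illustrative."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

# Toy data: (user index, movie index, rating). Values are made up.
ratings = [
    (0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1),
    (2, 1, 1), (2, 2, 5), (3, 0, 1), (3, 2, 4),
]
P0, Q0 = factorize(ratings, n_users=4, n_items=3, epochs=0)
P, Q = factorize(ratings, n_users=4, n_items=3)
initial_error = rmse(ratings, P0, Q0)
trained_error = rmse(ratings, P, Q)
print(f"RMSE before: {initial_error:.3f}  after: {trained_error:.3f}")
```

Once trained, `predict(P, Q, u, i)` also gives a score for user/movie pairs that were never rated, which is what drives the recommendations.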
I know, I know, I know. You've seen this project a million times. As an ML student, I just had to do it like everyone else :)
Part of the code is modified from Prof. Yannet Interian's USF Advanced ML class.
Part of the code is modified from Prof. Terence Parr's USF Data Acquisition class.