beimingliu/AdvancedMachineLearning

Advanced Machine Learning Related Projects

This is a collection of ML projects that I completed. Topics include:

  • Regression: Linear Regression, Lasso, Ridge
  • Time Series: ARIMA and SARIMA (Box-Jenkins approach), Exponential Smoothing (Holt-Winters approach), VAR and VARX (vector autoregression approach), etc.
  • Recommender System: collaborative filtering and matrix factorization
  • NLP: embeddings, TF-IDF, etc.
  • XGBoost, Random Forest implementation
  • Neural Network: Image classification

Use Random Forest with feature engineering to predict click-through rate (CTR) on the Avazu data. The final model reaches a log loss of ≈ 0.4.
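For readers unfamiliar with the metric, here is a minimal sketch of how binary log loss is computed. It is not the project's evaluation code; the function name `log_loss` and the clipping constant `eps` are assumptions made for illustration.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities.

    `eps` is an illustrative clipping constant that keeps log() finite
    when a predicted probability is exactly 0 or 1.
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip into (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Toy labels (1 = click, 0 = no click); these are made-up values, not Avazu data.
labels = [1, 0, 0, 0]
confident = [0.9, 0.1, 0.1, 0.1]
uninformed = [0.5, 0.5, 0.5, 0.5]
print(log_loss(labels, confident), log_loss(labels, uninformed))
```

Lower is better: a model that always predicts 0.5 scores ln 2 ≈ 0.693, so a useful CTR model has to come in below that.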

Use AdaBoost and XGBoost to predict whether an email is spam, with 97% accuracy.

PyTorch: implement simple 2-layer and 3-layer neural networks on the MNIST dataset to predict handwritten digits, with an accuracy of 98.29%.

Use GloVe embeddings on a dataset of 50,000 IMDB movie reviews. Predict whether a review is positive or negative from its content: XGBoost achieves 86.7% accuracy.

Use Linear Regression to predict house prices in Ames, Iowa. Compare OLS, Ridge, Lasso, and Elastic Net regression models and generate a business report.

Replicate the recommendation system of a blog-based website: recommend 5 articles based on what the user is currently reading, using word2vec on the data.
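One common way to turn word vectors into article recommendations is to average the vectors of each article's words and rank the other articles by cosine similarity. The sketch below assumes that approach; the project may do it differently. The toy 2-dimensional `word_vectors`, the article ids, and the helper names are all hypothetical.

```python
import math

def average_vector(tokens, word_vectors):
    """Average the vectors of the tokens that have an embedding (None if none do)."""
    vecs = [word_vectors[t] for t in tokens if t in word_vectors]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def recommend(current_id, article_vectors, k=5):
    """Return the ids of the k articles most similar to the one being read."""
    query = article_vectors[current_id]
    scored = [(cosine(query, vec), aid)
              for aid, vec in article_vectors.items() if aid != current_id]
    scored.sort(reverse=True)
    return [aid for _, aid in scored[:k]]

# Hypothetical embeddings; real word2vec vectors have hundreds of dimensions.
word_vectors = {"python": [1.0, 0.0], "pandas": [0.9, 0.1],
                "soccer": [0.0, 1.0], "goal": [0.1, 0.9]}
articles = {"a": ["python", "pandas"], "b": ["python"], "c": ["soccer", "goal"]}
article_vectors = {aid: average_vector(toks, word_vectors)
                   for aid, toks in articles.items()}
print(recommend("a", article_vectors, k=2))
```

With real data, `k=5` would return the five nearest articles, matching the description above.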

Build a movie rating recommendation system from scratch using collaborative filtering with matrix factorization.
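As an illustration of the idea, here is a minimal sketch of matrix factorization trained with stochastic gradient descent on squared error. It is not the project's implementation; the function names, hyperparameters, and toy ratings are all assumptions.

```python
import random

def predict(P, Q, u, i):
    """Predicted rating: dot product of a user vector and an item vector."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def rmse(ratings, P, Q):
    """Root mean squared error over the observed (user, item, rating) triples."""
    se = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (se / len(ratings)) ** 0.5

def init_factors(n_users, n_items, k=2, seed=0):
    """Small random latent vectors for every user and item."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_items)]
    return P, Q

def train(ratings, P, Q, lr=0.02, reg=0.01, epochs=1000):
    """SGD on squared error with L2 regularization; updates P and Q in place."""
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(len(P[u])):
                pu, qi = P[u][f], Q[i][f]  # use the old values for both updates
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)

# Made-up (user, movie, rating) triples; unobserved pairs are simply absent.
ratings = [(0, 0, 5), (0, 1, 1), (1, 0, 4), (1, 2, 1), (2, 1, 5), (2, 2, 4)]
P, Q = init_factors(n_users=3, n_items=3)
before = rmse(ratings, P, Q)
train(ratings, P, Q)
after = rmse(ratings, P, Q)
print(before, after)
```

After training, `predict(P, Q, u, i)` can score a user-movie pair that has no observed rating, which is what makes the factors usable for recommendation.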

I know, I know, I know. You've seen this project a million times. As an ML student, I just had to do it like everyone else :)


Part of the code is modified from Prof. Yannet Interian's USF Advanced ML class.
Part of the code is modified from Prof. Terence Parr's USF data acquisition class.