This project uses machine learning to estimate car prices. Several regression algorithms were implemented, including:
- Linear Regression
- Lasso Regression
- Ridge Regression
- Decision Tree
- Random Forest
- XGBoost

Hyperparameters were tuned with grid search, and each model was evaluated with cross-validation. The resulting scores, sorted from best to worst, are:

| Model | R² | MAE | RMSE | MAPE |
|---|---|---|---|---|
| XGBoost | 0.921 | 2123.94 | 3373.07 | 0.132 |
| Random Forest | 0.921 | 2252.57 | 3374.97 | 0.150 |
| Lasso | 0.831 | 2818.00 | 4954.25 | 0.192 |
| Linear Regression | 0.830 | 2818.65 | 4957.25 | 0.192 |
| ElasticNet | 0.830 | 2817.18 | 4959.12 | 0.192 |
| Decision Tree | 0.816 | 3467.44 | 5157.75 | 0.221 |
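
For reference, the four metrics in the table can be computed as in the minimal sketch below. It uses only the standard library and is not taken from the project's notebook. The function name `regression_metrics` and the sample prices are hypothetical, and MAPE is returned as a fraction on the assumption that the table reports it that way (for example, 0.132 rather than 13.2%).

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MAE, RMSE and MAPE for two equal-length sequences."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares

    return {
        "R2": 1.0 - ss_res / ss_tot,   # undefined if all true values are equal
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(ss_res / n),
        # Fraction, not percent; undefined if any true value is zero.
        "MAPE": sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
    }


# Hypothetical prices for illustration only (not the project's data).
actual = [10000.0, 20000.0, 30000.0, 40000.0]
predicted = [11000.0, 19000.0, 33000.0, 38000.0]
scores = regression_metrics(actual, predicted)
```

MAE and RMSE are in the same unit as the price, so they can be read directly as a typical prediction error. R² and MAPE are unitless, which makes them easier to compare across datasets.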

Random Forest and XGBoost were chosen as the final models because they achieved the highest R2 (0.921) and the lowest errors. Feature importances were computed separately for each model and used to reduce the number of input features. The trained models were saved with pickle and loaded by a Streamlit app so that they can be used outside the notebook environment. The app was deployed on AWS EC2 and on the Streamlit website, where users can make predictions interactively.
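
The feature reduction and model saving steps could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual code. The helper `select_features`, the `coverage` threshold, the feature names, and the importance values are all hypothetical, and a plain dictionary stands in for the fitted estimator.

```python
import pickle


def select_features(importances, coverage=0.95):
    """Keep the top-ranked features until their cumulative share of the
    total importance reaches `coverage` (a hypothetical selection rule)."""
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")

    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)

    kept, running = [], 0.0
    for name, score in ranked:
        kept.append(name)
        running += score / total
        if running >= coverage:
            break
    return kept


# Hypothetical importances for illustration only (not the project's values).
importances = {"age": 0.40, "hp": 0.30, "km": 0.20, "gears": 0.06, "color": 0.04}
selected = select_features(importances, coverage=0.85)

# Stand-in for a fitted estimator; any picklable object works here.
fitted_model = {"name": "placeholder"}

# Storing the feature list next to the model lets the app build its
# inputs in the same column order that was used for training.
bundle = {"model": fitted_model, "features": selected}
blob = pickle.dumps(bundle)    # with a file: pickle.dump(bundle, f)
restored = pickle.loads(blob)  # with a file: pickle.load(f)
```

Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code.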