Credit Score Prediction Project

Created by: Jake Phelan & Daniel Sachs

https://github.com/jakephelan1

https://github.com/d3buggerdan


Project Overview

This project trains several machine learning models on a credit score dataset that pairs information about individuals with their credit scores. A GUI backed by Flask uses the trained models to predict a credit score from user inputs, and it displays a pie chart indicating each model's confidence in its prediction.

Data Preparation

We began with a messy dataset containing over 70,000 entries. The raw data required extensive cleaning before it was suitable for training: handling missing values, normalizing numeric data, and transforming categorical variables. This preprocessing produced a new, clean dataset, which significantly improved the performance of our machine learning models.
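
The three cleaning steps named above can be sketched with pandas. This is a minimal illustration under assumptions, not the project's actual filter_dataset.py: the function name clean_credit_data, the column names, and the choice of median/mode imputation and min-max scaling are all hypothetical.

```python
import pandas as pd


def clean_credit_data(df: pd.DataFrame) -> pd.DataFrame:
    """Impute, scale, and encode a small credit-style table (illustrative only)."""
    df = df.copy()
    numeric_cols = df.select_dtypes(include="number").columns
    categorical_cols = df.columns.difference(numeric_cols)

    # Handle missing values: median for numeric columns, mode for categorical ones.
    for col in numeric_cols:
        df[col] = df[col].fillna(df[col].median())
    for col in categorical_cols:
        df[col] = df[col].fillna(df[col].mode().iloc[0])

    # Normalize numeric columns to the [0, 1] range (min-max scaling).
    for col in numeric_cols:
        span = df[col].max() - df[col].min()
        df[col] = (df[col] - df[col].min()) / span if span else 0.0

    # Transform categorical variables into one-hot indicator columns.
    return pd.get_dummies(df, columns=list(categorical_cols), dtype=int)


# Hypothetical sample rows; the real dataset's columns are not shown in this README.
raw = pd.DataFrame({
    "annual_income": [30000.0, None, 90000.0],
    "occupation": ["teacher", "engineer", None],
})
clean = clean_credit_data(raw)
```

The project also lists FancyImpute for missing values, so its real imputation is likely more involved than the median fill used here.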

Features

Trains multiple machine learning models on credit score data

Predicts credit scores using a Flask-based web interface

Displays model confidence using pie charts

Implements feature engineering and data preprocessing

Utilizes SMOTE for handling class imbalance

Supports hyperparameter tuning using Optuna and Keras Tuner

Provides logging for monitoring training and data processing
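
To illustrate the class-imbalance feature, here is a hand-rolled sketch of the interpolation idea behind SMOTE, written with NumPy. It is not the imbalanced-learn implementation and not necessarily how this project calls it; in practice `imblearn.over_sampling.SMOTE` and its `fit_resample` method do this work. The function name smote_like_oversample and its parameters are hypothetical.

```python
import numpy as np


def smote_like_oversample(minority: np.ndarray, n_new: int, k: int = 2, seed: int = 0) -> np.ndarray:
    """Create n_new synthetic rows by interpolating between minority-class neighbours."""
    rng = np.random.default_rng(seed)
    synthetic = np.empty((n_new, minority.shape[1]))
    for i in range(n_new):
        # Pick a random minority sample as the starting point.
        base = minority[rng.integers(len(minority))]
        # Find its k nearest minority neighbours (position 0 is the sample itself).
        distances = np.linalg.norm(minority - base, axis=1)
        neighbour_ids = np.argsort(distances)[1:k + 1]
        neighbour = minority[rng.choice(neighbour_ids)]
        # Place the new point at a random position on the segment between them.
        synthetic[i] = base + rng.random() * (neighbour - base)
    return synthetic


# Hypothetical minority-class samples with two features each.
minority_rows = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 5.0], [4.0, 4.0]])
new_rows = smote_like_oversample(minority_rows, n_new=5)
```

Because every synthetic row lies on a segment between two real minority rows, the new points stay inside the region the minority class already occupies rather than duplicating existing rows.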

Tools Used

Python: Programming language for the backend and model training

Flask: Micro web framework for the web application

Pandas: Data manipulation and analysis

NumPy: Numerical computing

Scikit-Learn: Machine learning library

Imbalanced-learn: Handling imbalanced datasets

Keras and TensorFlow: Deep learning libraries

Optuna: Hyperparameter optimization framework

Joblib: Serialization of models and other objects

Matplotlib: Plotting library for visualizations

FancyImpute: Advanced imputation of missing values

Setup and Installation

To set up and run this project locally, follow these steps. Steps 5 and 6 are optional and are included so that model training can be reproduced. If you only want to test the app, you can skip them, as the models are already trained and saved in trained_models.

  1. Clone the Repository
git clone https://github.com/jakephelan1/credit-genie.git
cd credit-genie
  2. Create a Virtual Environment
python3 -m venv myenv
source myenv/bin/activate
  3. Install Dependencies

Ensure you have Git LFS installed and configured (the command below uses Homebrew on macOS):

brew install git-lfs
git lfs install

Then install the Python dependencies:

pip install -r requirements.txt
  4. Download Large Files

Fetch any large files that the repository tracks with Git LFS:

git lfs pull
  5. Set Up the Database and Data Files

Place your dataset in the CSV/ directory with the name dataset.csv.

  6. Run Data Preprocessing and Model Training
python filter_dataset.py
python train_models.py
  7. Start the Flask Application
python application.py
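
Since the data and training steps are optional, a small pre-flight check can show which command to run next. The sketch below assumes that trained_models is a directory at the repository root and that the dataset lives at CSV/dataset.csv, as described above; the helper names check_project_layout and next_step are hypothetical and not part of the project.

```python
from pathlib import Path


def check_project_layout(root: Path) -> dict:
    """Report which inputs described in the setup steps are present under root."""
    return {
        "dataset": (root / "CSV" / "dataset.csv").is_file(),
        "trained_models": (root / "trained_models").is_dir(),
    }


def next_step(status: dict) -> str:
    """Suggest the next setup command based on what already exists."""
    if status["trained_models"]:
        # Saved models exist, so the app can be started directly.
        return "python application.py"
    if status["dataset"]:
        # No saved models yet, but the dataset is in place for retraining.
        return "python filter_dataset.py && python train_models.py"
    return "place dataset.csv in CSV/ first"


if __name__ == "__main__":
    print(next_step(check_project_layout(Path.cwd())))
```

Run it from the repository root; it only inspects paths and never modifies files.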

Usage

Once the Flask application is running, open your web browser and go to http://127.0.0.1:5000/. You will see a form where you can input various personal and financial details. Upon submission, the application will predict your credit score and display a pie chart showing each model's confidence.
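
One way the per-model confidences could be turned into pie chart slices is to normalize them so they sum to 100%. This is a sketch under assumptions: the README does not say how confidence is computed or scaled, and the function name confidence_shares and the model names below are hypothetical.

```python
def confidence_shares(confidences: dict[str, float]) -> dict[str, float]:
    """Convert raw per-model confidences into percentage shares of a pie chart."""
    total = sum(confidences.values())
    if total <= 0:
        raise ValueError("at least one model must report a positive confidence")
    # Each slice is the model's confidence relative to the sum over all models.
    return {name: round(100 * value / total, 1) for name, value in confidences.items()}


# Hypothetical confidences, e.g. each model's predicted-class probability.
shares = confidence_shares({
    "random_forest": 0.9,
    "neural_network": 0.6,
    "logistic_regression": 0.5,
})
```

The resulting values and labels can be passed to `matplotlib.pyplot.pie` to render the chart.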

Note

This project is for educational purposes only and is not guaranteed to be 100% accurate.