Loan Approval Prediction

Overview

Welcome to the 2024 Kaggle Playground Series! This project is part of a Kaggle competition whose goal is to predict whether a loan applicant will be approved. The provided dataset is well suited to practicing machine learning skills, and the project covers preprocessing, model training, and evaluation. MLflow is used to track experiments, save models, and identify the best-performing model based on cross-validation scores, and a FastAPI app serves the chosen model for predictions.

Data Source: Kaggle

Project Structure

Loan-Approval/
├── data/
│   ├── raw/
│   │   └── train.csv
│   ├── cleaned/
│   │   └── train.csv
├── models/ # Trained models (not committed; file sizes range from ~100 MB to 400+ MB)
├── src/
│   ├── clean_data.py
│   ├── evaluate.py
│   ├── preprocessing.py
│   └── trainer.py
├── RandomF.ipynb
├── notebook/
│   └── notebook.ipynb # EDA & data visualizations
├── XGB.ipynb
├── Stacking_clf.ipynb
├── catboost.ipynb
├── lightgb.ipynb
├── requirements.txt
├── app.py # FastAPI app for predicting loan status
├── app.log # Logs written by the FastAPI app (app.py)
├── Dockerfile # Dockerfile for containerizing the FastAPI app
└── README.md

Getting Started

Prerequisites

  • Python 3.12.3
  • Jupyter Notebook
  • Required Python packages (listed in requirements.txt)

Installation

  1. Clone the repository:

    git clone https://github.com/Jatin-Mehra119/Loan-Approval.git
    cd Loan-Approval
    
  2. Install the required packages:

    pip install -r requirements.txt
    

Data Preparation

  1. Place the raw data file train.csv in the data/raw/ directory.

  2. Run the data cleaning script to preprocess the data:

    python src/clean_data.py
    

    This will create a cleaned version of the data in the data/cleaned/ directory.

Notebooks

  • RandomF.ipynb: Train, tune the hyperparameters of, and evaluate a Random Forest classifier.
  • XGB.ipynb: Train, tune the hyperparameters of, and evaluate an XGBoost classifier.
  • Stacking_clf.ipynb: Train, tune the hyperparameters of, and evaluate a Stacking classifier.
  • catboost.ipynb: Train, tune the hyperparameters of, and evaluate a CatBoost classifier.
  • lightgb.ipynb: Train, tune the hyperparameters of, and evaluate a LightGBM classifier.

Source Code

src/preprocessing.py

Contains the preprocessing class that handles data preprocessing steps including imputation, scaling, and encoding.
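
As a rough illustration of those steps, a scikit-learn pipeline along these lines would cover imputation, scaling, and encoding; the actual class in src/preprocessing.py may be structured differently, and the column names here are taken from the API example further down.

    # Hedged sketch of the preprocessing steps (imputation, scaling, encoding).
    # The real class in src/preprocessing.py may differ.
    from sklearn.compose import ColumnTransformer
    from sklearn.impute import SimpleImputer
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import OneHotEncoder, StandardScaler

    numeric_cols = [
        "person_age", "person_income", "person_emp_length", "loan_amnt",
        "loan_int_rate", "loan_percent_income", "cb_person_cred_hist_length",
    ]
    categorical_cols = [
        "person_home_ownership", "loan_intent", "loan_grade",
        "cb_person_default_on_file",
    ]

    numeric_pipeline = Pipeline([
        ("impute", SimpleImputer(strategy="median")),   # fill missing numeric values
        ("scale", StandardScaler()),                    # standardize numeric features
    ])
    categorical_pipeline = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])

    preprocessor = ColumnTransformer([
        ("num", numeric_pipeline, numeric_cols),
        ("cat", categorical_pipeline, categorical_cols),
    ])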

src/trainer.py

Defines the Trainer class which handles model training, hyperparameter tuning, evaluation, and saving.
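
Purely as an illustration, a Trainer of this kind might wrap a cross-validated hyperparameter search and persist the best estimator, as sketched below; the real class may be organized differently and also log runs to MLflow.

    # Illustrative Trainer sketch: tune with cross-validation, keep the best model, save it.
    # The actual Trainer in src/trainer.py may differ.
    import joblib
    from sklearn.model_selection import RandomizedSearchCV

    class Trainer:
        def __init__(self, model, param_distributions, scoring="roc_auc", cv=5, n_iter=20):
            self.search = RandomizedSearchCV(
                model, param_distributions,
                scoring=scoring, cv=cv, n_iter=n_iter, n_jobs=-1,
            )

        def fit(self, X, y):
            # Run the search and remember the best estimator and its CV score.
            self.search.fit(X, y)
            self.best_model_ = self.search.best_estimator_
            self.best_score_ = self.search.best_score_
            return self

        def save(self, path):
            # Persist the tuned model to disk (e.g. under models/).
            joblib.dump(self.best_model_, path)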

src/evaluate.py

Defines the Evaluator class which evaluates the trained models and logs metrics.
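
For reference, scoring a fitted model on the competition metric (ROC AUC) and logging the result could look roughly like this; the actual Evaluator may compute additional metrics and record them in MLflow.

    # Rough sketch of model evaluation; the real Evaluator in src/evaluate.py may differ.
    import logging
    from sklearn.metrics import roc_auc_score

    def evaluate(model, X_val, y_val):
        proba = model.predict_proba(X_val)[:, 1]  # predicted probability of approval
        auc = roc_auc_score(y_val, proba)         # competition metric
        logging.info("Validation ROC AUC: %.4f", auc)
        return auc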

src/clean_data.py

Loads raw data, applies preprocessing, and saves the cleaned data.
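
Conceptually, the script reads data/raw/train.csv, cleans it, and writes data/cleaned/train.csv. A simplified version might look like the sketch below; the target column name (loan_status) is an assumption based on the competition, and the real script may do considerably more.

    # Simplified sketch of src/clean_data.py: load raw data, clean it, save the result.
    import pandas as pd

    RAW_PATH = "data/raw/train.csv"
    CLEANED_PATH = "data/cleaned/train.csv"

    def main():
        df = pd.read_csv(RAW_PATH)
        df = df.drop_duplicates()                 # basic cleaning example
        df = df.dropna(subset=["loan_status"])    # target column assumed from the competition
        df.to_csv(CLEANED_PATH, index=False)

    if __name__ == "__main__":
        main()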

FastAPI Application (app.py)

Overview

This project also includes a FastAPI application that serves the trained loan approval model as an API. The API allows you to send loan applicant data in JSON format and receive a prediction indicating whether the loan is approved or not.
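
For orientation, a stripped-down version of such a service might look like the sketch below. The model filename is hypothetical, the request schema follows the example further down, and the real app.py also includes the logging and error handling described later.

    # Minimal sketch of a FastAPI prediction service; the real app.py may differ.
    import joblib
    import pandas as pd
    from fastapi import FastAPI
    from pydantic import BaseModel

    app = FastAPI()
    model = joblib.load("models/stacking_clf.pkl")  # hypothetical model path

    class LoanApplication(BaseModel):
        id: int
        person_age: int
        person_income: float
        person_home_ownership: str
        person_emp_length: float
        loan_intent: str
        loan_grade: str
        loan_amnt: float
        loan_int_rate: float
        loan_percent_income: float
        cb_person_default_on_file: str
        cb_person_cred_hist_length: int

    @app.post("/predict")
    def predict(application: LoanApplication):
        # Build a one-row DataFrame and let the saved pipeline handle preprocessing.
        data = pd.DataFrame([application.model_dump()]).drop(columns=["id"])  # Pydantic v2; use .dict() on v1
        return {"prediction": int(model.predict(data)[0])}  # 1 = approved, 0 = denied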

FastAPI Endpoint

  • POST /predict: This endpoint accepts a JSON body containing the applicant's loan data and returns the loan approval prediction.

API Example

You can test the FastAPI app using Postman or cURL. Here's an example of how to format the input data:

Request Body (JSON)

{
    "id": 58652,
    "person_age": 23,
    "person_income": 55000,
    "person_home_ownership": "MORTGAGE",
    "person_emp_length": 6.0,
    "loan_intent": "PERSONAL",
    "loan_grade": "A",
    "loan_amnt": 6250,
    "loan_int_rate": 6.76,
    "loan_percent_income": 0.12,
    "cb_person_default_on_file": "N",
    "cb_person_cred_hist_length": 2
}

Response (JSON)

{
    "prediction": 1
}

Where:

  • 1 indicates the loan is approved.
  • 0 indicates the loan is denied.
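
Alternatively, a small Python client built on the requests library can exercise the endpoint, assuming the app is running locally on port 8000:

    # Example client for the /predict endpoint using the requests library.
    import requests

    payload = {
        "id": 58652,
        "person_age": 23,
        "person_income": 55000,
        "person_home_ownership": "MORTGAGE",
        "person_emp_length": 6.0,
        "loan_intent": "PERSONAL",
        "loan_grade": "A",
        "loan_amnt": 6250,
        "loan_int_rate": 6.76,
        "loan_percent_income": 0.12,
        "cb_person_default_on_file": "N",
        "cb_person_cred_hist_length": 2,
    }

    response = requests.post("http://localhost:8000/predict", json=payload)
    print(response.json())  # e.g. {"prediction": 1}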

Running the FastAPI App

To run the FastAPI app, use the following command:

uvicorn app:app --reload 

This will start the API server at http://localhost:8000. You can then access the /predict endpoint to test predictions.

Logs

The FastAPI application logs each incoming request, and these logs are saved locally for reference. The logs include details such as the timestamp, the request data, and the corresponding prediction result.
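
A request-logging setup along these lines would produce timestamped entries in app.log; the exact configuration and log format used in app.py may differ.

    # Illustrative logging setup: timestamped request/prediction entries written to app.log.
    import logging

    logging.basicConfig(
        filename="app.log",
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )

    def log_prediction(request_data: dict, prediction: int) -> None:
        # Record the incoming request payload and the model's prediction.
        logging.info("request=%s prediction=%s", request_data, prediction)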

Error Handling

The FastAPI application also handles errors gracefully, returning a message describing any issues encountered during the prediction process.
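
One way to achieve this in FastAPI is a catch-all exception handler that turns unexpected failures into a descriptive JSON error response; this is only a sketch, and app.py may instead handle errors inside the endpoint itself.

    # Sketch of graceful error handling via a catch-all exception handler.
    from fastapi import FastAPI, Request
    from fastapi.responses import JSONResponse

    app = FastAPI()

    @app.exception_handler(Exception)
    async def handle_unexpected_error(request: Request, exc: Exception):
        # Report any unhandled error during prediction back to the caller.
        return JSONResponse(status_code=500, content={"error": str(exc)})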

Usage

  1. Open the desired Jupyter notebook (e.g., RandomF.ipynb) and follow the steps to train and evaluate the model.
  2. The models will be saved in the models/ directory.
  3. Evaluate the models using the provided evaluation scripts.
  4. Run the FastAPI app and interact with it using Postman or a similar tool.

Final Model Submission

For the final submission, the Stacking Classifier was selected: it achieved the highest cross-validation score, and its predictions were used to generate submission.csv.
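
As a rough illustration of the approach, a stacking ensemble over the boosted-tree models could be assembled with scikit-learn as sketched below; the exact base learners, meta-learner, and hyperparameters used in Stacking_clf.ipynb may differ.

    # Illustrative stacking ensemble; the notebook's actual configuration may differ.
    from catboost import CatBoostClassifier
    from lightgbm import LGBMClassifier
    from sklearn.ensemble import StackingClassifier
    from sklearn.linear_model import LogisticRegression
    from xgboost import XGBClassifier

    stacking_clf = StackingClassifier(
        estimators=[
            ("xgb", XGBClassifier()),
            ("lgbm", LGBMClassifier()),
            ("cat", CatBoostClassifier(verbose=0)),
        ],
        final_estimator=LogisticRegression(max_iter=1000),  # meta-learner is an assumption
        cv=5,
        n_jobs=-1,
    )
    # stacking_clf.fit(X_train, y_train) would then train the full ensemble.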

Performance Metrics

Here are the cross-validated ROC AUC scores for each model:

  • XGBoost: ~0.95
  • CatBoost: ~0.95
  • LightGBM: ~0.95
  • Random Forest: ~0.93
  • Stacking Classifier: ~0.96 (best-performing model, used for the final submission)

Contributions

Contributions are welcome! Feel free to open a pull request or create an issue if you find any bugs or have suggestions for improvements.

License

This project is licensed under the MIT License.
