Welcome to the 2024 Kaggle Playground Series! This project is part of the Kaggle competition where the goal is to predict whether an applicant is approved for a loan. The dataset provided is ideal for practicing machine learning skills and involves various preprocessing, training, and evaluation steps.
Data Source: Kaggle
Loan-Approval/
├── data/
│ ├── raw/
│ │ └── train.csv
│ ├── cleaned/
│ │ └── train.csv
├── models/ # Trained models (not committed; files are 100 MB-400 MB+)
├── src/
│ ├── clean_data.py
│ ├── evaluate.py
│ ├── preprocessing.py
│ └── trainer.py
├── RandomF.ipynb
├── notebook/
│ └── notebook.ipynb # EDA & DATA visualizations
├── XGB.ipynb
├── Stacking_clf.ipynb
├── catboost.ipynb
├── lightgb.ipynb
├── requirements.txt
├── app.py # FastAPI app to predict the loan status
├── app.log # Logs generated by the FastAPI app (app.py)
├── Dockerfile # Dockerizes the FastAPI app
└── README.md
- Python 3.12.3
- Jupyter Notebook
- Required Python packages (listed in `requirements.txt`)
- Clone the repository:
  `git clone https://github.com/Jatin-Mehra119/Loan-Approval.git`
  `cd Loan-Approval`
- Install the required packages:
  `pip install -r requirements.txt`
- Place the raw data file `train.csv` in the `data/raw/` directory.
- Run the data cleaning script to preprocess the data:
  `python src/clean_data.py`
  This will create a cleaned version of the data in the `data/cleaned/` directory.
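Conceptually, the cleaning step is a load → clean → save flow. The sketch below is illustrative only (the example age filter and function name are assumptions, not the repository's actual rules); the real logic lives in `src/clean_data.py`:

```python
# Illustrative sketch of the clean_data.py flow: read the raw CSV,
# apply basic cleaning, and write the cleaned copy. The real script
# uses the preprocessing class from src/preprocessing.py.
from pathlib import Path

import pandas as pd


def clean(raw_path: Path, out_path: Path) -> pd.DataFrame:
    df = pd.read_csv(raw_path)
    df = df.drop_duplicates()
    # Example cleaning rule (an assumption, not the repo's exact logic):
    # drop rows with implausible ages.
    if "person_age" in df.columns:
        df = df[df["person_age"] < 100]
    out_path.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(out_path, index=False)
    return df
```

Pointing `clean(...)` at `data/raw/train.csv` and `data/cleaned/train.csv` reproduces the directory layout shown in the project tree.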
- RandomF.ipynb: Train, tune the hyperparameters, and evaluate a Random Forest classifier.
- XGB.ipynb: Train, tune the hyperparameters, and evaluate an XGBoost classifier.
- Stacking_clf.ipynb: Train, tune the hyperparameters, and evaluate a Stacking classifier.
- catboost.ipynb: Train, tune the hyperparameters, and evaluate a CatBoost classifier.
- lightgb.ipynb: Train, tune the hyperparameters, and evaluate a LightGBM classifier.
- src/preprocessing.py: Contains the preprocessing class that handles data preprocessing steps, including imputation, scaling, and encoding.
- src/trainer.py: Defines the Trainer class, which handles model training, hyperparameter tuning, evaluation, and saving.
- src/evaluate.py: Defines the Evaluator class, which evaluates the trained models and logs metrics.
- src/clean_data.py: Loads the raw data, applies preprocessing, and saves the cleaned data.
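The preprocessing pattern (imputation, scaling, encoding) can be sketched with scikit-learn's `ColumnTransformer`. Column names below mirror the API example later in this README, but the estimator choices are assumptions, not the exact contents of `src/preprocessing.py`:

```python
# Hypothetical sketch of a preprocessing pipeline in the style of
# src/preprocessing.py: median-impute and scale numeric columns,
# mode-impute and one-hot encode categorical columns.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERIC = ["person_age", "person_income", "loan_amnt", "loan_int_rate"]
CATEGORICAL = ["person_home_ownership", "loan_intent", "loan_grade"]

preprocessor = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # fill gaps with the median
        ("scale", StandardScaler()),                   # zero mean, unit variance
    ]), NUMERIC),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ]), CATEGORICAL),
])

# Tiny synthetic sample (not competition data) to show the shapes.
df = pd.DataFrame({
    "person_age": [23, 31, None],
    "person_income": [55000, 72000, 48000],
    "loan_amnt": [6250, 10000, 4000],
    "loan_int_rate": [6.76, None, 11.2],
    "person_home_ownership": ["MORTGAGE", "RENT", "OWN"],
    "loan_intent": ["PERSONAL", "EDUCATION", "MEDICAL"],
    "loan_grade": ["A", "B", "A"],
})
X = preprocessor.fit_transform(df)
print(X.shape)  # (3, 12): 4 numeric columns + 8 one-hot columns
```

Fitting once on the training data and reusing the fitted transformer at prediction time keeps train and inference features consistent.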
This project also includes a FastAPI application that serves the trained loan approval model as an API. The API allows you to send loan applicant data in JSON format and receive a prediction indicating whether the loan is approved or not.
- POST /predict: This endpoint accepts a JSON body containing the applicant's loan data and returns the loan approval prediction.
You can test the FastAPI app using Postman or cURL. Here's an example of how to format the input data:
{
"id": 58652,
"person_age": 23,
"person_income": 55000,
"person_home_ownership": "MORTGAGE",
"person_emp_length": 6.0,
"loan_intent": "PERSONAL",
"loan_grade": "A",
"loan_amnt": 6250,
"loan_int_rate": 6.76,
"loan_percent_income": 0.12,
"cb_person_default_on_file": "N",
"cb_person_cred_hist_length": 2
}
Example response:

{
"prediction": 1
}
Where:
- 1 indicates the loan is approved.
- 0 indicates the loan is denied.
To run the FastAPI app, use the following command:
uvicorn app:app --reload
This will start the API server at `http://localhost:8000`. You can then access the `/predict` endpoint to test predictions.
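With the server running locally, the endpoint can also be called from Python's standard library. The payload mirrors the example above; if the server is not up, this sketch simply reports that instead of failing:

```python
# Sketch of POSTing the example payload to the /predict endpoint,
# using only the stdlib. Assumes the uvicorn server from the previous
# step is listening on localhost:8000.
import json
from urllib import error, request

payload = {
    "id": 58652,
    "person_age": 23,
    "person_income": 55000,
    "person_home_ownership": "MORTGAGE",
    "person_emp_length": 6.0,
    "loan_intent": "PERSONAL",
    "loan_grade": "A",
    "loan_amnt": 6250,
    "loan_int_rate": 6.76,
    "loan_percent_income": 0.12,
    "cb_person_default_on_file": "N",
    "cb_person_cred_hist_length": 2,
}

req = request.Request(
    "http://localhost:8000/predict",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
try:
    with request.urlopen(req, timeout=5) as resp:
        print(json.load(resp))  # expected shape: {"prediction": 0 or 1}
except (error.URLError, OSError) as exc:
    print(f"Server not reachable: {exc}")
```

The same request works from Postman or cURL; only the transport differs.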
The FastAPI application logs each incoming request, and these logs are saved locally for reference. The logs include details such as the timestamp, the request data, and the corresponding prediction result.
The FastAPI application also handles errors gracefully, returning a message describing any issues encountered during the prediction process.
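The exact logging configuration lives in app.py; as a minimal stand-in, request logging to `app.log` (the file shown in the project tree) could be wired up with the standard `logging` module. The logger name and format here are assumptions:

```python
# Minimal request-logging sketch with the stdlib logging module.
# The real app.py configuration may differ; the target file (app.log)
# matches the project tree.
import logging

logger = logging.getLogger("loan_api")
logger.setLevel(logging.INFO)
handler = logging.FileHandler("app.log")
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
logger.addHandler(handler)


def log_prediction(request_data: dict, prediction: int) -> None:
    """Record an incoming request and its prediction, as described above."""
    logger.info("request=%s prediction=%d", request_data, prediction)


log_prediction({"id": 58652, "loan_amnt": 6250}, 1)
```

Each call appends a timestamped line containing the request data and the prediction, which is the information the README says the logs capture.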
- Open the desired Jupyter notebook (e.g., `RandomF.ipynb`) and follow the steps to train and evaluate the model.
- The models will be saved in the `models/` directory.
- Evaluate the models using the provided evaluation scripts.
- Run the FastAPI app and interact with it using Postman or a similar tool.
For the final submission, the Stacking Classifier was selected to generate the predictions: it achieved the highest cross-validation score and produced the final `submission.csv`.
Here are the cross-validated ROC AUC scores for each model:
- XGBoost: ~0.95
- CatBoost: ~0.95
- LightGBM: ~0.95
- Random Forest: ~0.93
- Stacking Classifier: ~0.96 (best performing model; used for the final submission)
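The winning configuration is in `Stacking_clf.ipynb`; the general pattern, base learners combined by a meta-learner and scored with cross-validated ROC AUC, can be sketched with scikit-learn. The synthetic data and estimator choices below are illustrative only, not the competition setup:

```python
# Illustrative stacking sketch: base learners feed a logistic-regression
# meta-learner, evaluated with cross-validated ROC AUC on synthetic data
# (not the competition dataset or the repository's exact estimators).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=400, n_features=10, random_state=42)

stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=30, random_state=42)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,  # out-of-fold predictions train the meta-learner
)

scores = cross_val_score(stack, X, y, cv=3, scoring="roc_auc")
print(f"mean ROC AUC: {scores.mean():.3f}")
```

The same scoring call (`scoring="roc_auc"`) is how the per-model numbers above would typically be produced.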
Contributions are welcome! Feel free to open a pull request or create an issue if you find any bugs or have suggestions for improvements.
This project is licensed under the MIT License.
- FastAPI Overview: Added a new section that explains the FastAPI application for serving predictions.
- API Example: Added an example of the input JSON and the expected output JSON.
- Running FastAPI: Instructions on how to run the FastAPI app locally.
- Logs and Error Handling: Mentioned that incoming requests are logged and handled with error messages if necessary.