Bank Churn Prediction using MLflow and Flask

This project is an end-to-end machine learning system for predicting bank customer churn based on a set of key features such as age, tenure, active membership, and credit score. It uses a pipeline for data ingestion, validation, transformation, model training, and evaluation. The model is deployed via a Flask web application that allows for real-time predictions.

Project Overview

This system predicts whether a customer will churn based on various features like:

Age
Tenure
IsActiveMember status
Gender
Geography
CreditScore
NumOfProducts

The dataset is highly imbalanced, so SMOTE (Synthetic Minority Over-sampling Technique) was applied to handle the imbalance. Multiple machine learning algorithms were tested, and Gradient Boosting Classifier was found to be the best-performing model based on accuracy, precision, recall, and F1-score.

The project is structured as a modular pipeline and includes data ingestion, validation, feature engineering, model training, and evaluation.

Features

Data Ingestion: Handles reading and loading data for training and testing.
Data Validation: Ensures data quality and consistency.
Data Transformation: Preprocessing and feature engineering using techniques like scaling, encoding, etc.
Model Training: A range of models are tested, and Gradient Boosting Classifier is used for final predictions.
Model Evaluation: Metrics like accuracy, precision, recall, and F1-score are calculated.
Flask Web App: Provides two URLs: one for the default route and another for making predictions.

Technologies Used

Python (for building the pipeline and machine learning models)
Flask (for deploying the model as a web app)
MLflow (for tracking model metrics and artifacts)
Pandas (for data manipulation)
Scikit-learn (for model building and evaluation)
SMOTE (for handling imbalanced data)
Conda (for environment management)

Installation

Clone the repository:

git clone https://github.com/Debopam-Pritam2014/customer-churn.git
cd customer-churn

Create and activate the Conda environment:

conda create --name churn_prediction python=3.10
conda activate churn_prediction

Install the required dependencies:
```
pip install -r requirements.txt
```

Running the Application

Run the Flask app:
```
python app.py
```
The app will start on http://127.0.0.1:5000/.

Flask URLs:

Default route (/): For testing the app.
Prediction route (/predict): Accepts input features for making predictions.

Usage

Once the application is running, you can access the following:

Default route: http://127.0.0.1:5000/
Prediction route: http://127.0.0.1:5000/predict

You can use the /predict route to input features like age, tenure, creditscore, etc., and get back whether the customer is predicted to churn or not.

Model Evaluation

The model was evaluated using various metrics:

Accuracy : 86%
Precision : 66%
Recall : 62%
F1-score : 60%

The Gradient Boosting Classifier was found to be the best-performing model for predicting churn.

Name		Name	Last commit message	Last commit date
Latest commit History 34 Commits
artifacts		artifacts
data		data
data_validation		data_validation
experiments		experiments
src		src
templates		templates
.gitignore		.gitignore
README.md		README.md
app.py		app.py
main.py		main.py
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Bank Churn Prediction using MLflow and Flask

Table of Contents

Project Overview

Features

Technologies Used

Installation

Running the Application

Flask URLs:

Usage

Model Evaluation

About

Releases

Packages

Languages

Debopam-Pritam2014/customer-churn

Folders and files

Latest commit

History

Repository files navigation

Bank Churn Prediction using MLflow and Flask

Table of Contents

Project Overview

Features

Technologies Used

Installation

Running the Application

Flask URLs:

Usage

Model Evaluation

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages