SmartMailGuard

About the Project

Aim

The objective of this project is to develop an intelligent email classification system using machine learning and deep learning models.

Description

SmartMailGuard is a system that categorizes emails using Naïve Bayes, LSTM, and Transformer architectures such as BERT.

Training these models on datasets of different sizes lets us compare their effectiveness on two types of classification: Binary (Spam/Not-Spam) and Multiclass.
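As a rough illustration of how such a comparison could be scored, the sketch below computes overall accuracy and per-class recall for any list of true and predicted labels, so the same functions apply to both the binary and the multiclass setting. The function names, the label names, and the toy predictions are all hypothetical and are not taken from the notebooks.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot score an empty label list")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true class, the fraction of its examples predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical binary example (Spam/Not-Spam).
binary_true = ["spam", "ham", "spam", "ham"]
binary_pred = ["spam", "ham", "ham", "ham"]

# Hypothetical multiclass example; these class names are made up.
multi_true = ["billing", "support", "spam", "support", "billing"]
multi_pred = ["billing", "spam", "spam", "support", "support"]

print(accuracy(binary_true, binary_pred), per_class_recall(binary_true, binary_pred))
print(accuracy(multi_true, multi_pred), per_class_recall(multi_true, multi_pred))
```

Per-class recall is included because accuracy alone can hide a class the model rarely predicts, which matters more as the number of classes grows.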

Tech Stack

Models and Accuracies

83k dataset (for binary classification): Kaggle
3k dataset (for multiclass classification): Kaggle
Dataset for AutoLabeler: Kaggle

1. Naïve Bayes

1.1. Without N-gram Optimization

Train:
Test:

1.2. With N-gram Optimization

Train:
Test:

Toy Examples:
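The following is a minimal sketch of the idea behind the two variants above, not the code from the notebooks. It assumes that "N-gram optimization" means adding word n-grams (here up to bigrams) to the unigram features of a multinomial Naïve Bayes classifier with Laplace smoothing. The class name `NaiveBayes`, the helper `ngrams`, and the four training sentences are hypothetical.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, max_n):
    """Return all word n-grams of length 1..max_n from the lowercased text."""
    words = text.lower().split()
    return [" ".join(words[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(words) - n + 1)]


class NaiveBayes:
    """Multinomial Naive Bayes over n-gram counts (illustrative only)."""

    def __init__(self, max_n=1, alpha=1.0):
        self.max_n = max_n    # 1 = unigrams only, 2 = unigrams + bigrams
        self.alpha = alpha    # Laplace smoothing constant

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.feature_counts[label].update(ngrams(text, self.max_n))
        self.vocab = {f for c in self.feature_counts.values() for f in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for feat in ngrams(text, self.max_n):
                if feat in self.vocab:  # features unseen in training are skipped
                    score += math.log((counts[feat] + self.alpha) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical toy training data.
train_texts = ["win a free prize now", "free money claim now",
               "meeting at noon tomorrow", "project report attached"]
train_labels = ["spam", "spam", "ham", "ham"]

unigram_model = NaiveBayes(max_n=1).fit(train_texts, train_labels)
bigram_model = NaiveBayes(max_n=2).fit(train_texts, train_labels)

for message in ["claim your free prize", "project meeting tomorrow"]:
    print(message, unigram_model.predict(message), bigram_model.predict(message))
```

With `max_n=2` a phrase such as "free prize" becomes a feature of its own, which lets the model use short word sequences that unigram counts cannot distinguish.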

2. Recurrent Neural Network (RNN)

Train:
Test:

3. Long Short-Term Memory (LSTM)

Train:
Test:

4. Multinomial Naïve Bayes

5. Bidirectional Encoder Representations from Transformers (BERT)

5.1. From Scratch

Train:
Test:

5.2. Implementation from a Pre-Trained Model

Train:
Test:

Toy Examples

6. Support Vector Machine (SVM)

Train:
Test:

7. Decision Tree

8. Random Forest Classifier

Train:
Test:

File Structure

├── Binary Classification
│   ├── Naive_Bayes_Final.ipynb
│   ├── Naive_Bayes_enron_dataset.ipynb
│   ├── Naive_Bayes_sklearn.ipynb
│   ├── lstmemailclassification.ipynb
│   └── RNN_spam_not_spam.ipynb
├── Coursera Notes
│   ├── Course1
│   ├── Course2
│   └── Course5
├── Multi Intent Classification
│   ├── Decision Tree
│   │   ├── decision-tree-grid-search.ipynb
│   │   └── decision-tree.ipynb
│   ├── Random Forest Classifier
│   │   ├── RandomForestClassifier-grid_search.ipynb
│   │   └── RandomForestClassifier.ipynb
│   ├── Support Vector Machine
│   │   ├── SVM_grid_search.ipynb
│   │   └── SVM_multiclass_classifier.ipynb
│   ├── AutoLabeler.ipynb
│   ├── Multiclass.ipynb
│   ├── multiclass-bert-Finaldataset.ipynb
│   ├── multiclass-bert-Finaldataset-from-scratch.ipynb
│   └── multinomial_combined.ipynb
├── SmartMailGuard Report
│   └── SmartMailGuard Report.pdf
└── README.md

Requirements

Install a recent Python 3 release that is supported by both TensorFlow and PyTorch.
Install Pip and verify its installation using the following terminal command:

pip --version

Optional: Install JupyterLab using the following command:

pip install jupyterlab

Alternatively, Google Colaboratory and Kaggle can also be used to run the notebooks (with some RAM limitations).

Run the following command to install all the dependencies:

pip install pandas torch scikit-learn tensorflow transformers
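To confirm that the dependencies are importable before opening a notebook, a small check like the one below can help. It is a sketch, not part of the repository: the helper name `missing_packages` is hypothetical, and the mapping assumes the usual import names (PyTorch is published on PyPI as `torch`, and scikit-learn is imported as `sklearn`).

```python
import importlib.util

# pip package name -> top-level module name used in `import` statements.
REQUIRED_MODULES = {
    "pandas": "pandas",
    "torch": "torch",
    "scikit-learn": "sklearn",
    "tensorflow": "tensorflow",
    "transformers": "transformers",
}


def missing_packages(modules=None):
    """Return the pip names whose module cannot be found in this environment."""
    if modules is None:
        modules = REQUIRED_MODULES
    return [pip_name for pip_name, module in modules.items()
            if importlib.util.find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages: pip install " + " ".join(missing))
else:
    print("All dependencies found.")
```

The check only looks the modules up without importing them, so it stays fast even when TensorFlow and PyTorch are both installed.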

Clone the repository:

git clone https://github.com/aitwehrrg/SmartMailGuard.git

Open any of the model notebooks (.ipynb) in Jupyter and run them.

Contributors

Mentors

Acknowledgements and Resources

CoC and Project X for providing this opportunity.
Deep Learning Specialization course by DeepLearning.AI
Long Short-Term Memory
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Attention Is All You Need
Kaggle datasets
HuggingFace Transformer Models
