abir0/Charts-Classifier

Charts-Classifier

An end-to-end image classification project covering data collection, data cleaning, model training, deployment, and API integration.

Table of Contents
  1. Model Overview
  2. Dataset Preparation
  3. Training and Data Cleaning
  4. Model Inference
  5. Model Deployment

Model Overview

The model can classify 28 different types of charts and diagrams in raster image formats (png, jpg, gif, etc.).

The chart types are as follows:

  1. arc diagram
  2. area chart
  3. bar chart
  4. block diagram
  5. boxplot
  6. bubble chart
  7. cartogram
  8. control chart
  9. dendrogram
  10. flowchart
  11. funnel chart
  12. gantt chart
  13. heatmap
  14. histogram
  15. line graph
  16. matrix diagram
  17. mind map
  18. network graph
  19. neural network diagram
  20. organogram
  21. phase diagram
  22. pie chart
  23. radar chart
  24. scatter plot
  25. sankey chart
  26. surface plot
  27. timeline chart
  28. venn diagram

Dataset Preparation

Data Collection: The images were downloaded through the DuckDuckGo search engine API, using the 28 class names as search keywords.
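The collection step can be sketched roughly as below. The README does not say which DuckDuckGo client was used, so `search_fn` and `download_fn` are hypothetical hooks rather than the notebook's actual code, and `CLASS_NAMES` lists only a few of the 28 classes for brevity.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List

# A few of the 28 class names, reused as search keywords (shortened for the example).
CLASS_NAMES = ["bar chart", "pie chart", "venn diagram"]


def collect_images(
    class_names: Iterable[str],
    search_fn: Callable[[str, int], List[str]],
    download_fn: Callable[[Path, List[str]], None],
    root: Path,
    max_images: int = 100,
) -> Dict[str, int]:
    """Create one folder per class and fill it from search results.

    Assumed (hypothetical) interfaces:
      search_fn(keyword, max_images) -> list of image URLs
      download_fn(dest_folder, urls) -> None
    """
    counts = {}
    for name in class_names:
        # The folder name doubles as the label later on.
        dest = root / name
        dest.mkdir(parents=True, exist_ok=True)
        urls = search_fn(name, max_images)
        download_fn(dest, urls)
        counts[name] = len(urls)
    return counts
```

With fastai installed, `download_fn` could be a thin wrapper such as `lambda dest, urls: download_images(dest, urls=urls)`, followed by `verify_images` to drop files that fail to open.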

DataLoaders: fastai DataBlock API was used to set up the DataLoaders.

Data Augmentation: fastai provides a default set of data augmentations that run on the GPU.
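A minimal sketch of how the DataLoaders and the default augmentations might be wired together is shown below. It assumes one folder per class; the image size, batch size, validation split, and seed are illustrative values, not the ones used in the notebook.

```python
def build_dataloaders(data_dir, image_size=224, batch_size=32, valid_pct=0.2, seed=42):
    """Build fastai DataLoaders from a folder-per-class image directory.

    All default values here are assumptions made for the example.
    """
    # Imported inside the function so the sketch can be loaded without fastai installed.
    from fastai.vision.all import (
        CategoryBlock,
        DataBlock,
        ImageBlock,
        RandomSplitter,
        Resize,
        aug_transforms,
        get_image_files,
        parent_label,
    )

    chart_block = DataBlock(
        blocks=(ImageBlock, CategoryBlock),      # image in, single category out
        get_items=get_image_files,               # recursively list image files
        splitter=RandomSplitter(valid_pct=valid_pct, seed=seed),
        get_y=parent_label,                      # label = name of the parent folder
        item_tfms=Resize(image_size),            # per-item resize so images can be batched
        batch_tfms=aug_transforms(),             # fastai's default batch-level augmentations
    )
    return chart_block.dataloaders(data_dir, bs=batch_size)
```

Calling `build_dataloaders("data").show_batch()` in a notebook is a quick way to check that labels and augmentations look sensible.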

Details can be found in notebooks/data_collection_and_augmentation.ipynb of the GitHub repo.


Example images from the training dataset


Example images from the validation dataset

Training and Data Cleaning

Training: A pre-trained resnet34 model was fine-tuned for 6 epochs, reaching an accuracy of up to ~85%.
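The training step could look roughly like the sketch below. It assumes a recent fastai release that provides `vision_learner` (older releases call it `cnn_learner`) and takes an already built `DataLoaders` object; the function name is hypothetical.

```python
def train_chart_classifier(dls, epochs=6):
    """Fine-tune an ImageNet-pretrained resnet34 on the chart DataLoaders."""
    # Imported inside the function so the sketch can be loaded without fastai installed.
    from fastai.vision.all import accuracy, resnet34, vision_learner

    learn = vision_learner(dls, resnet34, metrics=accuracy)
    # fine_tune first trains the new head with the body frozen (one epoch by default),
    # then unfreezes the whole network and trains for `epochs` more epochs.
    learn.fine_tune(epochs)
    return learn
```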

Data Cleaning: Because the data was collected through the DuckDuckGo search engine API, the dataset contained a lot of noise and many inconsistencies. The data was therefore cleaned and relabeled using the fastai ImageClassifierCleaner. This cleaning step was repeated after each round of training or fine-tuning until the final iteration of the model.
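After reviewing images in the ImageClassifierCleaner widget (created in a notebook with `ImageClassifierCleaner(learn)`), the selections still have to be applied to the files on disk. The helper below is a hypothetical sketch of that step; it relies only on the widget's `fns`, `delete()`, and `change()` members and assumes one folder per class under `data_dir`.

```python
import shutil
from pathlib import Path


def apply_cleaner_choices(cleaner, data_dir):
    """Delete and relabel files according to the cleaner widget's selections.

    Returns a (deleted, moved) tuple with the number of files affected.
    """
    data_dir = Path(data_dir)
    # Resolve indices to paths first, before any file is touched.
    to_delete = [Path(cleaner.fns[idx]) for idx in cleaner.delete()]
    to_move = [(Path(cleaner.fns[idx]), cat) for idx, cat in cleaner.change()]

    for path in to_delete:
        path.unlink()
    for path, category in to_move:
        # Moving into the target class folder relabels the image.
        # shutil.move raises an error if a file with the same name already exists there.
        shutil.move(str(path), str(data_dir / category))
    return len(to_delete), len(to_move)
```

The widget shows one class and one split at a time, so this helper would be called once per reviewed selection before retraining.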

Details can be found in notebooks/model_training_and_cleaning.ipynb of the GitHub repo.

Model Inference

The model was exported as a .pkl file and was used for inference.
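As a rough illustration, an exported learner (for example via `learn.export("model.pkl")`) can be reloaded and queried as sketched below. The file name and both helper functions are hypothetical and not taken from the notebook.

```python
def top_predictions(vocab, probs, k=3):
    """Return the k most likely (label, probability) pairs, best first."""
    pairs = [(str(label), float(p)) for label, p in zip(vocab, probs)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:k]


def predict_chart_type(model_path, image_path, k=3):
    """Load an exported .pkl learner and classify a single image file."""
    # Imported inside the function so the sketch can be loaded without fastai installed.
    from fastai.vision.all import PILImage, load_learner

    learn = load_learner(model_path)  # loads on CPU by default
    # predict returns (decoded label, class index, per-class probabilities).
    _, _, probs = learn.predict(PILImage.create(image_path))
    return top_predictions(learn.dls.vocab, probs, k)
```

`load_learner` unpickles the file, so it should only be used on model files from a trusted source.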

Details can be found in notebooks/model_inference.ipynb of the GitHub repo.

Model Deployment

The model was deployed to Hugging Face Spaces as a Gradio app. The implementation can be found here.
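A Gradio app for a model like this is usually only a few lines. The sketch below is an assumption about its general shape, not the deployed implementation: the `model.pkl` file name, the function names, and the choice to show the top 3 classes are all illustrative.

```python
def to_label_scores(vocab, probs):
    """Map each class label to its probability, the format gr.Label expects."""
    return {str(label): float(p) for label, p in zip(vocab, probs)}


def build_demo(model_path="model.pkl"):
    """Create a Gradio interface around an exported fastai learner."""
    # Imported inside the function so the sketch can be loaded without these packages.
    import gradio as gr
    from fastai.vision.all import PILImage, load_learner

    learn = load_learner(model_path)

    def classify(image_array):
        # Gradio passes the uploaded image as a NumPy array (type="numpy").
        _, _, probs = learn.predict(PILImage.create(image_array))
        return to_label_scores(learn.dls.vocab, probs)

    return gr.Interface(
        fn=classify,
        inputs=gr.Image(type="numpy"),
        outputs=gr.Label(num_top_classes=3),
    )

# In a Space, app.py would typically end with: build_demo().launch()
```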


Model deployed in Hugging Face Spaces

API integration with GitHub Pages

The deployed model API is integrated here into a GitHub Pages website. Implementation and other details can be found in the docs folder.
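For illustration, a client can call a Gradio app over HTTP by posting the image as a base64 data URI. The sketch below assumes the older Gradio REST convention of a JSON body of the form `{"data": [...]}`; the exact endpoint path and payload shape depend on the Gradio version running in the Space, so `endpoint_url` is left as a parameter and both function names are hypothetical.

```python
import base64
import json
import mimetypes
from pathlib import Path
from urllib import request


def build_payload(image_path):
    """Encode an image file as a data URI inside a Gradio-style JSON payload."""
    path = Path(image_path)
    mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return {"data": ["data:%s;base64,%s" % (mime, encoded)]}


def classify_remote(image_path, endpoint_url, timeout=30):
    """POST the image to the deployed app and return the decoded JSON response.

    endpoint_url is an assumption: use the prediction endpoint shown on the
    Space's own API page.
    """
    body = json.dumps(build_payload(image_path)).encode("utf-8")
    req = request.Request(
        endpoint_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A browser-based page such as the GitHub Pages site would send the same kind of request with JavaScript instead.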


Documentation in GitHub Pages
