Diagnosing Covid-19 from Chest X-ray and CT images. Classification is conducted using a ResNet50, and class imbalance is addressed through upsampling and loss function-weighting through VAE-predicted sample scores.

jwf40/Improving-Covid-19-Prediction-Through-Variational-Bayes-For-Learned-Class-Weighting

Improving Covid-19 Prediction Through Variational Bayes For Learned Class Weighting

Current Covid-19 datasets are rife with class imbalance. This study identifies the consequences of such imbalance for Covid-19 diagnosis with a CNN (ResNet architecture) and presents both existing and novel solutions to these problems. The project demonstrates the advantages and limitations of each approach and suggests further work (please read the paper for details).

Dataset Distribution

The CovidX dataset was used for training all networks. At the time of producing this project, the dataset was heavily imbalanced, with Covid-19 samples forming the minority class.

As a result, network performance on the minority class of Covid-19 samples was poor.
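One way to make this kind of failure visible is to report recall per class rather than overall accuracy alone. The sketch below is illustrative only: the class names, counts and predictions are made up, and `per_class_recall` is a hypothetical helper, not a function from this repository.

```python
from collections import defaultdict

def per_class_recall(y_true, y_pred):
    """Return {class: fraction of that class's samples predicted correctly}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        hits[truth] += int(truth == pred)
    return {c: hits[c] / totals[c] for c in totals}

# Made-up, heavily imbalanced labels and a model that always predicts the majority class.
y_true = ["normal"] * 95 + ["covid"] * 5
y_pred = ["normal"] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
recall = per_class_recall(y_true, y_pred)
print(accuracy, recall)
```

In this toy case overall accuracy looks high even though no minority-class sample is recognised, which is why per-class metrics matter on imbalanced data.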

Existing Methods

The first strategies applied were data augmentation (adding perturbations to samples to increase variance within the dataset), upsampling (sampling the minority class more often than its share of the dataset) and loss-function weighting (applying greater weights to the loss for minority-class samples). Combining all three strategies improved the network's performance on the minority class. However, performance on the majority classes worsened: the network simply predicted Covid-19 more frequently instead of actually learning the features of Covid-19.
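As a rough illustration of upsampling and loss-function weighting, the sketch below derives class weights from inverse class frequency and reuses them both to scale a cross-entropy loss and to draw samples. The inverse-frequency rule, the class names, the counts and all function names are assumptions made for this example; the weights actually used in the project may differ.

```python
import math
import random
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by total / (num_classes * count), so rarer classes weigh more."""
    counts = Counter(labels)
    total = len(labels)
    return {c: total / (len(counts) * n) for c, n in counts.items()}

def weighted_cross_entropy(probs, label, class_weights):
    """Cross-entropy for one sample, scaled by the weight of its true class.

    `probs` maps each class name to the predicted probability for that class.
    """
    return -class_weights[label] * math.log(probs[label])

def upsample_indices(labels, num_draws, seed=0):
    """Draw sample indices with replacement, proportionally to class weight."""
    weights = inverse_frequency_weights(labels)
    rng = random.Random(seed)
    per_sample = [weights[y] for y in labels]
    return rng.choices(range(len(labels)), weights=per_sample, k=num_draws)

# Illustrative counts only, not the real CovidX distribution.
labels = ["normal"] * 80 + ["pneumonia"] * 15 + ["covid"] * 5
class_weights = inverse_frequency_weights(labels)

drawn = upsample_indices(labels, num_draws=3000)
drawn_counts = Counter(labels[i] for i in drawn)
print(class_weights, drawn_counts)
```

With these weights each class receives roughly the same share of the draws, and a mistake on a minority-class sample costs more than the same mistake on a majority-class sample.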

Variational Bayes

By using a variational autoencoder and Bayesian statistics, it is possible to estimate how difficult each sample is to classify [1]. This project takes this notion and presents a novel application of the methodology for dynamic, real-time, per-sample loss-function weighting and upsampling. This produced a more robust network that improved on minority-class samples without sacrificing majority-class performance.
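To make the per-sample idea concrete, the sketch below assumes that a difficulty score is already available for every sample (for instance from a VAE) and shows one possible way to turn those scores into loss weights and sampling probabilities. The min-max mapping, the `floor` parameter and every function name are assumptions for illustration; the paper describes how the scores are actually obtained and used.

```python
def scores_to_weights(scores, floor=0.1):
    """Map per-sample difficulty scores to positive weights with mean 1.

    Scores are min-max scaled, and `floor` keeps the easiest sample from
    receiving zero weight (an assumed design choice).
    """
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # avoid division by zero when all scores are equal
    raw = [floor + (s - lo) / span for s in scores]
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]

def weighted_batch_loss(losses, weights):
    """Average of per-sample losses, each scaled by its own weight."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)

def sampling_probabilities(weights):
    """Normalise weights so they can be used as per-sample draw probabilities."""
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical difficulty scores and per-sample losses for a batch of four.
scores = [0.2, 0.5, 3.0, 1.1]
losses = [0.3, 0.4, 2.0, 0.9]

weights = scores_to_weights(scores)
batch_loss = weighted_batch_loss(losses, weights)
probs = sampling_probabilities(weights)
print(weights, batch_loss, probs)
```

Because the weights are recomputed from the current scores, they can change during training, unlike fixed per-class weights.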

MoCo and Pretraining

Finally, the efficacy of pretraining the CNN on large existing datasets was explored (specifically, traditional pretraining on ImageNet and Momentum Contrast (MoCo) pretraining on the Instagram 1bn dataset). This resulted in significantly stronger performance on minority-class samples but led to an overall reduction in performance. The intuition is that the generic feature extractor is not biased by the dataset imbalance; however, the lack of training on chest X-ray images results in worse performance overall.
