Given the limited dataset of Covid-19 CXRs this project aims to combine segmentation and classification models to learn important features and avoid overfitting on limited data.

manastahir/Deep-Learning-COVID-19-Features-on-CXR-Using-Limited-Training-Data-Sets

Deep-Learning-COVID-19-Features-on-CXR-Using-Limited-Training-Data-Sets

This repository provides an unofficial implementation of the paper Deep Learning COVID-19 Features on CXR Using Limited Training Data Sets.

[Figure: architecture]


Introduction

Data for COVID-19 CXR is very limited, and applying deep learning models to such small datasets tends to lead to overfitting. To prevent overfitting and learn relevant features, a segmentation network is used to produce a lung segmentation mask for each CXR. After applying this mask to the image, N random patches of size 224x224 are extracted. The center of each patch has to lie inside the lung region. The patches are then fed to a classifier, and a majority vote over the patch predictions gives the final classification. The paper proposes N=100. I ran the experiments on free Colab and, due to limited resources, had to use N=25. Because of this, the results were slightly worse than in the paper, but acceptable.

Segmentation network: DenseNet103
Classification network: ImageNet-pretrained ResNet18
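
The patch sampling described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the code used in this repository: the function name `extract_lung_patches`, the zero padding at the image border, and the choice to apply the mask by element-wise multiplication are all assumptions, and the image is assumed to be a single-channel 2D array.

```python
import numpy as np

def extract_lung_patches(image, mask, n_patches=25, patch_size=224, seed=None):
    """Sample square patches whose centers lie inside the lung mask.

    Hypothetical helper: `image` and `mask` are 2D arrays of the same shape,
    and `mask` is non-zero inside the lungs.
    """
    rng = np.random.default_rng(seed)
    # Apply the mask: pixels outside the lung region are set to zero.
    masked = image * (mask > 0)
    half = patch_size // 2
    # Zero-pad so that patches centered near the border keep their full size
    # (an assumption; the border could also be handled by clipping).
    padded = np.pad(masked, half, mode="constant")
    # Candidate centers are all pixel coordinates inside the lung region.
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:
        raise ValueError("mask contains no lung pixels")
    picks = rng.integers(0, len(ys), size=n_patches)
    # After padding, original pixel (y, x) sits at (y + half, x + half),
    # so the slice starting at (y, x) is centered on that pixel.
    patches = [
        padded[y:y + patch_size, x:x + patch_size]
        for y, x in zip(ys[picks], xs[picks])
    ]
    return np.stack(patches)
```

Calling `extract_lung_patches(image, mask, n_patches=25)` returns an array of shape `(25, 224, 224)`; passing `seed` makes the sampling reproducible.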

Requirements

pip -q install -r requirements.txt

Data Collection and Processing

Data is collected from several open-source resources (links are provided):

Montgomery Lung Segmentation Dataset
JSRT/SCR Lung Segmentation Dataset
COVID-19 CXR
Coronahack dataset

Data processing prepares the segmentation masks, combines the images from all the different datasets, and splits the images according to the ratio mentioned in the paper. The data processing notebook is provided in the /data/ folder.

Segmentation split (file names and paths provided in /splits/Segmentaion)

Classification split (file names and paths provided in /splits/Classification)

Training

Modify the hyperparameters in the config.ini file.
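
For orientation, here is a hedged sketch of how an INI file can be read with Python's standard `configparser`. The section names (`SEG`, `CLASS`), keys, and values below are placeholders invented for illustration; check the actual config.ini in this repository for the real names.

```python
import configparser

# Placeholder contents; the real config.ini may use different sections and keys.
EXAMPLE_INI = """
[SEG]
learning_rate = 0.0001
batch_size = 2

[CLASS]
learning_rate = 0.00001
n_patches = 25
"""

def load_hyperparameters(text):
    """Parse INI text into a {section: {key: value}} dict of strings."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}

params = load_hyperparameters(EXAMPLE_INI)
# configparser returns strings, so numeric values need an explicit cast.
n_patches = int(params["CLASS"]["n_patches"])
```

To read from disk instead, `parser.read("config.ini")` can replace `read_string`.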

Training the segmentation network:

python train.py --train 'SEG'

Generating the masks:

python generate_masks.py --file '../splits/Classification/combined.csv'

Training the classification network:

python train.py --train 'CLASS'

Testing

python inference.py --file '../splits/Classification/test.csv'
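
As described in the introduction, the label for each image is decided by a majority vote over its patch predictions. A minimal sketch of that aggregation step follows; the function name `majority_vote` and the example class labels are hypothetical and may not match what inference.py does internally.

```python
from collections import Counter

def majority_vote(patch_predictions):
    """Return the most frequent predicted label among an image's patches.

    Ties are resolved in favor of the label that appears first in the input,
    which is how Counter.most_common orders equal counts.
    """
    if not patch_predictions:
        raise ValueError("no patch predictions given")
    counts = Counter(patch_predictions)
    return counts.most_common(1)[0][0]

# Example with hypothetical class names for one image's patches.
final_label = majority_vote(["covid", "normal", "covid", "covid", "normal"])
```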
