- Model files were too large, so they were removed from the code submission.
- Model files are committed at:
  - `classification/model_originial/densenet.pth`: original CheXpert dataset
  - `classification/model_localized/densenet.pth`: localized dataset
  - `localization/code/trained_model.hdf5`: pre-trained model
  - `localization/code/model.009.hdf5`: self-trained model
Localization contains code for extracting localized lung region images from the input chest X-ray images.
- We used this [U-Net implementation](https://github.com/imlab-uiip/lung-segmentation-2d) and modified its code where needed for data loading and prediction.
- We also used its pre-trained model for prediction.
- Set up the conda environment using `conda env create -f <environment yml file>`:
  - To run on CPU, use `code/env_cpu/environment.yml` as the environment yml file.
  - To run on GPU, use `code/env_gpu/environment.yml` as the environment yml file.
- JSRT dataset of 247 chest X-ray images
- Corresponding left and right lung region masks from the SCR database
- CheXpert images for which the model predicts lung region masks
- Download the data from the links above and set the path to the downloaded data in the code before triggering a run.
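
Because the dataset paths are set by hand in the code, a quick check before a run can catch a missing or mistyped path early. The following is a minimal sketch, not part of the repository: the helper name `find_missing_dirs` and the dictionary layout are hypothetical, and the example paths are placeholders.

```python
from pathlib import Path


def find_missing_dirs(paths):
    """Return the names of configured paths that are not existing directories.

    `paths` maps a variable name (hypothetical here) to the directory
    it is expected to point at.
    """
    return [name for name, location in paths.items() if not Path(location).is_dir()]


if __name__ == "__main__":
    # Placeholder values; replace them with the real download locations.
    configured = {
        "jsrt_path": "/data/jsrt",
        "left_lungs_mask_path": "/data/scr/left",
        "right_lungs_mask_path": "/data/scr/right",
    }
    missing = find_missing_dirs(configured)
    if missing:
        print("Missing directories for:", ", ".join(missing))
```

An empty result means every configured directory exists, so the run can be triggered.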
- Run `code/preprocess.py`:
  - to perform histogram equalization on the JSRT chest X-ray images
  - to combine the left and right lung masks into a single image
  - replace the `jsrt_path` variable with the JSRT dataset path
  - replace `left_lungs_mask_path` with the path of the left lung mask images from the SCR database
  - replace `right_lungs_mask_path` with the path of the right lung mask images from the SCR database
  - replace the `preprocess_output_path` variable with the pre-processing output directory. This directory must be created before running the code.
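
The two preprocessing operations can be illustrated with a short NumPy sketch. This is not the code in `code/preprocess.py`; it assumes 8-bit grayscale arrays and binary masks stored as 0/255, and the function names `equalize_histogram` and `combine_masks` are hypothetical.

```python
import numpy as np


def equalize_histogram(image):
    """Histogram-equalize an 8-bit grayscale image (assumed dtype uint8)."""
    if image.dtype != np.uint8:
        raise ValueError("this sketch only handles uint8 images")
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # CDF value at the darkest occupied gray level
    total = cdf[-1]
    if total == cdf_min:
        return image.copy()  # constant image: nothing to spread out
    # Map each gray level through the normalized CDF onto the range 0..255.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]


def combine_masks(left_mask, right_mask):
    """Merge left and right lung masks into one mask (pixel-wise maximum)."""
    return np.maximum(left_mask, right_mask)
```

Taking the pixel-wise maximum acts as a logical OR for 0/255 masks, so a pixel belongs to the combined mask if it belongs to either lung.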
- Run `code/train_model.py` to train the U-Net model that generates lung masks, with:
  - preprocessed JSRT chest X-ray images as X (input)
  - the preprocessed combined left and right lung mask images as Y (target)
  - To run the file, replace the `path` variable with the JSRT dataset path.
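
Training needs each input image (X) matched to its combined mask (Y). The sketch below shows one way to build those pairs, assuming that an image and its mask share a file name and live in two separate directories. That layout, the `.png` suffix, and the helper name `pair_images_with_masks` are assumptions for illustration, not taken from `code/train_model.py`.

```python
from pathlib import Path


def pair_images_with_masks(image_dir, mask_dir, suffix=".png"):
    """Return sorted (image_path, mask_path) pairs that share a file name."""
    masks = {p.name: p for p in Path(mask_dir).glob(f"*{suffix}")}
    pairs = []
    for image in sorted(Path(image_dir).glob(f"*{suffix}")):
        mask = masks.get(image.name)
        if mask is None:
            # Fail loudly rather than train on an image without a target.
            raise FileNotFoundError(f"no mask found for {image.name}")
        pairs.append((image, mask))
    return pairs
```

Raising on a missing mask keeps X and Y aligned, which matters because a silent mismatch would pair images with the wrong targets.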
- Run `code/inference.py` to use the model to generate lung masks from CheXpert images. To run the file:
  - replace the `path` variable with the CheXpert dataset path
  - set the `batch` variable to 'train' or 'valid'
  - the best-performing model is committed as `model.009.hdf5`
  - the pre-trained model is committed as `trained_model.hdf5`
  - replace
Classification contains code for predicting diseases from images with labels.
- Set up the conda environment using `conda env create -f <environment yml file>`, where the environment file is `code/environment.yml`.
- CheXpert overlay dataset containing 28,929 chest X-ray images
- CheXpert original dataset corresponding to the above 28,929 overlay images
- Download the data from the links above and set the path to the downloaded data in the code before triggering a run.
- Run `code/python etl_chexpert_data.py -h` to get the full command usage:
  - usage: `etl_chexpert_data.py [-h] -c CSV_PATH -p PREFIX_PATH -d DEST_PATH -o OVERLAY`
  - description: divide the data in train validate and test
  - optional arguments:
    - `-h, --help`: show this help message and exit
    - `-c CSV_PATH`: Path to file containing file name and labels
    - `-p PREFIX_PATH`: Path to directory containing image dataset
    - `-d DEST_PATH`: Path to output directory
    - `-o OVERLAY`: Is overlay images Y for yes N for no!
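
For readers who want to see how such a command line fits together, here is a minimal sketch that reproduces the documented flags with `argparse` and splits rows into three subsets. It is not the actual `etl_chexpert_data.py`: the attribute names, the `Y`/`N` validation, the 80/10/10 split ratios, and the fixed seed are all assumptions, since the README does not state them.

```python
import argparse
import random


def build_parser():
    """Build a parser with the flags listed in the usage text above."""
    parser = argparse.ArgumentParser(
        prog="etl_chexpert_data.py",
        description="divide the data in train validate and test",
    )
    parser.add_argument("-c", dest="csv_path", required=True,
                        help="Path to file containing file name and labels")
    parser.add_argument("-p", dest="prefix_path", required=True,
                        help="Path to directory containing image dataset")
    parser.add_argument("-d", dest="dest_path", required=True,
                        help="Path to output directory")
    # Restricting the value to Y/N is an assumption of this sketch.
    parser.add_argument("-o", dest="overlay", required=True,
                        choices=["Y", "N"], metavar="OVERLAY",
                        help="Is overlay images Y for yes N for no")
    return parser


def split_rows(rows, train_frac=0.8, valid_frac=0.1, seed=0):
    """Shuffle rows and split them into train, validation, and test lists.

    The fractions are hypothetical defaults, not the project's ratios.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_valid = int(len(shuffled) * valid_frac)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])
```

Calling `build_parser().parse_args(["-c", "labels.csv", "-p", "data", "-d", "out", "-o", "Y"])` yields a namespace with `csv_path`, `prefix_path`, `dest_path`, and `overlay` attributes.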
- Run `code/python train_densenet.py -h` to get the full command usage:
  - usage: `train_densenet.py [-h] -p PREFIX_PATH`
  - description: train densenet 121 with 9 epocs and batch size of 50
  - optional arguments:
    - `-h, --help`: show this help message and exit
    - `-p PREFIX_PATH`: Path to directory containing image dataset