This repository provides the code for our paper MedCycle (Findings of NAACL 2024).
conda create --name medcycle python=3.7
conda activate medcycle
conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=10.2 -c pytorch
pip install pandas scikit-learn pycocoevalcap tqdm
pip install wandb # optional
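After installing, it can help to confirm that the packages are visible from the active environment. The snippet below is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and the module names are the usual import names for the packages listed above (scikit-learn imports as `sklearn`). It only checks that each module can be located; it does not verify versions or CUDA support.

```python
import importlib.util
from typing import Iterable, List

# Import names for the packages installed above.
# Note: the scikit-learn package is imported as "sklearn".
REQUIRED = ("torch", "torchvision", "torchaudio",
            "pandas", "sklearn", "pycocoevalcap", "tqdm")
OPTIONAL = ("wandb",)


def find_missing(modules: Iterable[str]) -> List[str]:
    """Return the names in `modules` that cannot be located in this environment.

    Hypothetical helper for illustration; it locates modules without importing them.
    """
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing_required = find_missing(REQUIRED)
    missing_optional = find_missing(OPTIONAL)
    if missing_required:
        print("Missing required packages:", ", ".join(missing_required))
    else:
        print("All required packages were found.")
    if missing_optional:
        print("Missing optional packages:", ", ".join(missing_optional))
```

Run it inside the activated medcycle environment; a missing wandb entry matters only if you plan to use Weights & Biases monitoring.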
- Download MIMIC-CXR, CheXpert and PadChest datasets.
- Follow the preprocessing steps in data_preprocessing.
<CSV_PATH> is the path to the .csv file generated by the preprocessing steps. Because the datasets are unpaired, the training and test sets use different image directories: <IMAGE_DIR_TRAIN> and <IMAGE_DIR_TEST>.
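Before launching a run, the three paths can be sanity-checked. The following is a rough sketch under stated assumptions: `check_paths` is a hypothetical helper that is not part of the repository, and treating identical training and test directories as a problem is our reading of the note that the two directories differ, not a rule enforced by the code.

```python
import os
from typing import List


def check_paths(csv_path: str, image_dir_train: str, image_dir_test: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means the paths look fine.

    Hypothetical pre-flight check, not part of the MedCycle code base.
    """
    problems = []

    # The annotation file should exist and be the .csv produced by preprocessing.
    if not os.path.isfile(csv_path):
        problems.append("annotation file not found: " + csv_path)
    elif not csv_path.lower().endswith(".csv"):
        problems.append("annotation file is not a .csv: " + csv_path)

    # Both image directories should exist.
    for label, path in (("training", image_dir_train), ("test", image_dir_test)):
        if not os.path.isdir(path):
            problems.append(label + " image directory not found: " + path)

    # Assumption: since the datasets are unpaired, the two directories should differ.
    if os.path.abspath(image_dir_train) == os.path.abspath(image_dir_test):
        problems.append("training and test image directories are identical")

    return problems
```

Calling `check_paths(csv_path, train_dir, test_dir)` and printing the returned list gives a quick summary of anything that needs fixing before training starts.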
Train
python main.py --ann_path <CSV_PATH> --image_dir <IMAGE_DIR_TRAIN> --image_dir_test <IMAGE_DIR_TEST>
Note: set --wandb 1 to enable Weights & Biases monitoring (this requires the optional wandb package).
Continue from checkpoint
python main.py --ann_path <CSV_PATH> --image_dir <IMAGE_DIR_TRAIN> --image_dir_test <IMAGE_DIR_TEST> --resume <CHECKPOINT_PATH>
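For scripted or repeated runs, the two commands above can be assembled programmatically. This is a minimal sketch that uses only the flags shown in this README (`--ann_path`, `--image_dir`, `--image_dir_test`, `--resume`, `--wandb`). The function name `build_command` and the example paths are hypothetical placeholders, and `main.py` may accept other options not covered here.

```python
import shlex
from typing import List, Optional


def build_command(ann_path: str,
                  image_dir: str,
                  image_dir_test: str,
                  resume: Optional[str] = None,
                  use_wandb: bool = False) -> List[str]:
    """Assemble the main.py invocation as an argument list.

    Hypothetical helper; it only mirrors the flags documented in this README.
    """
    cmd = ["python", "main.py",
           "--ann_path", ann_path,
           "--image_dir", image_dir,
           "--image_dir_test", image_dir_test]
    if resume is not None:
        # Continue training from an existing checkpoint.
        cmd += ["--resume", resume]
    if use_wandb:
        # Enable Weights & Biases monitoring.
        cmd += ["--wandb", "1"]
    return cmd


if __name__ == "__main__":
    # Placeholder paths for illustration only; substitute your own.
    cmd = build_command("annotations.csv", "images/train", "images/test")
    print(" ".join(shlex.quote(part) for part in cmd))
    # To launch from the repository root, pass the list to
    # subprocess.run(cmd, check=True) instead of printing it.
```

Building the command as a list rather than a single string avoids quoting problems when paths contain spaces.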
Please consider citing our paper if the project helps your research.
@inproceedings{hirsch-etal-2024-medcycle,
title = "{M}ed{C}ycle: Unpaired Medical Report Generation via Cycle-Consistency",
author = "Hirsch, Elad and
Dawidowicz, Gefen and
Tal, Ayellet",
editor = "Duh, Kevin and
Gomez, Helena and
Bethard, Steven",
booktitle = "Findings of the Association for Computational Linguistics: NAACL 2024",
month = jun,
year = "2024",
address = "Mexico City, Mexico",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2024.findings-naacl.125",
pages = "1929--1944",
}
We thank the authors of R2GEN-CMN, both for their research and for sharing their code. Our repository is built upon their project.