17Esther/SMU-Encoder-Classification-Model
Multi-Modal Classification of Upper Gastrointestinal Submucosal Lesions with Missing Modality

Introduction

This project proposes the SMU-Encoder Classification Model, a multi-modal classification network designed to address missing modality challenges in upper gastrointestinal submucosal lesion classification.
The model integrates style-content feature extraction through the SMUEncoder module and applies a Perceiver Transformer architecture for classification tasks.

A mix-mode training strategy (70% full-modality, 30% missing-modality simulation) is employed to improve robustness while maintaining high classification performance.
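The mix-mode strategy above can be sketched as per-sample modality dropout. This is a minimal illustration, not the repository's actual code: the `apply_mix_mode` name, the dict keys, and replacing a dropped modality with `None` are all assumptions; only the 30% probability comes from the description above.

```python
import random

MISSING_PROB = 0.3  # 30% of training samples simulate a missing modality

def apply_mix_mode(sample, rng=random):
    """Randomly blank out one modality to simulate missing-modality input.

    `sample` is assumed to be a dict with 'eus' and 'ing' entries; a
    dropped modality is replaced with None here (the real code might
    instead substitute zeros or a learned placeholder embedding).
    """
    if rng.random() < MISSING_PROB:
        dropped = rng.choice(["eus", "ing"])
        sample = dict(sample, **{dropped: None})
    return sample

# With a fixed seed, the roughly 70/30 split is easy to check empirically:
random.seed(0)
samples = [apply_mix_mode({"eus": 1, "ing": 1}) for _ in range(10_000)]
n_missing = sum(s["eus"] is None or s["ing"] is None for s in samples)
frac = n_missing / 10_000
```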

The two imaging modalities used in this project are:

  • Endoscopic Ultrasound (EUS)
  • Endoscopy (OGD, referred to as ING in the code)

For full technical details and methodology, please refer to the [final year thesis].


Experiment

The experiments are conducted on an Ubuntu 20.04 system with:

  • NVIDIA RTX 4090 GPU (24GB)
  • Intel(R) Xeon(R) Platinum 8352V CPU (16 cores, 2.10GHz)
  • 120GB system memory

The project uses Python 3.8 and PyTorch 1.11.0+cu113, with the following key packages:

| Package | Version |
| ------- | ------- |
| torch | 1.11.0+cu113 |
| torchvision | 0.12.0+cu113 |
| torchsummary | 1.5.1 |
| einops | 0.7.0 |
| numpy | 1.22.4 |
| pandas | 2.0.3 |
| matplotlib | 3.5.2 |
| seaborn | 0.13.2 |
| pytorch-lamb | 1.0.0 |
| scikit-learn | 1.3.2 |
| Pillow | 9.1.1 |
| tqdm | 4.61.2 |
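One possible way to install these pinned versions (assuming pip and the PyTorch cu113 wheel index; adjust if your environment manages dependencies differently):

```shell
# CUDA 11.3 wheels for torch/torchvision come from the PyTorch extra index.
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 \
    --extra-index-url https://download.pytorch.org/whl/cu113

# Remaining packages, pinned to the versions listed above.
pip install torchsummary==1.5.1 einops==0.7.0 numpy==1.22.4 pandas==2.0.3 \
    matplotlib==3.5.2 seaborn==0.13.2 pytorch-lamb==1.0.0 \
    scikit-learn==1.3.2 Pillow==9.1.1 tqdm==4.61.2
```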

File Composition

| File / Folder | Purpose |
| ------------- | ------- |
| data_process.py | Preprocess raw dataset images and masks |
| dataset.py | Dataset loading and transformation logic |
| make_json.py | Generate structured JSON file pairs for training/testing |
| train.py | Model training, testing, and evaluation |
| perceiver.py | Perceiver Transformer model implementation |
| vit.py | Vision Transformer (ViT) model implementation |
| image_fusion.py | Fusion of images with corresponding ROI masks |
| roc.py | Generate ROC curves and calculate AUC scores |
| show.py | Visualize and restore images from tensor format |
| visualize.py | Result comparison and visualization of multiple models |
| generate_mask_new.py | Generate binary mask images from labeled JSON files |

The test1_ to test6_ folders contain different experiment variants (multi-modal / single-modal, with/without masks).
Raw data is under ./Dataset/train_Dataset and ./Dataset/test_Dataset. Processed data is stored in ./Processed_Train and ./Processed_Test.
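The JSON pairs produced by make_json.py can be pictured as records like the one below. This is a sketch only: the field names and paths are hypothetical, since the source does not spell out the schema.

```python
import json

# Hypothetical shape of one training pair generated by make_json.py;
# the actual field names in the repository may differ.
record = {
    "eus": "Processed_Train/EUS/case_001.png",   # Endoscopic Ultrasound image
    "ing": "Processed_Train/ING/case_001.png",   # OGD (ING) endoscopy image
    "label": 0,                                  # lesion class index
}

# Such records round-trip through JSON without loss:
restored = json.loads(json.dumps(record))
```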


Usage

Note: In the code, ING refers to OGD (endoscopic images). You can treat these terms as interchangeable.

Step 1: Preprocess Dataset

Run:

python data_process.py

The preprocessed data will be saved under Processed_Train and Processed_Test.
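The core loop of such a preprocessing script is mirroring a raw tree into a processed tree, file by file. The sketch below shows only that skeleton with a pluggable transform; the real data_process.py additionally resizes images and applies ROI masks, which this library-free version omits.

```python
import shutil
import tempfile
from pathlib import Path

def preprocess_dataset(src_dir, dst_dir, transform=None):
    """Mirror `src_dir` into `dst_dir`, applying `transform` to each file.

    A stand-in for the kind of loop data_process.py runs; `transform`
    defaults to a plain copy so the sketch needs no imaging library.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    for path in src.rglob("*"):
        if path.is_file():
            out = dst / path.relative_to(src)
            out.parent.mkdir(parents=True, exist_ok=True)
            if transform is None:
                shutil.copy2(path, out)
            else:
                transform(path, out)

# Quick self-contained demo on a throwaway directory:
tmp = Path(tempfile.mkdtemp())
(tmp / "raw" / "EUS").mkdir(parents=True)
(tmp / "raw" / "EUS" / "a.png").write_bytes(b"fake image bytes")
preprocess_dataset(tmp / "raw", tmp / "processed")
copied = (tmp / "processed" / "EUS" / "a.png").exists()
```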


Step 2: Set Data Paths

In train.py, update the data paths within the make_json.generate_split() function to match your local dataset locations (inside Processed_Train and Processed_Test).
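A small guard like the following can catch a mistyped path before training starts. It is an optional addition, not part of the repository; only the `Processed_Train`/`Processed_Test` directory names come from Step 1.

```python
from pathlib import Path

def check_data_roots(*roots):
    """Return the subset of `roots` that do not exist on disk.

    Useful as a sanity check before make_json.generate_split() runs
    with freshly edited paths.
    """
    return [r for r in roots if not Path(r).is_dir()]

missing = check_data_roots("Processed_Train", "Processed_Test")
if missing:
    print(f"Missing data directories: {missing}")
```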


Step 3: Train and Test the Model

Example command:

python train.py -l 0.00078 -e 30

  • -l: Learning rate (default: 0.00078)
  • -e: Number of epochs (default: 30)
  • Batch size is configured as 8 in the code.
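The two flags above correspond to an argparse setup along these lines. The parser below is a hypothetical reconstruction (the long option names `--lr` and `--epochs` are assumptions); only the short flags and defaults come from the description.

```python
import argparse

def build_parser():
    """Parser mirroring the train.py flags documented above."""
    parser = argparse.ArgumentParser(description="SMU-Encoder training")
    parser.add_argument("-l", "--lr", type=float, default=0.00078,
                        help="learning rate")
    parser.add_argument("-e", "--epochs", type=int, default=30,
                        help="number of training epochs")
    return parser

args = build_parser().parse_args(["-l", "0.00078", "-e", "30"])
```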

The function same_seeds() ensures reproducibility by setting fixed random seeds.
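A seed-fixing helper of this kind typically looks like the sketch below. Only the stdlib part is shown so it stays self-contained; the repository's `same_seeds()` presumably also seeds numpy and torch (`torch.manual_seed`, `torch.cuda.manual_seed_all`) and sets `torch.backends.cudnn.deterministic = True`, which is the standard recipe for PyTorch reproducibility.

```python
import os
import random

def same_seeds(seed):
    """Seed the stdlib RNG and hashing for reproducible runs.

    Sketch only: the project's version would additionally seed
    numpy and torch as noted above.
    """
    random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)

# Re-seeding reproduces the same random sequence:
same_seeds(42)
first = [random.random() for _ in range(3)]
same_seeds(42)
second = [random.random() for _ in range(3)]
```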


Step 4: Visualize Results

After training, result files (test_results.txt, train_metrics.txt, confusion matrices, and optionally ROC curves) will be generated automatically.

To visualize the model comparison, use:

python visualize.py

To visualize fused input images:

python show.py
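Restoring a displayable image from a network input tensor is essentially undoing the channel-wise `(x - mean) / std` normalization, which is roughly what show.py does. The mean/std values below are placeholders, not the project's actual statistics.

```python
# Placeholder normalization statistics; the real values depend on how
# dataset.py normalizes the EUS/ING images.
MEAN, STD = 0.5, 0.25

def denormalize(values, mean=MEAN, std=STD):
    """Map normalized values back to [0, 1] pixel intensities."""
    return [min(max(v * std + mean, 0.0), 1.0) for v in values]

pixels = denormalize([-2.0, 0.0, 2.0])
```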

Contact

For any questions, please contact:
Li Jiawei (李佳蔚)
📧 [email protected]
