This project proposes the SMU-Encoder Classification Model, a multi-modal classification network designed to address missing modality challenges in upper gastrointestinal submucosal lesion classification.
The model integrates style-content feature extraction through the SMUEncoder module and applies a Perceiver Transformer architecture for classification tasks.
A mix-mode training strategy (70% full-modality, 30% missing-modality simulation) is employed to improve robustness while maintaining high classification performance.
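The missing-modality simulation can be pictured as randomly blanking one modality during a fraction of training steps. Below is a minimal self-contained sketch of that idea; the function name, argument names, and the use of zeroing as the masking operation are illustrative assumptions, not the repo's actual API (see the thesis for the exact mix-mode strategy).

```python
import random


def apply_modality_dropout(eus, ing, p_missing=0.3):
    """Mix-mode sketch: with probability p_missing, blank one randomly
    chosen modality. Zero-filling stands in for whatever masking the
    model actually uses; names here are assumptions, not the repo's API."""
    if random.random() < p_missing:
        if random.random() < 0.5:
            eus = [0.0] * len(eus)  # simulate a missing EUS image
        else:
            ing = [0.0] * len(ing)  # simulate a missing ING (OGD) image
    return eus, ing
```

With `p_missing=0.3`, roughly 30% of samples see only one modality, matching the 70/30 split described above.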
The two imaging modalities used in this project are:
- Endoscopic Ultrasound (EUS)
- Endoscopy (OGD, referred to as ING in the code)
For full technical details and methodology, please refer to the [final year thesis].
The experiments are conducted on an Ubuntu 20.04 system with:
- NVIDIA RTX 4090 GPU (24GB)
- Intel(R) Xeon(R) Platinum 8352V CPU (16 cores, 2.10GHz)
- 120GB system memory
The project uses PyTorch 1.11.0+cu113 and Python 3.8, with the following key packages:
| Package | Version |
|---|---|
| torch | 1.11.0+cu113 |
| torchvision | 0.12.0+cu113 |
| torchsummary | 1.5.1 |
| einops | 0.7.0 |
| numpy | 1.22.4 |
| pandas | 2.0.3 |
| matplotlib | 3.5.2 |
| seaborn | 0.13.2 |
| pytorch-lamb | 1.0.0 |
| scikit-learn | 1.3.2 |
| Pillow | 9.1.1 |
| tqdm | 4.61.2 |
| File / Folder | Purpose |
|---|---|
| data_process.py | Preprocess raw dataset images and masks |
| dataset.py | Dataset loading and transformation logic |
| make_json.py | Generate structured JSON file pairs for training/testing |
| train.py | Model training, testing, and evaluation |
| perceiver.py | Perceiver Transformer model implementation |
| vit.py | Vision Transformer (ViT) model implementation |
| image_fusion.py | Fusion of images with corresponding ROI masks |
| roc.py | Generate ROC curves and calculate AUC scores |
| show.py | Visualize and restore images from tensor format |
| visualize.py | Result comparison and visualization of multiple models |
| generate_mask_new.py | Generate binary mask images from labeled JSON files |
The `test1_` to `test6_` folders contain different experiment variants (multi-modal / single-modal, with/without masks).

Raw data is under `./Dataset/train_Dataset` and `./Dataset/test_Dataset`. Processed data is stored in `./Processed_Train` and `./Processed_Test`.
Note: In the code, ING refers to OGD (endoscopic images). You can treat these terms as interchangeable.
Run:

```shell
python data_process.py
```

The preprocessed data will be saved under `Processed_Train` and `Processed_Test`.
In train.py, update the data paths within the make_json.generate_split() function to match your local dataset locations (inside Processed_Train and Processed_Test).
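The JSON pairing that `make_json.py` produces can be sketched as matching EUS and ING images by filename. The snippet below is a self-contained illustration only; the directory layout (`EUS/` and `ING/` subfolders) and the helper name `build_pairs` are assumptions, not the repo's actual `generate_split()` signature.

```python
import json
from pathlib import Path


def build_pairs(root):
    """Sketch of the pairing step: match <root>/EUS/<id>.png with
    <root>/ING/<id>.png and emit JSON-ready records. The layout and
    function name are illustrative assumptions."""
    eus_dir, ing_dir = Path(root) / "EUS", Path(root) / "ING"
    pairs = []
    for eus_path in sorted(eus_dir.glob("*.png")):
        ing_path = ing_dir / eus_path.name
        if ing_path.exists():  # keep only ids present in both modalities
            pairs.append({"EUS": str(eus_path), "ING": str(ing_path)})
    return pairs


def save_pairs(root, out_file):
    """Write the pair list to a JSON file for the training pipeline."""
    Path(out_file).write_text(json.dumps(build_pairs(root), indent=2))
```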
Example command:

```shell
python train.py -l 0.00078 -e 30
```

- `-l`: Learning rate (default: 0.00078)
- `-e`: Number of epochs (default: 30)
- Batch size is configured as 8 in the code.
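The flag handling in `train.py` presumably follows the standard `argparse` pattern; a minimal sketch matching the command above is shown below. The destination names `lr` and `epochs` are assumptions, not the repo's actual attribute names.

```python
import argparse

# Sketch of the CLI exposed by train.py; flags and defaults are taken
# from the example command, attribute names are assumptions.
parser = argparse.ArgumentParser(description="Train the SMU-Encoder model")
parser.add_argument("-l", "--lr", type=float, default=0.00078,
                    help="learning rate")
parser.add_argument("-e", "--epochs", type=int, default=30,
                    help="number of training epochs")

# Parse an explicit argument list here so the sketch runs standalone.
args = parser.parse_args(["-l", "0.001", "-e", "10"])
```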
The function same_seeds() ensures reproducibility by setting fixed random seeds.
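Conceptually, `same_seeds()` re-seeds every random number generator the pipeline touches so repeated runs draw identical values. The stdlib-only sketch below shows the idea; the repo's version additionally seeds NumPy and PyTorch (and CUDA), which is omitted here to keep the example self-contained.

```python
import random


def same_seeds(seed=0):
    """Reproducibility sketch: fix the stdlib RNG seed. The project's
    same_seeds() also seeds NumPy and torch (not shown here)."""
    random.seed(seed)
```

After re-seeding, the RNG replays the same sequence, which is what makes training runs comparable.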
After training, result files (test_results.txt, train_metrics.txt, confusion matrices, and optionally ROC curves) will be generated automatically.
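The confusion matrices in the output files tabulate true versus predicted classes; a minimal dependency-free sketch of that computation (the repo likely uses scikit-learn's equivalent) is:

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Minimal sketch of a confusion matrix:
    rows = true class, columns = predicted class."""
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m
```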
To visualize the model comparison, use:

```shell
python visualize.py
```

To visualize fused input images:

```shell
python show.py
```

For any questions, please contact:
Li Jiawei (李佳蔚)
📧 [email protected]