hi-UNI

Official code for "Prediction of molecular subtypes for endometrial cancer based on hierarchical foundation model", published in Bioinformatics (2025).

Journal link | Cite

hi-UNI (hierarchical UNI) classifies whole slide images using a weakly supervised pipeline. Our method achieved state-of-the-art performance, offering cost-effective and fast molecular subtyping for endometrial cancer.

Overview

Installation

Install the dependencies

pip install -r requirements.txt

Preprocessing

We have uploaded a separate repo for data preprocessing, WSI_Segmenter, which can also be found in the ./preprocess directory. The detailed patch extraction and segmentation steps are described in ./preprocess/readme.md.
Extract raw patches at a resolution of at least 1024x1024, using tiatoolbox or DeepZoom for patch extraction. The tumor segmentation network can easily be added to these pipelines.

Data preparation

Prepare the data in the following structure. Both png and jpeg formats are supported. Extracting patches only from the tumor region is recommended.

├── data
│   ├── slide_1
│   │   ├── patch_1.png
│   │   ├── patch_2.png
│   │   ├── ...
│   ├── slide_2
│   │   ├── patch_1.png
│   │   ├── patch_2.png
│   │   ├── ...
│   ├── ...
│   └── slide_n
│       ├── ...
│       └── patch_n.png
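
Before running the later steps, it can help to confirm that every slide folder actually contains patches. The following is a minimal sketch, not part of the repository: the function names `count_patches` and `empty_slides` are hypothetical, and it assumes the one-folder-per-slide layout shown above with png or jpeg files.

```python
from pathlib import Path

# Extensions matching the formats the README lists as supported (png or jpeg).
PATCH_EXTENSIONS = {".png", ".jpg", ".jpeg"}


def count_patches(data_dir):
    """Return {slide_name: number_of_patch_images} for each slide folder."""
    counts = {}
    for slide_dir in sorted(Path(data_dir).iterdir()):
        if not slide_dir.is_dir():
            continue  # ignore stray files next to the slide folders
        counts[slide_dir.name] = sum(
            1
            for f in slide_dir.iterdir()
            if f.is_file() and f.suffix.lower() in PATCH_EXTENSIONS
        )
    return counts


def empty_slides(counts):
    """Names of slide folders that contain no supported patch image."""
    return [name for name, n in counts.items() if n == 0]
```

For example, `empty_slides(count_patches("data"))` lists the slide folders that would contribute nothing to training.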

Create a hierarchical structure for the data.
```
python utils/create_hi_patches.py --input <INPUT_DIR> --output <OUTPUT_DIR> --how non-blank
```
--how: center (center-crop) or non-blank (selective sampling, proposed in the paper)
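
The exact selection rule is defined by utils/create_hi_patches.py and the paper. The sketch below only illustrates the general difference between the two modes under stated assumptions: a center crop always takes the middle window, while a selective strategy could score candidate windows and keep the one with the least background. The function names, the grid of candidates, and the threshold of 220 are all hypothetical and not taken from the repository.

```python
def center_box(width, height, crop):
    """(left, upper, right, lower) box of a crop x crop window at the image center."""
    left = (width - crop) // 2
    upper = (height - crop) // 2
    return (left, upper, left + crop, upper + crop)


def blank_fraction(gray, box, threshold=220):
    """Share of pixels inside box whose grayscale value is >= threshold.

    gray is a list of rows of 0-255 values. The threshold is an assumed
    cut-off for near-white background, not a value from the repository.
    """
    left, upper, right, lower = box
    total = (right - left) * (lower - upper)
    blank = sum(
        1
        for row in gray[upper:lower]
        for value in row[left:right]
        if value >= threshold
    )
    return blank / total


def non_blank_box(gray, crop, stride):
    """Among grid-aligned windows, pick the one with the least background."""
    height, width = len(gray), len(gray[0])
    boxes = [
        (x, y, x + crop, y + crop)
        for y in range(0, height - crop + 1, stride)
        for x in range(0, width - crop + 1, stride)
    ]
    return min(boxes, key=lambda b: blank_fraction(gray, b))
```

Both functions return a box in the (left, upper, right, lower) order that Pillow's `Image.crop` accepts, so either result could be passed to it directly.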
Organize your data as in example.csv, then create a k-fold split for the data.
```
python utils/gen_kfold_split.py --csv <CSV_PATH> --dir <STEP_2_OUTPUT_DIR> --k 5 --on slide
```
--on slide: split the data at the slide level

--on patient: split the data at the patient level (uses the name column)

A directory named kf will be created in the current directory.
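
The difference between the two --on modes matters when a patient has several slides: a patient-level split keeps all of that patient's slides in the same fold, so no patient appears in both training and validation data. The following is a minimal sketch of such a grouped split, not the logic of utils/gen_kfold_split.py. The function `kfold_by_group` and the `slide` key are hypothetical; `name` is the patient column mentioned above. The sketch does not stratify by class label.

```python
import random
from collections import defaultdict


def kfold_by_group(rows, group_key, k=5, seed=0):
    """Assign rows to folds 1..k so that rows sharing group_key stay together."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row)
    names = sorted(groups)
    random.Random(seed).shuffle(names)  # reproducible shuffle of the groups
    folds = {fold: [] for fold in range(1, k + 1)}
    for i, name in enumerate(names):
        # Deal the groups out round-robin, one whole group at a time.
        folds[i % k + 1].extend(groups[name])
    return folds
```

Calling it with `group_key="slide"` corresponds to a slide-level split, and `group_key="name"` to a patient-level split.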
Apply for access to the UNI model and download pytorch_model.bin.
Modify config.yaml to set the hyperparameters and the storage path of UNI:
- Hyperparameters: batch_size, lr, epochs, iters_to_val, save_best
- UNI config: freeze_ratio (for ViT blocks), cmb (hi-UNI combinations), UNI_path
- Task-specific config: class_names

Train and evaluate

Train a single fold (e.g., fold 1) and evaluate it on the validation set
```
python train.py --fold 1
```
Train & evaluate all folds (for Windows)
```
python ./scripts/train_kf.py
```
Train & evaluate all folds (for Linux)
```
sh ./scripts/train_kf.sh
```
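
Running all folds amounts to calling the single-fold command once per fold. The sketch below shows that idea in a platform-independent way. It is not the content of scripts/train_kf.py, and it assumes folds are numbered from 1 to k, which the README only shows for fold 1. `fold_commands` and `run_all` are hypothetical names.

```python
import subprocess
import sys


def fold_commands(k=5):
    """Build one `python train.py --fold <i>` command per fold, for i in 1..k."""
    return [
        [sys.executable, "train.py", "--fold", str(fold)]
        for fold in range(1, k + 1)
    ]


def run_all(k=5):
    """Run the folds one after another, stopping at the first failure."""
    for cmd in fold_commands(k):
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```

Running the folds sequentially keeps GPU memory use the same as a single-fold run.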

The results are saved in the runs/ directory, in the following format:

 ├── runs
 │   ├── {cmbs}_{freeze_ratio}  # configuration
 │   │   ├── 1  # fold name
 │   │   │   ├── {fold}_best.pth  # best model
 │   │   │   ├── slide_{iter}.png  # slide-level ROC
 │   │   │   ├── ...
 │   │   ├── ...
 │   ...
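
Given this layout, the best checkpoint of each fold can be collected with a short helper. This is a minimal sketch that assumes one `*_best.pth` file per fold folder, as in the tree above. `best_checkpoints` is a hypothetical name, not a function from the repository.

```python
from pathlib import Path


def best_checkpoints(config_dir):
    """Map each fold folder name to its *_best.pth file under one configuration folder."""
    found = {}
    for fold_dir in sorted(Path(config_dir).iterdir()):
        if not fold_dir.is_dir():
            continue  # skip files stored next to the fold folders
        matches = sorted(fold_dir.glob("*_best.pth"))
        if matches:
            found[fold_dir.name] = matches[0]
    return found
```

Each returned path can then be opened with `torch.load(path, map_location="cpu")`. Whether the file holds a state dict or a full model depends on how train.py saves it.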

Comparison experiments

We use CLAM for data preprocessing and feature extraction in the comparison experiments. We are grateful to the authors of the following models for sharing their code.

| Model | Authors | GitHub link |
| --- | --- | --- |
| CLAM | Lu et al. | https://github.com/mahmoodlab/CLAM |
| DTFD-MIL | Zhang et al. | https://github.com/hrzhang1123/DTFD-MIL |
| SETMIL | Zhao et al. | https://github.com/Louis-YuZhao/SETMIL |
| TransMIL | Shao et al. | https://github.com/szc19990412/TransMIL |
| im4MEC | Fremond et al. | https://github.com/AIRMEC/im4MEC |

License

Reference

If you find our work useful in your research, please consider citing our paper:

Haoyu Cui, Qinhao Guo, Jun Xu, Xiaohua Wu, Chengfei Cai, Yiping Jiao, Wenlong Ming, Hao Wen, Xiangxue Wang, Prediction of molecular subtypes for endometrial cancer based on hierarchical foundation model, Bioinformatics, 2025

@article{10.1093/bioinformatics/btaf059,
    author = {Cui, Haoyu and Guo, Qinhao and Xu, Jun and Wu, Xiaohua and Cai, Chengfei and Jiao, Yiping and Ming, Wenlong and Wen, Hao and Wang, Xiangxue},
    title = {Prediction of molecular subtypes for endometrial cancer based on hierarchical foundation model},
    journal = {Bioinformatics},
    pages = {btaf059},
    year = {2025},
    month = {02},
    issn = {1367-4811},
    doi = {10.1093/bioinformatics/btaf059},
    url = {https://doi.org/10.1093/bioinformatics/btaf059},
}
