Pre-clinical drug discovery (PDD) suffers from low efficiency. One reason is the lack of infrastructure for evaluating cross-drug efficacy at the patient level. Here we propose the Patient Multi-Drug Learning (P-MDL) task and construct the P-MDL dataset and model zoo. The best P-MDL model, DSN-adv, achieves state-of-the-art (SOTA) performance on all 13 tumor types compared with previous SOTA models.
You can also check out our podcast and watch our intro videos (in both Chinese and English) for a quick overview.
Artificial intelligence (AI) models used for drug response prediction (DRP) tasks are generally classified into Single-Drug Learning (SDL) and Multi-Drug Learning (MDL) paradigms. SDL paradigms have been adapted to the patient level, but they evaluate within-drug response and disregard tumor types. However, treatment response and survival outcomes differ substantially among tumor types, which indicates that tumor type is a crucial confounding factor that cannot be overlooked when predicting drug response. Additionally, SDL paradigms fail to assess cross-drug response, while MDL paradigms are currently limited to the cell line level. Therefore, we propose the P-MDL approach, which aims to achieve a comprehensive view of drug response at the patient level.
We constructed the first P-MDL dataset from publicly available data. Only tumor types with relatively sufficient data were retained, which left 13 tumor types in the P-MDL dataset.
The P-MDL model zoo includes eight models employing different transfer learning methods:
| P-MDL models | Description |
|---|---|
| ae | Autoencoder that encodes the gene expression profiles (GEPs) of both cell lines and patients. |
| ae-mmd | ae model with an additional mmd-loss. |
| ae-adv | ae model with an additional adv-loss. |
| dsn | Domain separation network, which has been successfully applied in computer vision. Here, it encodes the GEPs of cell lines and patients. |
| dsn-adv | dsn model with an additional adv-loss. |
| dsrn | A variant of the dsn model. |
| dsrn-mmd | dsrn model with an additional mmd-loss. |
| dsrn-adv | dsrn model with an additional adv-loss. |
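
To give a feel for what an mmd-loss measures, here is a minimal sketch, assuming mmd-loss refers to the maximum mean discrepancy between the latent codes of cell lines and patients. It is a plain-Python illustration with a Gaussian kernel and made-up toy vectors, not the implementation used in this repository; the function names and the `gamma` parameter are hypothetical.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd2(source, target, gamma=1.0):
    """Biased estimate of the squared MMD between two sets of vectors."""

    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2 * mean_kernel(source, target)
    )


# Toy latent codes (made-up numbers, for illustration only).
cell_line_codes = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]]
patient_codes_near = [[0.1, 0.1], [0.0, 0.2], [0.2, 0.1]]
patient_codes_far = [[2.0, 2.1], [2.2, 1.9], [1.9, 2.0]]

print("aligned domains:", mmd2(cell_line_codes, patient_codes_near))
print("shifted domains:", mmd2(cell_line_codes, patient_codes_far))
```

A smaller value means the two sets of codes are harder to tell apart, which is what a transfer learning model wants when moving from cell lines to patients.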
One of the P-MDL models (DSN-adv) outperforms all of the P-SDL and C-MDL models across all tumor types.
To further validate the P-MDL models and demonstrate their potential in PDD applications, we used the test-pairwise pre-trained DSN-adv model to screen 233 small molecules for patients of 13 tumor types. Taking COAD as an example, most drugs were ineffective, but a few showed potential efficacy for over half of COAD patients.
To set up the environment for basic usage of the P-MDL task, run the following commands:
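
The screening summary described above can be sketched as follows. This is only an illustration under stated assumptions: the predicted scores are invented, the drug names are placeholders, and the 0.5 score threshold is a hypothetical choice rather than the cutoff used in this repository.

```python
def effective_fraction(scores, threshold=0.5):
    """Per drug, the fraction of patients whose predicted score exceeds threshold."""
    fractions = {}
    for drug, patient_scores in scores.items():
        hits = sum(1 for s in patient_scores if s > threshold)
        fractions[drug] = hits / len(patient_scores)
    return fractions


def majority_effective(scores, threshold=0.5):
    """Drugs predicted effective for more than half of the patients, best first."""
    fractions = effective_fraction(scores, threshold)
    selected = [drug for drug, frac in fractions.items() if frac > 0.5]
    return sorted(selected, key=lambda drug: -fractions[drug])


# Invented predictions for four patients of one tumor type (placeholder names).
predicted = {
    "drug_A": [0.9, 0.8, 0.7, 0.2],
    "drug_B": [0.1, 0.2, 0.6, 0.3],
    "drug_C": [0.4, 0.3, 0.2, 0.1],
}

print(effective_fraction(predicted))
print(majority_effective(predicted))
```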
conda create -n P-MDL python=3.8
conda activate P-MDL
conda install -c conda-forge rdkit
pip install torch
pip install pandas
pip install numpy
pip install scikit-learn

Here, we provide instructions for a quick start.
Download the data from Zenodo here, then move the downloaded files to the data/ folder.
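
Before launching the scripts, it can help to confirm that the packages installed above are importable and that the data folder is populated. The sketch below uses only the Python standard library; the helper names are hypothetical, and it assumes you run it from the directory that contains data/.

```python
import importlib.util
from pathlib import Path

# Import names of the packages installed in the setup step
# (scikit-learn is imported as "sklearn").
REQUIRED_MODULES = ["rdkit", "torch", "pandas", "numpy", "sklearn"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def data_dir_ready(path="data"):
    """True if the folder exists and contains at least one entry."""
    folder = Path(path)
    return folder.is_dir() and any(folder.iterdir())


print("Missing packages:", missing_modules(REQUIRED_MODULES))
print("data/ populated:", data_dir_ready())
```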
Run:
cd code
nohup bash run_pretrain.sh 1>benchmark.txt 2>&1 &
less benchmark.txt

The benchmark.txt file will look like the following:
Processing next task: 8 20230718-043332
All-data pre-training (ADP) model runing!
Start to run Test-pairwise pre-training (TPP) model!
...

The bash file run_pretrain.sh runs the script P_MDL.py with suitable parameter settings. You can also find the model log output in the records/ folder and the model evaluation results in the results/ folder.
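
To follow progress in a long log, the task lines can be pulled out with a short script. This is a sketch based only on the sample output shown above: it assumes the first field after "Processing next task:" is a task identifier and the second is a timestamp in YYYYMMDD-HHMMSS form, which the repository does not document here. The function name is hypothetical.

```python
import re
from datetime import datetime

# Pattern inferred from the sample log line; the field meanings are assumptions.
TASK_LINE = re.compile(r"^Processing next task:\s+(\S+)\s+(\d{8}-\d{6})\s*$")


def parse_task_lines(lines):
    """Return (task_id, start_time) pairs for every matching log line."""
    tasks = []
    for line in lines:
        match = TASK_LINE.match(line.strip())
        if match:
            started = datetime.strptime(match.group(2), "%Y%m%d-%H%M%S")
            tasks.append((match.group(1), started))
    return tasks


sample = [
    "Processing next task: 8 20230718-043332",
    "All-data pre-training (ADP) model runing!",
    "Start to run Test-pairwise pre-training (TPP) model!",
]

for task_id, started in parse_task_lines(sample):
    print(task_id, started.isoformat())
```

To apply it to a real run, pass the lines of the log file instead, for example `parse_task_lines(open("benchmark.txt"))`.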
Run:
cd code
nohup bash run_pretrain_for_pdr.sh 1>pdr_pretrain.txt 2>&1 &
less pdr_pretrain.txt

The pdr_pretrain.txt file will look like the following:
Processing next task: 8 20230718-043332
All-data pre-training (ADP) model runing!
Start to run Test-pairwise pre-training (TPP) model!
...

For PDD applications, we need to predict the efficacy of every drug for every patient. We therefore set the parameter --select_drug_method all in the bash file run_pretrain_for_pdr.sh to obtain a model that can be used for all-drug response prediction.
Then run the bash file run_pdr_task.sh with nohup bash run_pdr_task.sh 1>pdr_task.txt 2>&1 &, which calls the script PDR_task.py to predict the efficacy of every drug for every patient.
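
Because both commands are launched in the background, it is easy to start the second before the first has finished. The sketch below runs the two steps one after the other and stops at the first failure. It assumes that run_pdr_task.sh depends on the models produced by run_pretrain_for_pdr.sh and that both scripts live in the code/ directory; `run_steps` and `PDR_STEPS` are hypothetical names, not part of this repository.

```python
import subprocess

# (script, log file) pairs taken from the commands above.
PDR_STEPS = [
    ("run_pretrain_for_pdr.sh", "pdr_pretrain.txt"),
    ("run_pdr_task.sh", "pdr_task.txt"),
]


def run_steps(steps, runner=subprocess.run):
    """Run each script in order, logging to its file; stop at the first failure."""
    completed = []
    for script, log_file in steps:
        with open(log_file, "w") as log:
            # Send both stdout and stderr to the log, like `1>file 2>&1`.
            result = runner(["bash", script], stdout=log, stderr=subprocess.STDOUT)
        if result.returncode != 0:
            raise RuntimeError(
                f"{script} failed with exit code {result.returncode}; see {log_file}"
            )
        completed.append(script)
    return completed
```

Calling `run_steps(PDR_STEPS)` from inside code/ would then run the pre-training followed by the PDR task. The `runner` argument exists only so the function can be exercised without launching the real scripts.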
As this is a pre-alpha release, we look forward to user feedback to help us improve our framework. If you have any questions or suggestions, please open an issue or contact [[email protected]].
If you find our open-sourced code & models helpful to your research, please consider giving this repo a star🌟 and citing📑 the following article. Thank you for your support!
@misc{P_MDL_code,
author={Yushuai Wu},
title={Code of Multi-Drug-Transfer-Learning},
year={2025},
howpublished={\url{https://github.com/wuys13/Multi-Drug-Transfer-Learning.git}}
}
If you encounter problems, feel free to create an issue!



