scClone2DR is a probabilistic multi-modal framework for predicting drug responses at the level of individual tumour clones by integrating:
- single-cell RNA sequencing (scRNA-seq)
- single-cell DNA sequencing (scDNA-seq)
- ex-vivo drug-screening data

Key features:
- Multi-modal integration of scRNA-seq, scDNA-seq, and drug-screening data
- Probabilistic modelling using Pyro for Bayesian inference
- Clone-level drug response prediction in heterogeneous tumour populations
- Flexible training pipeline for real and simulated datasets
- Visualization and inference utilities for downstream analysis
You can install scClone2DR in three ways. We recommend using the VS Code Dev Container for the easiest and most reproducible workflow.
The easiest way to use scClone2DR is via a VS Code Dev Container:
- Make sure you have:
  - Visual Studio Code
  - Docker running
  - VS Code Dev Containers extension installed
  - Cloned this repository
- Open the project folder (the folder containing .devcontainer/ and notebooks/) in VS Code.
- Press Ctrl+Shift+P (or Cmd+Shift+P on macOS) → search Dev Containers: Reopen in Container → press Enter.
- VS Code will:
  - Pull the quentinduchemin/scclone2dr Docker image if necessary
  - Mount your project folder into the container at /workspace
  - Open a fully configured environment with Python, Jupyter, and all dependencies ready

  You now have a working Python environment for scClone2DR.
- Open the notebooks/ folder directly inside VS Code and run:
  - first the notebook generate_fake_data, which generates data that mimics the real data used in our paper (in particular Fast Drug Pharmacoscopy data and single-cell RNA data);
  - then the notebook tutorial_scClone2DR, which introduces scClone2DR by training a model and visualizing results on the single-cell data generated in the first step.
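
Alternatively, install from source with pip (the second install command adds the notebook extras):
git clone https://github.com/cbg-ethz/scClone2DR
cd scClone2DR
pip install -e .
pip install -e ".[notebook]"

Pre-built Docker images are available on Docker Hub: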
- quentinduchemin/scclone2dr: standard runtime image (without SSH service)
- quentinduchemin/scclone2dr_ssh: SSH-enabled runtime image (for remote/containerized workflows)

Pull the images:
docker pull quentinduchemin/scclone2dr
docker pull quentinduchemin/scclone2dr_ssh

Run the standard image:
docker run --rm -it \
  -v "$(pwd)":/workspace \
  -w /workspace \
  quentinduchemin/scclone2dr

Run the SSH-enabled image (example exposing port 2222):
docker run --rm -it \
  -p 2222:22 \
  -v "$(pwd)":/workspace \
  -w /workspace \
  quentinduchemin/scclone2dr_ssh

If you want to build the images from the repository instead of pulling them from Docker Hub:
docker build -t quentinduchemin/scclone2dr -f Dockerfile .
docker build -t quentinduchemin/scclone2dr_ssh -f Dockerfile.ssh .

If you are using Docker, first start a container and run the following commands inside it.
from scclone2dr.data import RealData
from scclone2dr.pipeline import scClone2DRPipeline
from scclone2dr.trainer import Trainer, GuideType

# Point the loader at the drug-screening table and the RNA data folder
data_source = RealData(
    path_fastdrug="/path/to/FD_data.csv",
    path_rna="/path/to/rna_folder/",
)
data = data_source.get_real_data(
    concentration_DMSO=5,
    concentration_drug=5,
)

pipeline = scClone2DRPipeline(
    data_source=data_source,
    trainer=Trainer(guide_type=GuideType.FULL_MVN),
    mode_nu="noise_correction",
    mode_theta="not shared decoupled",
)

# Configure model topology from data source metadata
pipeline.model.configure(data_source)

# Fit the model, then save the result to a checkpoint file
params = pipeline.fit(
    data=data,
    n_steps=600,
    penalty_l1=0.1,
    penalty_l2=0.1,
)
pipeline.save("checkpoints/real_data_run.npz")

scClone2DR/
├── src/
│ └── scclone2dr/
│ ├── data/ # Real/simulated data loaders and dataset utilities
│ ├── baselines/ # FM / NN baseline models
│ ├── inference/ # Posterior sampling and model evaluation
│ ├── plots/ # Visualization helpers
│ ├── model.py # Core probabilistic model definition
│ ├── trainer.py # SVI training engine
│ ├── pipeline.py # End-to-end orchestration API
│ ├── types.py # Shared typing helpers
│ └── utils.py # Utility functions
├── notebooks/
├── assets/
├── Dockerfile
├── Dockerfile.ssh
├── pyproject.toml
├── setup.py
└── README.md

- scclone2dr.data: data modules (RealData, SimulatedData, BaseDataset)
- scclone2dr.model: core generative model (scClone2DR)
- scclone2dr.trainer: training engine (Trainer, GuideType)
- scclone2dr.pipeline: high-level workflow (scClone2DRPipeline)
- scclone2dr.inference: posterior sampling and evaluation utilities
- scclone2dr.plots: plotting and visualization functions

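
After installation, a quick sanity check is to confirm that the modules listed above can be located by Python. The following is a minimal sketch using only the standard library; the helper name missing_modules is hypothetical and not part of scClone2DR, and the module names are copied from the list above.

```python
from importlib.util import find_spec

# Module names copied from the module overview above.
SCCLONE2DR_MODULES = [
    "scclone2dr.data",
    "scclone2dr.model",
    "scclone2dr.trainer",
    "scclone2dr.pipeline",
    "scclone2dr.inference",
    "scclone2dr.plots",
]


def missing_modules(names):
    """Return the subset of `names` that Python cannot locate.

    Hypothetical helper for illustration. Looking up a dotted name
    imports its parent package, so a failing parent import (for example
    because a dependency is absent) is also reported as missing.
    """
    missing = []
    for name in names:
        try:
            found = find_spec(name) is not None
        except ImportError:
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    not_found = missing_modules(SCCLONE2DR_MODULES)
    if not_found:
        print("Could not locate:", ", ".join(not_found))
    else:
        print("All scclone2dr modules were found.")
```

If any module is reported as missing, re-running the installation inside the active environment (or inside the container) is usually the first thing to try.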
- Python >= 3.8
- PyTorch >= 1.10.0
- Pyro >= 1.8.0
- NumPy >= 1.20.0
- pandas >= 1.3.0
- h5py >= 3.0.0
- matplotlib >= 3.4.0
- seaborn >= 0.11.0
- tqdm >= 4.60.0
- scikit-learn >= 0.24.0
- scikit-fda >= 0.8.1
- nbformat >= 5.0.0
- plotly >= 5.0.0
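
To compare an existing environment against the minimum versions listed above, a small standard-library script can help. This is a sketch under stated assumptions: the distribution names torch and pyro-ppl are assumed to be the PyPI names for PyTorch and Pyro, the helper names are hypothetical, and the version comparison only looks at leading numeric components (it is not a full PEP 440 parser).

```python
import re
import sys
from importlib import metadata

# Minimum versions copied from the requirements list above.
# Keys are assumed PyPI distribution names (torch = PyTorch, pyro-ppl = Pyro).
MINIMUM_VERSIONS = {
    "torch": "1.10.0",
    "pyro-ppl": "1.8.0",
    "numpy": "1.20.0",
    "pandas": "1.3.0",
    "h5py": "3.0.0",
    "matplotlib": "3.4.0",
    "seaborn": "0.11.0",
    "tqdm": "4.60.0",
    "scikit-learn": "0.24.0",
    "scikit-fda": "0.8.1",
    "nbformat": "5.0.0",
    "plotly": "5.0.0",
}


def version_tuple(version):
    """Leading numeric components of a version string as a tuple of ints."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if `installed` is at least `minimum` (numeric components only)."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so that "1.8" equals "1.8.0"
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(minimums):
    """Map each distribution name to 'ok', 'too old' or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return report


if __name__ == "__main__":
    print("Python >= 3.8:", sys.version_info >= (3, 8))
    for name, status in check_requirements(MINIMUM_VERSIONS).items():
        print(f"{name}: {status}")
```

The pre-built Docker images and the Dev Container are described above as shipping all dependencies, so this check is mainly useful for a pip installation into an existing environment.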

For tutorials and examples, see the tutorial notebook (tutorial_scClone2DR) in notebooks/.

If you use scClone2DR in your research, please cite:
@article{scClone2DR2026,
  title={Clone-level multi-modal prediction of tumour drug response},
  author={Quentin Duchemin and Daniel Trejo Banos and Anne Bertolini and Pedro F. Ferreira and Rudolf Schill and Matthias Lienhard and Rebekka Wegmann and Tumor Profiler Consortium and Berend Snijder and Daniel Stekhoven and Niko Beerenwinkel and Franziska Singer and Guillaume Obozinski and Jack Kuipers},
  year={2026}
}

This project is licensed under the BSD 3-Clause License.
See the LICENSE file for details.
Quentin Duchemin
[email protected]
Project repository:
https://github.com/cbg-ethz/scClone2DR
