cbg-ethz/demoTape

DemoTape

Computational demultiplexing of targeted single-cell DNA sequencing data

DemoTape is a computational demultiplexing method for targeted single-cell DNA sequencing (scDNA-seq) data, namely MissionBio Tapestri data. It is based on a distance metric between individual cells at single-nucleotide polymorphism (SNP) loci.
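The exact metric DemoTape uses is not described in this README. As a toy illustration of the general idea only (comparing cells at SNP loci), the sketch below assumes each cell is a list of variant allele frequencies, with `None` for loci without coverage, and uses the mean absolute difference over shared loci. The function names, the data, and the choice of metric are all hypothetical and are not taken from the DemoTape code.

```python
from itertools import combinations


def cell_distance(vaf_a, vaf_b):
    """Mean absolute VAF difference over loci observed in both cells.

    Hypothetical metric for illustration; not DemoTape's actual metric.
    None marks a locus without coverage. Returns None if no locus is shared.
    """
    shared = [(a, b) for a, b in zip(vaf_a, vaf_b)
              if a is not None and b is not None]
    if not shared:
        return None
    return sum(abs(a - b) for a, b in shared) / len(shared)


def pairwise_distances(cells):
    """Distance for every unordered pair of cells, keyed by (name_i, name_j)."""
    return {
        (i, j): cell_distance(cells[i], cells[j])
        for i, j in combinations(sorted(cells), 2)
    }


# Made-up VAFs at four SNP loci for three cells.
cells = {
    "cell_1": [0.0, 0.5, 1.0, None],
    "cell_2": [0.0, 0.5, 1.0, 0.5],
    "cell_3": [1.0, 0.0, 0.0, 0.5],
}
distances = pairwise_distances(cells)
for pair, dist in distances.items():
    print(pair, dist)
```

Cells from the same sample share germline SNPs and would be expected to end up close to each other under any such metric, which is what makes distance-based demultiplexing possible.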

The corresponding preprint can be found here.

Requirements

Software

All other requirements are installed automatically via Snakemake in separate conda envs.

Resources

The following resources need to be provided:

  • The annotation file (.bed) for the Tapestri panel used
  • The dbSNP file (.bed or .txt) for the reference genome used (e.g., hg19)

Usage

Running

Only demultiplexing

To run only DemoTape, you can run:

python workflow/scripts/run_demoTape.py -i <VARIANTS.VCF> -n <NO_SAMPLES>

where <VARIANTS.VCF> is the .csv file produced by the MissionBio Mosaic Pipeline.

Alternatively, when starting from the loom file, you can first run the preprocessing step:

python workflow/scripts/mosaic_preprocessing.py -i <INPUT.LOOM>

(This step is performed automatically when the whole DemoTape pipeline is run.)
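The two commands above can also be assembled from Python, for example when looping over several input files. The following is a minimal sketch that uses only the script paths and the `-i` / `-n` flags shown above; the helper names and the example file names are hypothetical. Because this README does not state where the preprocessing script writes its output, the sketch does not chain the two steps automatically.

```python
import subprocess
import sys


def preprocessing_command(loom_file):
    """Argument list for the loom preprocessing script."""
    return [sys.executable, "workflow/scripts/mosaic_preprocessing.py",
            "-i", str(loom_file)]


def demotape_command(variants_file, n_samples):
    """Argument list for the demultiplexing script."""
    return [sys.executable, "workflow/scripts/run_demoTape.py",
            "-i", str(variants_file), "-n", str(n_samples)]


def run(command, dry_run=True):
    """Print the command, or execute it from the repository root."""
    if dry_run:
        print(" ".join(command))
        return 0
    # check=True raises CalledProcessError if the script exits non-zero.
    return subprocess.run(command, check=True).returncode


# Hypothetical file names; dry_run=True only prints the commands.
run(preprocessing_command("sample.loom"))
run(demotape_command("sample_variants.csv", 3))
```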

Whole pipeline (demultiplexing, assigning sample→patient, plotting, downstream analysis)

The whole DemoTape analysis pipeline can be executed via:

snakemake \
    -s workflow/Snakefile_analysis \
    -j 500 \
    --configfile configs/MS1_analysis.yaml \
    --executor slurm \
    --rerun-incomplete \
    --drop-metadata \
    -k \
    --use-conda

The executor needs to be adjusted to the running environment (local or HPC).
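One way to switch between environments is to build the Snakemake call programmatically. The sketch below reuses only the flags from the command above. It assumes that leaving out `--executor` makes Snakemake run jobs on the local machine (the default in recent Snakemake versions), in which case `-j` should be lowered to the number of available cores. The helper name `snakemake_command` is hypothetical.

```python
import shlex


def snakemake_command(snakefile, configfile, executor=None, jobs=500):
    """Build the snakemake argument list used in this README.

    executor=None omits --executor (assumed to mean local execution).
    """
    cmd = ["snakemake", "-s", snakefile, "-j", str(jobs),
           "--configfile", configfile]
    if executor is not None:
        cmd += ["--executor", executor]
    cmd += ["--rerun-incomplete", "--drop-metadata", "-k", "--use-conda"]
    return cmd


# HPC run through SLURM, as in the command above.
hpc = snakemake_command("workflow/Snakefile_analysis",
                        "configs/MS1_analysis.yaml", executor="slurm")

# Local run with 8 parallel jobs (the value 8 is an arbitrary example).
local = snakemake_command("workflow/Snakefile_analysis",
                          "configs/MS1_analysis.yaml", jobs=8)

print(shlex.join(hpc))
print(shlex.join(local))
```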

Config

In the config file, the following variables need to be specified:

analysis:
  specific:
    input-dir: <INPUT_DIR>
    output-dir: <OUTPUT_DIR>
  general:
    panel_annotation: resources/<ANNOTATED_TAPESTRI_PANEL>.bed

output:
  prefix: <PREFIX>
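Before launching a long run, it can help to check that every variable listed above is set and that no `<PLACEHOLDER>` is left over. The following is a minimal sketch operating on an already loaded config dictionary (for example, the result of PyYAML's `yaml.safe_load`); the function name `unset_config_keys` and the example values are hypothetical, and the key layout simply mirrors the snippet above.

```python
# Key paths taken from the config snippet above.
REQUIRED_KEYS = [
    ("analysis", "specific", "input-dir"),
    ("analysis", "specific", "output-dir"),
    ("analysis", "general", "panel_annotation"),
    ("output", "prefix"),
]


def unset_config_keys(config):
    """Return dotted paths of required keys that are missing or unfilled."""
    problems = []
    for path in REQUIRED_KEYS:
        node = config
        for key in path:
            if not isinstance(node, dict) or key not in node:
                node = None
                break
            node = node[key]
        # Treat empty values and remaining '<...>' placeholders as unset.
        if node is None or node == "" or (isinstance(node, str) and "<" in node):
            problems.append(".".join(path))
    return problems


# Example config with made-up values; the prefix is still a placeholder.
example = {
    "analysis": {
        "specific": {"input-dir": "data/MS1", "output-dir": "results/MS1"},
        "general": {"panel_annotation": "resources/panel_annotated.bed"},
    },
    "output": {"prefix": "<PREFIX>"},
}
print(unset_config_keys(example))
```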

The Tapestri panel file can be annotated (i.e., gene names assigned to loci) via BED Annotation.

Additionally, to run downstream analysis with BnpC or COMPASS, the corresponding software needs to be downloaded and the py/exe files specified.

Simulations

To run the simulation pipeline, execute:

snakemake \
    -s workflow/Snakefile_simulations \
    -j 500 \
    --configfile configs/simulations.yaml \
    --executor slurm \
    --rerun-incomplete \
    --drop-metadata \
    -k \
    --use-conda

where input-looms as well as the exe files for souporcell and scSplit need to be adjusted.
