A minimal, reproducible example of analyzing wellDA-seq data

In this minimal, reproducible example of wellDA-seq, we will address the following questions/tasks:

  1. How do we preprocess the wellDA-seq data to create analysis-ready data objects?
  2. How do we identify subclones and cell types (or states)?
  3. How do we investigate the interplay between CNA events and open/closed chromatin regions (e.g., the GtoE and EbyG scores, and the plasticity/heritability score)?

This tutorial covers all the analyses that we performed on each sample in the manuscript.

Software requirements

In brief, the following key tools are required:

  • preprocessing the DNA data with the single-cell CNV pipeline.
  • preprocessing the ATAC data with scATAC-pro and ArchR.
  • creating an analysis-ready DNA object with copykit.
  • creating an analysis-ready ATAC object with Signac.

See the folder wellDA-seq/install for detailed instructions on installing the required software.

Outline

Here, we use the real sample P8 (DCIS66T_chip2) from the manuscript for demonstration.

1. Preprocessing

See the detailed instructions in 01.preprocessing.md

Input: FASTQ files of the DNA and ATAC modality data.

Expected result:

  • a preliminary Signac object for ATAC
  • a preliminary Copykit object for DNA

Figures (by modality): Cell Dispensing; Basic Quality Control (ATAC, DNA); Basic Clustering (ATAC, DNA).

2. Initiating the wellDA data

See the detailed instructions in 02.wellDA_initiation.md

Expected result:

  • a folder of wellDA data
├── metadata.csv      <--- data frame of single-cell metadata
├── metadata.df.rds   <--- data frame of single-cell metadata
├── obja.rds          <--- Signac object
├── objd.rds          <--- Copykit object
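
Before moving on, it can help to confirm that the folder is complete. The following is a minimal Python sketch, not part of the wellDA-seq tutorial itself: the helper name `missing_wellda_files` is hypothetical, and the only facts it relies on are the four file names in the listing above.

```python
from pathlib import Path

# File names taken from the folder listing above.
EXPECTED_FILES = ("metadata.csv", "metadata.df.rds", "obja.rds", "objd.rds")


def missing_wellda_files(folder):
    """Return the expected wellDA files that are absent from `folder`.

    Hypothetical helper for illustration; it only checks that each file
    exists and does not open or validate the RDS objects.
    """
    root = Path(folder)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_wellda_files("path/to/wellDA_folder")` (the path is a placeholder) returns an empty list when all four files are present, and otherwise lists the missing file names in the order shown above.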


3. Further analyze the ATAC part

See the detailed instructions in 03.wellDA_scATAC_annotation.md

Expected result:

  • a Signac object with cell types / states annotated.


4. Further analyze the CNA part

See the detailed instructions in 04.wellDA_scCNA_annotation.md

Expected result:

  • a Copykit object with the diploid and low-quality cells removed and the tentative subclones determined.

Figures: the aneuploid cells before data cleaning and after data cleaning.

5. Refine the wellDA data

See the detailed instructions in 05.wellDA_refine.md

Expected result:

  • an analysis-ready folder of wellDA-seq data for each sample


6. GtoE and EbyG scores

See the detailed instructions in 06.GtoE_EbyG.md

Expected result:

  • GtoE and EbyG scores for each sample

7. Global concordance score between genotypes and chromatin accessibility profiles

See the detailed instructions in 07.global_concordance.md

Expected result:

  • a global concordance score for each sample

8. Heritable and plastic tumor phenotypes

See the detailed instructions in 08.plasticity_heritability.md

Expected result:

  • heritable/plastic scores of tumorigenesis-related gene signatures for each sample