WASP Snakemake Pipeline 1.0.0

Authors

Jeongho Chae
Benjamin McMichael
Date: 2025-01-24

Quickstart

This analysis uses a Snakemake pipeline to process ATAC-seq data. Clone the pipeline into the analysis directory (preferably in scratch space) using the command:

git clone https://sc.unc.edu/dept-fureylab/wasp_chromatin.git

There are no other prerequisites for running. The command for executing the WASP_chromatin pipeline is:

sbatch Snakemake_SLURMsubmission.sbatch

Update previous scripts

Updated module versions to reflect newer versions of tools
Organized the temp directory and final results directory for clarity
Implemented the moveout function (moveOutFiles)

Available Module Versions (from project_config.yaml)

Use the previous versions for comparison with past WASP results.
Previous versions should be used together rather than mixed with current versions.

WASP
- Previous WASP version : 2019.12
- Current WASP version : 2023.02
python
- Previous python version : 3.6.6
- Current python version : 3.9.6
bowtie2
- Previous bowtie2 version : 2.4.1
- Current bowtie2 version : 2.4.5
samtools
- Previous samtools version : 1.12
- Current samtools version : 1.21
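
The rule about not mixing version sets can be checked mechanically. The sketch below is a minimal illustration that uses the versions listed above; the dictionary layout and the `classify_version_set` helper are hypothetical and are not taken from project_config.yaml.

```python
# Version sets copied from the list above. The dictionary layout is an
# assumption for illustration, not the actual structure of project_config.yaml.
PREVIOUS = {"WASP": "2019.12", "python": "3.6.6", "bowtie2": "2.4.1", "samtools": "1.12"}
CURRENT = {"WASP": "2023.02", "python": "3.9.6", "bowtie2": "2.4.5", "samtools": "1.21"}


def classify_version_set(selected):
    """Return 'previous' or 'current' for an exact match, 'mixed' for anything else."""
    if selected == PREVIOUS:
        return "previous"
    if selected == CURRENT:
        return "current"
    return "mixed"


if __name__ == "__main__":
    mixed = dict(PREVIOUS, samtools="1.21")  # one tool bumped on its own
    for label, selection in [("previous", PREVIOUS), ("current", CURRENT), ("mixed", mixed)]:
        print(label, "->", classify_version_set(selection))
```

A check like this could run before job submission so that a partially updated configuration is caught early.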

Snakemake Pipeline

Updated the Perl scripts and created a Snakemake pipeline.
Scripts used previously in the analysis of a subset of ATAC data can be found here:

/proj/fureylab/projects/CD_allelic_imbalance/wasp/wasp_scripts

Existing genotype data

WASP uses genotype data to infer SNPs and allelic usage.
Genotype data for use with WASP can be found here:

/proj/fureylab/data/Genotypes/human/imputed_vcfs_hg38

Pipeline Rules

By default the pipeline does not execute the rmdup_pe rule, on the assumption that the reads supplied to WASP have already had duplicates removed, so duplicate removal is not run redundantly. If removeDupReads is set to TRUE in project_config.yaml, the remove duplicates rule is executed.
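
One way to picture this switch is as a function that picks the final BAM for each sample from the configuration value. The sketch below is an illustration under stated assumptions: the `dedup_enabled` and `final_bam` helpers, the file-name suffixes, and the handling of TRUE as either a string or a boolean are all hypothetical, and the actual Snakefile may be organised differently.

```python
def dedup_enabled(config):
    """Interpret removeDupReads, accepting a boolean or a TRUE/FALSE string."""
    value = config.get("removeDupReads", False)
    if isinstance(value, str):
        return value.strip().upper() == "TRUE"
    return bool(value)


def final_bam(sample, config):
    """Pick the BAM treated as the final output (file names are hypothetical)."""
    if dedup_enabled(config):
        return f"{sample}.wasp.rmdup.bam"
    return f"{sample}.wasp.bam"


if __name__ == "__main__":
    print(final_bam("sampleA", {"removeDupReads": "TRUE"}))
    print(final_bam("sampleA", {"removeDupReads": False}))
```

In a Snakemake workflow, a helper of this kind would typically feed the input list of the final target rule, so that the duplicate-removal rules are only pulled into the job graph when their outputs are requested.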

Rule all

Defines the final expected output files.

Rule find_intersecting_snps

Identifies sequencing reads that overlap known SNPs.
Generates three sets of reads:
- Reads that require remapping (alternative alleles substituted) -> FASTQ file
- Reads that require remapping (reference alleles retained) -> BAM file
- Reads that do not require remapping -> BAM file
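
The split between reads that need remapping and reads that do not comes down to an overlap test between read alignments and SNP positions. The following is a simplified sketch of that idea, not WASP's own implementation: the `Read` type, the 0-based half-open coordinates, and the function names are assumptions, and real reads would also need CIGAR handling and paired-end logic.

```python
from bisect import bisect_left
from typing import NamedTuple


class Read(NamedTuple):
    name: str
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def overlaps_snp(read, snps_by_chrom):
    """True if any SNP position on the read's chromosome falls in [start, end)."""
    positions = snps_by_chrom.get(read.chrom, [])
    i = bisect_left(positions, read.start)  # first SNP at or after read.start
    return i < len(positions) and positions[i] < read.end


def split_reads(reads, snps_by_chrom):
    """Partition reads into (to_remap, keep) by SNP overlap."""
    to_remap, keep = [], []
    for read in reads:
        (to_remap if overlaps_snp(read, snps_by_chrom) else keep).append(read)
    return to_remap, keep


if __name__ == "__main__":
    snps = {"chr1": [100, 250]}  # sorted SNP positions per chromosome
    reads = [
        Read("r1", "chr1", 90, 140),
        Read("r2", "chr1", 150, 200),
        Read("r3", "chr2", 90, 140),
    ]
    to_remap, keep = split_reads(reads, snps)
    print([r.name for r in to_remap])  # ['r1']
    print([r.name for r in keep])      # ['r2', 'r3']
```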

Rule remap_bowtie2

Re-aligns the reads that require remapping to the reference genome using Bowtie2.
Produces a BAM file with newly mapped reads.

Rule sort_index_remapBam

Sorts and indexes the remapped BAM file for efficient processing.

Rule filter_remapped_reads

Compares remapped reads with their original versions.
Discards reads that do not map back to their original locations, reducing allele-specific mapping bias.
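
The comparison can be sketched as follows. This is a simplified illustration under stated assumptions rather than WASP's implementation: positions are represented as (chromosome, start) tuples, each original read may have several allele-swapped versions, and the `filter_remapped` name and data layout are hypothetical.

```python
def filter_remapped(original_pos, remapped_pos):
    """Return names of reads whose allele-swapped versions all map back in place.

    original_pos: read name -> (chrom, start) of the original alignment.
    remapped_pos: read name -> list of (chrom, start) or None (unmapped),
                  one entry per allele-swapped version of that read.
    """
    kept = []
    for name, origin in original_pos.items():
        versions = remapped_pos.get(name, [])
        # Require at least one remapped version, and every version must
        # land exactly where the original read was aligned.
        if versions and all(pos == origin for pos in versions):
            kept.append(name)
    return kept


if __name__ == "__main__":
    original = {"r1": ("chr1", 90), "r2": ("chr1", 300), "r3": ("chr2", 10)}
    remapped = {
        "r1": [("chr1", 90), ("chr1", 90)],   # both versions map back
        "r2": [("chr1", 300), ("chr5", 12)],  # one version moved
        "r3": [None],                         # failed to remap
    }
    print(filter_remapped(original, remapped))  # ['r1']
```

Requiring every version to return to the original position is what removes reads whose mapping depends on which allele they carry.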

Rule merge_bams

Merges the filtered remapped BAM file with the original non-remapped BAM file generated by the find_intersecting_snps rule (Rule 2).

Rules 7 and 8 are executed only when removeDupReads is set to TRUE in project_config.yaml.

Rule sort_index_mergeBam

Sorts and indexes the merged BAM file for efficient processing.

Rule rmdup_pe

Removes PCR duplicates from paired-end reads.
Keeps only unique reads to prevent bias in allele-specific analysis.

Rule sort_index_rmdup_pe

Sorts and indexes the final BAM file for downstream analysis.

moveOutFiles

Moves final result files for locally run samples to permanent space. After running the pipeline and checking that everything ran correctly, set moveOutFiles to TRUE in project_config.yaml and rerun the pipeline using the same submission command.
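
The two-pass behaviour can be illustrated with a small sketch. It is not the pipeline's code: the `move_out_files` helper, its arguments, and the directory layout are hypothetical, and it only shows the idea that nothing is moved until the flag is switched on.

```python
import shutil
from pathlib import Path


def move_out_files(results_dir, permanent_dir, move_out):
    """Move every file in results_dir to permanent_dir when move_out is True.

    Returns the list of destination paths; does nothing when move_out is False.
    """
    if not move_out:
        return []
    src, dst = Path(results_dir), Path(permanent_dir)
    dst.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(src.iterdir()):
        if path.is_file():
            # shutil.move returns the destination path it wrote to
            moved.append(Path(shutil.move(str(path), str(dst / path.name))))
    return moved
```

Calling the helper with `move_out=False` corresponds to the first run, where results stay in place for inspection; calling it with `move_out=True` corresponds to the rerun after moveOutFiles has been set to TRUE.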
