##OLD NOTES: add overlapper to RNA-seq methods. Detailing applications is boring, so intro some philosophy > apps and what they are meant to do. Discussion > impact, QA/QC. Get the BAM file to spit out the name of the gene and isolate it from the non-processed file. Add the stuff from the github.io pages for s4hts.
##TODO:
- Check todo list in post_hts.slurm (need to get job memory and time)
- HTS stats call: have stats after every step
- add more datasets because why not? (mrnaseq is looking pretty clean)
- Double check adapter trimmer reduction for all file types? Library prep?
- Update HTStream version
- Fix master_parse.sh (needs to be made for all datasets)
- Ask about library prep part to put in the paper.
- Add info about memory and time
- Produce a pipeline for each: DNA, RNA, Amplicon, SE (no super deduper for SE)
- Show algorithms do what they are supposed to do… some are straightforward.
- Experimental validation, record parameters and use them to show consistency. (MDS plot) statistic for each tool -> info about sample
- Make sure all statements applicable to nanopore/pacbio as well in regards to hts
- Use HTS stream before and after each tool (like the other ones)
##MAYBE 12. Make sure all statements applicable to nanopore/pacbio as well in regards to hts
##METHODS:
- ena/SAMPLES (from datasets.txt/phix_datasets.txt) -> runmaster.py runs hts_master.slurm ${type} ${datasets_file}
  - `python runmaster.py phix` OR/AND
  - `python runmaster.py rna`
  - output in 01-HTS_preproc
- Clean up files, since the array doesn't match for phix and rna (whoops); creates samples.txt and phix_samples.txt files and tells you the array size needed for step 3$
  - `./post_hts.sh`
- STAR alignment for rna type
  - adjust array based on output of `post_hts.sh`
  - `sbatch star_proc.slurm`
  - `./master_parse.sh` should be run; it will call parse_output.py for each of the files to get the .json files for each alignment.
  - TODO: fix parse_output.py and see how all the json files get to the output directory.
  - jupyter notebook analysis for this
- BWA Mem alignment for phix type (seq screener)
  - adjust array based on output of `post_hts.sh`
  - `sbatch phix_proc.slurm`
  - TODO: some things for getting the flagstat stuff
- Adapter eval.py? (Using some bbmap scripts)
  - `./randomreads.sh`
  - `./addadapters.sh`
- Deduper eval.py/.R? (deduper but needed for overall methodology talk)
- Primer eval.py (same as adapter eval?)
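The "adjust array based on output of post_hts.sh" step is easy to get wrong by hand, so here is a rough sketch of how the array size could be derived straight from a samples file. This is not what post_hts.sh actually does (its contents are not recorded in these notes). It assumes one sample per line in samples.txt / phix_samples.txt, 1-based array indices, and that the `--array` value is passed on the sbatch command line. `count_samples` and `sbatch_array_command` are hypothetical helper names.

```python
import os
import tempfile


def count_samples(samples_file):
    """Count usable entries: non-blank lines that do not start with '#'."""
    with open(samples_file) as handle:
        return sum(
            1 for line in handle
            if line.strip() and not line.lstrip().startswith("#")
        )


def sbatch_array_command(samples_file, slurm_script):
    """Build an sbatch command whose array covers one task per sample."""
    n_samples = count_samples(samples_file)
    if n_samples == 0:
        raise ValueError(f"no samples found in {samples_file}")
    # Assumption: tasks are indexed from 1, so task i reads line i of the file.
    return ["sbatch", f"--array=1-{n_samples}", slurm_script]


if __name__ == "__main__":
    # Demo with a throwaway file; the sample names here are made up.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "samples.txt")
        with open(path, "w") as handle:
            handle.write("sampleA\nsampleB\nsampleC\n")
        print(" ".join(sbatch_array_command(path, "star_proc.slurm")))
```

The returned list can be handed to `subprocess.run` as-is. Building the command from the file avoids editing the array range inside star_proc.slurm or phix_proc.slurm each time the sample list changes.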
##CHECKLIST OF APPS/MODULES TESTED (X = done)
- X - adaptereval (Adapter eval above)
- X - qtrimmer (STAR alignment above) (multiqc report for the effect on the reads to double check) (maybe deduper noise to$
- X - ntrimmer (same as qtrimmer)
- X - polyatrim (same as ntrimmer)
- X - seqscreener (BWA mem alignment)
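
Three of the checked apps (qtrimmer, ntrimmer, polyatrim) lean on the STAR alignment, and METHODS says parse_output.py turns each alignment into a .json file. The real parse_output.py is not shown in these notes, so the following is only a sketch. It assumes the input is STAR's `Log.final.out` summary, whose statistics lines have the form `name | value`. `parse_star_log` is a hypothetical name and the numbers in `EXAMPLE` are invented for illustration, not results.

```python
import json


def parse_star_log(text):
    """Collect 'name | value' lines from a STAR Log.final.out into a dict."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            # Blank lines and section headers such as "UNIQUE READS:".
            continue
        key, value = line.split("|", 1)
        stats[key.strip()] = value.strip()
    return stats


# Invented excerpt in the Log.final.out layout (values are NOT real data).
EXAMPLE = (
    "                          Number of input reads |\t1000\n"
    "                                    UNIQUE READS:\n"
    "                   Uniquely mapped reads number |\t900\n"
    "                        Uniquely mapped reads % |\t90.00%\n"
)

if __name__ == "__main__":
    # Values are kept as strings; convert to numbers downstream if needed.
    print(json.dumps(parse_star_log(EXAMPLE), indent=2))
```

Keeping the values as strings leaves the percent signs and timestamps untouched, so any numeric conversion is a decision for the notebook analysis rather than the parser.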