example_data

Data acquisition and pre-processing

Experimental Data

Files

Biological datasets can be found in biological_data/. They were downloaded and processed using the following steps.

Acquisition and Processing

The wild-type embryo data table (sample ID: GSM801363; GEO accession number: GSE32336) was downloaded as GSM801363-2802.tab.

The table has been processed according to Fischer et al. (2011) and contains the unique sequences and their counts obtained from the FASTQ sequencing file (filtered for Q>20).

Header details were removed and the sequences were converted to FASTA format:

tail -n +4 GSM801363-2802.tab > GSM801363_WTembryo.tab 

awk '{print ">" NR; print $0}' GSM801363_WTembryo.tab > GSM801363_rawseqs_WTembryo.fa
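To make the effect of these two commands concrete, the sketch below runs them on a made-up table in a temporary directory. The file names (toy.tab, toy_body.tab, toy.fa), the three header lines and the sequences are assumptions for illustration only; the layout of the real GEO table may differ.

```shell
# Work in a throwaway directory so no real file is overwritten.
workdir=$(mktemp -d)

# Hypothetical table: three header lines followed by one sequence per line.
printf '%s\n' '# header 1' '# header 2' '# header 3' \
    'GATTACAGATTACAGATTACAGA' 'TTGACCA' > "$workdir/toy.tab"

# Keep everything from line 4 onwards, which drops the three header lines.
tail -n +4 "$workdir/toy.tab" > "$workdir/toy_body.tab"

# Write each remaining line as a FASTA record named after its line number.
awk '{print ">" NR; print $0}' "$workdir/toy_body.tab" > "$workdir/toy.fa"

cat "$workdir/toy.fa"
```

Note that `print $0` writes the whole input line into the record. If a table holds more than one column per line, `print $1` would keep only the first column.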

Length filtering was performed with the NGS TOOLBOX script TBr2_length-filter.pl, from the small RNA group, Mainz University:

perl TBr2_length-filter.pl -i GSM801363_rawseqs_WTembryo.fa -o biological_data/Pass_uncol.fa -min 15 -max 30
perl TBr2_length-filter.pl -i GSM801363_rawseqs_WTembryo.fa -o 26_embryo.fa -min 26 -max 26
perl TBr2_length-filter.pl -i GSM801363_rawseqs_WTembryo.fa -o 22_embryo.fa -min 22 -max 22
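For readers without NGS TOOLBOX installed, the following is a rough awk sketch of what a length filter does on two-line FASTA records. It is not the TBr2_length-filter.pl implementation; the function name `length_filter` and the toy sequences are hypothetical, and the sketch assumes each sequence sits on a single line.

```shell
# length_filter MIN MAX < in.fa > out.fa
# Keeps two-line FASTA records whose sequence length lies within [MIN, MAX].
length_filter() {
    awk -v min="$1" -v max="$2" '
        /^>/ { header = $0; next }    # remember the most recent header line
        length($0) >= min && length($0) <= max { print header; print $0 }
    '
}

# Toy input with made-up sequences of length 22, 7 and 26.
printf '%s\n' '>1' 'GATTACAGATTACAGATTACAG' \
              '>2' 'TTGACCA' \
              '>3' 'GATTACAGATTACAGATTACAGATTA' | length_filter 15 30
```

With `15 30` the 7-nt record is dropped; calling `length_filter 26 26` on the same input would keep only the 26-nt record.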

22G and 26G sequences were then selected using grep:

grep '^G' -B 1 26_embryo.fa | sed '/--/d' > biological_data/26G_uncol.fa
grep '^G' -B 1 22_embryo.fa | sed '/--/d' > biological_data/22G_uncol.fa
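The selection step can be tried on a toy file, as sketched below. `grep '^G'` matches sequence lines starting with G (header lines start with `>` and never match), `-B 1` also prints the header line before each match, and `sed '/--/d'` removes the `--` separators that grep inserts between non-adjacent groups of matches. The file names 22_toy.fa and 22G_toy.fa and the sequences are made up for this example.

```shell
workdir=$(mktemp -d)

# Three made-up 22-nt records; only records 1 and 3 start with G.
printf '%s\n' '>1' 'GATTACAGATTACAGATTACAG' \
              '>2' 'TATTACAGATTACAGATTACAG' \
              '>3' 'GGTTACAGATTACAGATTACAG' > "$workdir/22_toy.fa"

# Same pattern and options as in the commands above, with -B 1 placed
# before the pattern.
grep -B 1 '^G' "$workdir/22_toy.fa" | sed '/--/d' > "$workdir/22G_toy.fa"

cat "$workdir/22G_toy.fa"
```

The output keeps records 1 and 3 with their original headers, so the record numbers in the filtered file are no longer consecutive.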

22G and 26G sequences were also collapsed using TBr2_collapse.pl, in NGS TOOLBOX from the small RNA group, Mainz University.
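As an illustration of what collapsing means, the sketch below merges identical sequences into one record each and records how often each was seen. It is not TBr2_collapse.pl: the function name `collapse_fasta` and the `>index_xCOUNT` header format are assumptions chosen for this example, and the real script's output format may differ.

```shell
# collapse_fasta < in.fa > out.fa
# Merges identical sequences (one sequence per line) and reports their counts.
collapse_fasta() {
    awk '
        /^>/ { next }                              # ignore the original headers
        {
            if (!($0 in count)) order[++n] = $0    # remember first-seen order
            count[$0]++
        }
        END {
            for (i = 1; i <= n; i++) {
                print ">" i "_x" count[order[i]]   # hypothetical header format
                print order[i]
            }
        }
    '
}

# Toy input: the first and third records carry the same made-up sequence.
printf '%s\n' '>1' 'GATTACA' '>2' 'GGCCTTA' '>3' 'GATTACA' | collapse_fasta
```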

Simulated Data

Files

The files contained within simulated_data/ include:

ref.fasta; reference sequence (row 1 in alignment.csv)
reads.fasta; query sequences (rows 2-29 in alignment.csv)
alignment.pdf; a visualisation of how the reads align to the reference sequence

These files can be used to test that stepRNA is working on your system, and they provide a small dataset to help understand the output.
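Before running stepRNA on these files, a quick sanity check is to count the FASTA records in each. The sketch below assumes it is run from the example_data directory and that each row of alignment.csv corresponds to one sequence, in which case ref.fasta should hold 1 record and reads.fasta 28; the helper name `count_fasta_records` is hypothetical.

```shell
# Count FASTA records by counting header lines (lines starting with ">").
count_fasta_records() {
    grep -c '^>' "$1"
}

for f in simulated_data/ref.fasta simulated_data/reads.fasta; do
    if [ -f "$f" ]; then
        printf '%s\t%s\n' "$f" "$(count_fasta_records "$f")"
    else
        printf '%s\tnot found (run from the example_data directory)\n' "$f"
    fi
done
```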

