Commit 2b5074a

committed
tests added
1 parent 1a7ac6d commit 2b5074a

4 files changed

Lines changed: 30 additions & 19 deletions

README.md

Lines changed: 14 additions & 3 deletions
@@ -210,9 +210,20 @@ as the 2nd input file for the `--mix` option).

 As explained above, the alignments must be sorted by coordinate before they can be used as input for StringTie.

-Optionally, a reference annotation file in GTF or GFF3 format can be provided to StringTie
-using the `-G` option which can be used as 'guides' for the assembly process, or their expression levels
-can be directly estimated (without any assembly) when the `-e` option is given.
+When CRAM files are used as input, the original reference genomic sequence can be provided with the `--ref` option as
+a multi-FASTA file with the same chromosome sequences that were used when aligning the reads. The use of `--ref` option is
+optional but recommended as StringTie can make use of some alignment quality data (mismatches) that may only be retrieved
+in the case of CRAM files when the reference genome sequence is also provided. In particular it is the assessment of junctions
+and their quality that may be slightly affected by omitting the `--ref` option.
+
+### Reference transcripts (guides)
+
+A reference annotation file in GTF or GFF3 format can be provided to StringTie
+using the `-G` option which can be used as 'guides' for the assembly process.
+
+When the `-e` option is used (i.e. expression estimation only), this option is required,
+and in that case StringTie will not attempt to assemble the read alignments but instead it will
+only estimate the expression levels of all the transcripts provided in this file

 Note that the reference transcripts should be fully covered by reads in order to be included
 in StringTie's output with the original ID of the reference transcript shown in the
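
For reviewers, the option rules described in the README hunk above can be sketched as a small bash helper. This is only an illustration under stated assumptions: the function name `build_stringtie_cmd` and the file names `aln.cram`, `genome.fa`, `guides.gtf` and `out.gtf` are hypothetical, and the helper only prints a command line instead of running StringTie. The options themselves (`--ref`, `-G`, `-e`) are the ones named in the README text; `-o` is StringTie's output option.

```shell
#!/usr/bin/env bash
# Sketch only: builds and prints a StringTie command line; it does not run StringTie.
# All file names below are hypothetical placeholders.

build_stringtie_cmd() {
  local aln=$1 ref=$2 guides=$3 mode=$4
  local cmd=(stringtie -o out.gtf)

  # --ref is only relevant for CRAM input; it is optional but recommended.
  if [[ $aln == *.cram && -n $ref ]]; then
    cmd+=(--ref "$ref")
  fi

  # -G supplies reference transcripts to be used as guides.
  if [[ -n $guides ]]; then
    cmd+=(-G "$guides")
  fi

  # -e (expression estimation only) requires a reference annotation.
  if [[ $mode == estimate ]]; then
    if [[ -z $guides ]]; then
      echo "Error: -e requires a reference annotation (-G)" >&2
      return 1
    fi
    cmd+=(-e)
  fi

  # The alignment file goes last.
  cmd+=("$aln")
  echo "${cmd[*]}"
}

# Assembly from a CRAM file with the reference genome supplied.
build_stringtie_cmd aln.cram genome.fa "" assemble
# Expression estimation only, which needs the guide annotation.
build_stringtie_cmd aln.cram genome.fa guides.gtf estimate
```

Printing the command rather than executing it keeps the sketch runnable on a machine without StringTie installed; the printed line can be inspected or copied into a real pipeline once the placeholder paths are replaced.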

run_tests.sh

Lines changed: 8 additions & 8 deletions
@@ -1,34 +1,34 @@
 #!/usr/bin/env bash

 function unpack_test_data() {
-  t=test_data.tar.gz
+  t=tests.tar.gz
   if [ ! -f $t ]; then
     echo "Error: file $t not found!"
     exit 1
   fi
   echo "..unpacking test data.."
   echo
   tar -xzf $t
-  if [ ! -f test_data/human-chr19_P.gff ]; then
+  if [ ! -f tests/human-chr19_P.gff ]; then
     echo "Error: invalid test data archive?"
     exit 1
   fi
-  /bin/rm -f test_data.tar.gz
+  /bin/rm -f tests.tar.gz
 }

-#if [ ! -f test_data/human-chr19_P.gff ]; then
-if [ -f test_data.tar.gz ]; then
+#if [ ! -f tests/human-chr19_P.gff ]; then
+if [ -f tests.tar.gz ]; then
   #extract the tarball and rename the directory
-  echo "..Using existing ./test_data.tar.gz"
+  echo "..Using existing ./tests.tar.gz"
   unpack_test_data
 else
   echo "..Downloading test data.."
   #use curl to fetch the tarball from a specific github release or branch
-  curl -sLO https://github.com/gpertea/stringtie/raw/test_data/test_data.tar.gz
+  curl -sLO https://github.com/gpertea/stringtie/raw/test_data/tests.tar.gz
   unpack_test_data
 fi
 # fi
-cd test_data
+cd tests
 # array element format:
 #
 arrins=("short_reads" "short_reads_and_superreads" "long_reads" "long_reads" \

run_tests_valgrind.sh

Lines changed: 7 additions & 7 deletions
@@ -1,30 +1,30 @@
 #!/usr/bin/env bash

 function unpack_test_data() {
-  t=test_data.tar.gz
+  t=tests.tar.gz
   if [ ! -f $t ]; then
     echo "Error: file $t not found!"
     exit 1
   fi
   echo "..unpacking test data.."
   echo
   tar -xzf $t
-  if [ ! -f test_data/human-chr19_P.gff ]; then
+  if [ ! -f tests/human-chr19_P.gff ]; then
     echo "Error: invalid test data archive?"
     exit 1
   fi
-  #/bin/rm -f test_data.tar.gz
+  #/bin/rm -f tests.tar.gz
 }

-#if [ ! -f test_data/human-chr19_P.gff ]; then
-if [ -d ./test_data ]; then
+#if [ ! -f tests/human-chr19_P.gff ]; then
+if [ -d ./tests ]; then
   #extract the tarball and rename the directory
-  echo "..Using existing ./test_data"
+  echo "..Using existing ./tests"
   unpack_test_data
 else
   echo "..Downloading test data.."
   #use curl to fetch the tarball from a specific github release or branch
-  curl -sLO https://github.com/gpertea/stringtie/raw/test_data/test_data.tar.gz
+  curl -sLO https://github.com/gpertea/stringtie/raw/test_data/tests.tar.gz
   unpack_test_data
 fi
 # fi
Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@

 The test data can be automatically retrieved by the `run_tests.sh` script included
 with all source or binary distributions of StringTie, or downloaded separately from this url:
-https://github.com/gpertea/stringtie/raw/test_data/test_data.tar.gz
+https://github.com/gpertea/stringtie/raw/test_data/tests.tar.gz

 The `run_tests.sh` script will then run StringTie on these data sets and compare the output with the
 precomputed, expected output for each case. If the output of each test matches the
