README.md: 42 additions & 28 deletions (+42 -28)
@@ -2,16 +2,15 @@

 ## StringTie - efficient transcript assembly and quantitation of RNA-Seq data

-This software employs efficient algorithms for transcript structure recovery and abundance estimation from bulk RNA-Seq reads aligned to a reference genome.
-StringTie takes as input RNA-seq read alignments in coordinate-sorted SAM/BAM/CRAM format and produces a GTF output which consists of assembled
+Stringtie employs efficient algorithms for transcript structure recovery and abundance estimation from bulk RNA-Seq reads aligned to a reference genome.
+It takes as input RNA-seq read alignments in coordinate-sorted SAM/BAM/CRAM format and produces a GTF output which consists of assembled
 transcript structures and their estimated expression levels (FPKM/TPM and base coverage values).

 For additional StringTie documentation and the latest official source and binary packages please refer to the official website: <https://ccb.jhu.edu/software/stringtie>

 ## Obtaining and installing StringTie

-Source and binary packages for this software, along with a small test data set
-can be directly downloaded from the [Releases](https://github.com/gpertea/stringtie/releases) page for this repository.
+Source and binary packages for this software can be directly downloaded from the [Releases](https://github.com/gpertea/stringtie/releases) page for this repository.
 StringTie is compatible with a wide range of Linux and Apple OS systems.
 The main program (StringTie) does not have any other library dependencies (besides zlib) and in order to compile it from source it requires
 a C++ compiler which supports the C++ 11 standard (GCC 4.8 or newer).
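
The hunk above describes the GTF output as carrying assembled transcript structures together with FPKM/TPM and base coverage values. As a rough sketch of how such a record might be read downstream, the Python snippet below parses a single transcript line. The sample line and its values are made up for illustration, the helper name `parse_transcript_line` is hypothetical, and the attribute names `cov`, `FPKM` and `TPM` are assumed to match what StringTie writes in the attribute column.

```python
import re

# Illustrative transcript record in the style of StringTie GTF output.
# The coordinates and expression values are invented for this example.
SAMPLE_LINE = (
    'chrX\tStringTie\ttranscript\t1000\t5000\t1000\t+\t.\t'
    'gene_id "STRG.1"; transcript_id "STRG.1.1"; '
    'cov "12.5"; FPKM "3.25"; TPM "4.75";'
)

# GTF attributes are written as: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_transcript_line(line):
    """Return a dict for a GTF 'transcript' line, or None for anything else."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9 or fields[2] != "transcript":
        return None
    attrs = dict(ATTR_RE.findall(fields[8]))

    def as_float(key):
        # Assumed attribute names; missing keys yield None instead of failing.
        return float(attrs[key]) if key in attrs else None

    return {
        "seqname": fields[0],
        "start": int(fields[3]),
        "end": int(fields[4]),
        "strand": fields[6],
        "gene_id": attrs.get("gene_id"),
        "transcript_id": attrs.get("transcript_id"),
        "coverage": as_float("cov"),
        "fpkm": as_float("FPKM"),
        "tpm": as_float("TPM"),
    }


record = parse_transcript_line(SAMPLE_LINE)
print(record["transcript_id"], record["coverage"], record["fpkm"], record["tpm"])
```

Comment lines and `exon` records return `None` here, so the same helper could be applied to every line of an output file and the `None` results filtered out.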
@@ -62,13 +61,14 @@ __Note__: if the `--mix` option is used, StringTie expects two alignment files t
-Note that the command line parser in StringTie allows arbitrary order and mixing of the input positional parameters with the other options of the program, so the input alignment files can precede or be given in between the other options, so the following command line if equivalent to the one above:
+Note that the command line parser in StringTie allows arbitrary order and mixing of the positional parameters with the other options of the program, so the input alignment files can also precede or be given in between the other options -- the following command line is equivalent to the one above:
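
The command lines this note refers to are not included in the diff, and StringTie's own parser is not shown either, so the following is only an analogy: a small Python `argparse` sketch in which positional alignment files can be mixed freely with options. The option letters `-o` and `-G` are taken from the usage text in this diff; the file names, the `build_parser` and `parse` helpers, and the `stringtie-like` program name are hypothetical.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for a StringTie-like command line.
    # Only -o and -G are modeled; this is not StringTie's actual parser.
    parser = argparse.ArgumentParser(prog="stringtie-like")
    parser.add_argument("-o", dest="output", help="output GTF path")
    parser.add_argument("-G", dest="guide", help="reference annotation (GTF/GFF)")
    parser.add_argument("alignments", nargs="+", help="input alignment file(s)")
    return parser


def parse(argv):
    # parse_intermixed_args accepts positionals before, between, or after options.
    return vars(build_parser().parse_intermixed_args(argv))


options_first = parse(["-o", "out.gtf", "-G", "annot.gff", "reads.bam"])
input_first = parse(["reads.bam", "-o", "out.gtf", "-G", "annot.gff"])
input_between = parse(["-o", "out.gtf", "reads.bam", "-G", "annot.gff"])

# All three orderings resolve to the same settings.
assert options_first == input_first == input_between
print(options_first)
```

The same approach keeps two input files in their given order even when options sit between them, which matters for a case like `--mix`, where the second file is expected to hold the long read alignments.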
 ### Running StringTie on the provided test/demo data
+
 When building from this source repository, after the program was compiled with `make release` as instructed above, the generated binary can be tested on a small data set with a command like this:
 The above runs should take around one second each on a regular Linux or MacOS desktop.
-(see also <a href="https://github.com/gpertea/stringtie/blob/master/test_data/README.md">test_data/README.md</a>).
-
 For very large data sets one can expect up to one hour of processing time. A minimum of 8GB of RAM is recommended for running StringTie on regular size RNA-Seq samples, with 16 GB or more being strongly advised for larger data sets.


 ### StringTie options

-The following optional parameters can be specified (use -h/--help to get the
-usage message):
+The following optional parameters can be specified (use `-h` or `--help` to get the complete usage message):
```
+Options:
  --version : print just the version at stdout and exit
- --conservative : conservative transcriptome assembly, same as -t -c 1.5 -f 0.05
- --rf assume stranded library fr-firststrand
- --fr assume stranded library fr-secondstrand
- -G reference annotation to use for guiding the assembly process (GTF/GFF3)
+ --conservative : conservative transcript assembly, same as -t -c 1.5 -f 0.05
+ --mix : both short and long read data alignments are provided
+         (long read alignments must be the 2nd BAM/CRAM input file)
+ --rf : assume stranded library fr-firststrand
+ --fr : assume stranded library fr-secondstrand
+ -G reference annotation to use for guiding the assembly process (GTF/GFF)
+ --ptf : load point-features from a given 4 column feature file <f_tab>
  -o output path/file name for the assembled transcripts GTF (default: stdout)
  -l name prefix for output transcripts (default: STRG)
  -f minimum isoform fraction (default: 0.01)
- -L use long reads settings (default:false)
+ -L long reads processing; also enforces -s 1.5 -g 0 (default:false)
+ -R if long reads are provided, just clean and collapse the reads but