README.md
Lines changed: 95 additions & 24 deletions
Original file line number
Diff line number
Diff line change
@@ -1,40 +1,36 @@
1
1
## Obtaining and installing StringTie
2
2
3
-
In order to build StringTie from this GitHub repository
4
-
the following steps can be taken:
3
+
Source and binary packages for this software, along with a small test data set
4
+
can be directly downloaded from the <a href="https://github.com/mpertea/stringtie2/releases">Releases</a> page for this repository. StringTie is compatible with a wide range of Linux and Apple OS X systems (going as far back as RedHat Enterprise Linux 5.0 and OS X 10.7). The main program (StringTie) has no other library dependencies; compiling it from source requires only a C++ compiler that supports the C++0x standard (GCC 4.5 or newer).
5
+
6
+
### Building the latest version from the repository
7
+
In order to compile the StringTie source in this GitHub repository the following steps can be taken:
5
8
6
9
```
7
10
git clone https://github.com/mpertea/stringtie2
8
11
cd stringtie2
9
12
make release
10
13
```
11
14
12
-
Note that simply running `make` will produce an executable
13
-
which is more suitable for debugging and runtime checking but which can be
14
-
significantly slower than the optimized version which is obtained by using
15
-
`make release`.
15
+
If the compilation is successful, the resulting `stringtie` binary can then be copied to
16
+
a programs directory of choice.
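As a minimal sketch of the copy step, assuming the binary was built in the repository root and that `$HOME/bin` is the chosen programs directory (both are illustrative assumptions, and `install_binary` is a hypothetical helper, not part of the StringTie build system):

```shell
# Hypothetical helper: copy a freshly built binary into a directory of choice.
# Neither the function name nor the paths come from the StringTie repository.
install_binary() {
    src="$1"        # path to the compiled binary, e.g. ./stringtie
    dest_dir="$2"   # target programs directory, e.g. "$HOME/bin"
    # Refuse to continue if the build did not produce the binary.
    [ -f "$src" ] || { echo "error: $src not found" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    cp "$src" "$dest_dir/" || return 1
    # Make sure the copied file is executable.
    chmod +x "$dest_dir/$(basename "$src")"
}

# Example call from the repository root after `make release`:
#   install_binary ./stringtie "$HOME/bin"
#   export PATH="$HOME/bin:$PATH"
```

Adding the target directory to `PATH`, as in the commented example, lets the program be invoked simply as `stringtie` from any working directory.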
16
17
18
+
Installation of StringTie this way should take less than a minute on a regular Linux or Apple MacOS
19
+
desktop.
17
20
18
-
### Installation of the super-reads module (optional)
21
+
Note that simply running `make` would produce an executable which is more suitable for debugging
22
+
and runtime checking but which can be significantly slower than the optimized version which
23
+
is obtained by using `make release` as instructed above.
19
24
20
-
This optional module can be used to de-novo assemble, align and pre-process
21
-
RNA-Seq reads, preparing them to be used as "super-reads" by Stringtie.
25
+
### Using pre-compiled (binary) releases
26
+
Instead of compiling from source, some users may prefer to download an already compiled binary for Linux
27
+
and Apple OS X, ready to run. These binary package releases are compiled on older versions of these
28
+
operating systems (RedHat Enterprise Linux 5.0 and OS X 10.7) in order to provide compatibility with
29
+
a wide range of (older) OS versions, not just the most recent versions. These precompiled packages are
30
+
made available on the <a href="https://github.com/mpertea/stringtie2/releases">Releases</a> page for this repository.
31
+
Please note that these binary packages do not include the optional [super-reads module](#the-super-reads-module),
32
+
which currently can only be built on Linux machines, from the source made available in this repository.
22
33
23
-
More detailed information is provided in the SuperReads_RNA/README.md
24
-
Quick installation instructions for this module (assuming the above Stringtie installation
25
-
was completed):
26
-
27
-
```
28
-
cd SuperReads_RNA
29
-
./install.sh
30
-
```
31
-
32
-
#### Using super-reads with Stringtie
33
-
34
-
After running the super-reads module (see the SuperRead_RNA/README.md file for usage details), there
35
-
is a BAM file which contains sorted alignment for both short reads and super-reads, called *`sr_merge.bam`*,
36
-
created in the selected output directory. This file can be directly given as the main input file
37
-
to StringTie as described in the _Running Stringtie_ section below.
38
34
39
35
## Running StringTie
40
36
@@ -46,6 +42,56 @@ The main input of the program is a SAMTools BAM file with RNA-Seq mappings
46
42
sorted by genomic location (for example the accepted_hits.bam file produced
47
43
by TopHat).
48
44
45
+
The main output of the program is a GTF file containing the structural definitions of the transcripts assembled by StringTie from the read alignment data. The name of the output file should be specified with the `-o` option.
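A quick sanity check on the output GTF might look like the sketch below. It assumes the standard GTF layout, in which the third tab-separated column holds the feature type and assembled transcripts appear as `transcript` records; `count_transcripts` is a hypothetical helper name introduced only for illustration.

```shell
# Hypothetical helper: count the transcript records in a GTF file.
# The feature type is read from the third tab-separated column;
# comment lines contain no tabs, so they never match.
count_transcripts() {
    awk -F '\t' '$3 == "transcript" { n++ } END { print n + 0 }' "$1"
}

# Example call on an output file written with `-o`:
#   count_transcripts short_reads.out.gtf
```

A count of zero on a non-trivial input usually means the run should be inspected before any downstream analysis.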
46
+
47
+
### Running StringTie on the provided test/demo data
48
+
When building from this source repository, after the program has been compiled with `make release` as instructed above, the generated binary can be tested on a small data set with a command like this:
49
+
```
50
+
make test
51
+
```
52
+
This will run the included `run_tests.sh` script which downloads a small test data set
53
+
and runs a few simple tests to ensure that the program works and generates the expected output.
54
+
55
+
If a pre-compiled package is used instead of compiling the program from source, the `run_tests.sh` script is included in the binary package as well and it can be run immediately after unpacking the binary package:
56
+
57
+
```
58
+
tar -xvzf stringtie-2.0.Linux_x86_64.tar.gz
59
+
cd stringtie-2.0.Linux_x86_64
60
+
./run_tests.sh
61
+
```
62
+
63
+
These small test/demo data sets can also be downloaded separately as <a href="https://github.com/mpertea/stringtie2/releases/download/v2.0/test_data.tar.gz">test_data.tar.gz</a> along with the source package and pre-compiled packages on the <a href="https://github.com/mpertea/stringtie2/releases">Releases</a> page for this repository.
64
+
65
+
The tests can also be run manually as shown below (after changing to the _test_data_ directory, `cd test_data`):
66
+
67
+
#### Run 1: Input consists of only alignments of short reads
68
+
```
69
+
stringtie -o short_reads.out.gtf short_reads.bam
70
+
```
71
+
72
+
#### Run 2: Input consists of alignments of short reads and superreads
The above runs should take around one second each on a regular Linux or MacOS desktop.
88
+
(see also <a href="https://github.com/mpertea/stringtie2/blob/master/test_data/README.md">test_data/README.md</a>).
89
+
90
+
For very large data sets one can expect up to one hour of processing time. A minimum of 8 GB of RAM is recommended for running StringTie on regular-size RNA-Seq samples, with 16 GB or more being strongly advised for larger data sets.
91
+
92
+
93
+
### StringTie options
94
+
49
95
The following optional parameters can be specified (use -h/--help to get the
50
96
usage message):
51
97
```
@@ -134,3 +180,28 @@ _`reference_id`_ GTF attribute in the output file . Other transcripts assembled
134
180
the input alignment data by StringTie and not present in the reference file will be
135
181
printed as well ("novel" transcripts).
136
182
183
+
## The super-reads module
184
+
185
+
This optional module can be used to de-novo assemble, align and pre-process
186
+
RNA-Seq reads, preparing them to be used as "super-reads" by StringTie.
Quick installation instructions for this module from the source available on this repository
191
+
(assuming the above StringTie installation was completed):
192
+
193
+
```
194
+
cd SuperReads_RNA
195
+
./install.sh
196
+
```
197
+
198
+
### Using super-reads with StringTie
199
+
200
+
After running the super-reads module (see the <a href="https://github.com/mpertea/stringtie2/blob/master/SuperReads_RNA/README.md">SuperReads_RNA</a> module documentation for usage details), there
201
+
is a BAM file which contains sorted alignments for both short reads and super-reads, called *`sr_merge.bam`*,
202
+
created in the selected output directory. This file can be directly given as the main input file
203
+
to StringTie as described in the [Running StringTie](#running-stringtie) section above.
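The hand-off from the super-reads module to StringTie could be sketched as follows. This assumes `stringtie` is on the `PATH`; the wrapper name `assemble_from_superreads`, the directory `./sr_output` and the output file name are hypothetical and chosen only for illustration. Only the `-o` option and the `sr_merge.bam` file name come from the text above.

```shell
# Hypothetical wrapper: assemble transcripts from the merged super-reads BAM.
# sr_dir is the output directory selected when running the super-reads module.
assemble_from_superreads() {
    sr_dir="$1"
    out_gtf="$2"
    bam="$sr_dir/sr_merge.bam"
    # Stop early if the super-reads module has not produced its merged BAM.
    [ -f "$bam" ] || { echo "error: $bam not found" >&2; return 1; }
    # Main input is the sorted BAM; -o names the output GTF file.
    stringtie -o "$out_gtf" "$bam"
}

# Example call (directory and file names are placeholders):
#   assemble_from_superreads ./sr_output superreads.out.gtf
```

Checking for the BAM file first gives a clearer error message than letting the assembler fail on a missing input.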
204
+
205
+
206
+
## License
207
+
StringTie is free, open source software released under the <a href="https://opensource.org/licenses/MIT">MIT License</a>.