Commit b6f0268

committed
sync mpertea/stringtie2 changes
1 parent d84cf1d commit b6f0268

404 files changed

Lines changed: 178357 additions & 142 deletions

README

Lines changed: 0 additions & 112 deletions
This file was deleted.

README.md

Lines changed: 95 additions & 24 deletions
@@ -1,40 +1,36 @@
11
## Obtaining and installing StringTie
22

3-
In order to build StringTie from this GitHub repository
4-
the following steps can be taken:
3+
Source and binary packages for this software, along with a small test data set
4+
can be directly downloaded from the <a href="https://github.com/mpertea/stringtie2/releases">Releases</a> page for this repository. StringTie is compatible with a wide range of Linux and Apple OS X systems (going as far back as RedHat Enterprise Linux 5.0 and OS X 10.7). The main program (StringTie) has no external library dependencies; compiling it from source requires only a C++ compiler that supports the C++0x standard (GCC 4.5 or newer).
5+
6+
### Building the latest version from the repository
7+
In order to compile the StringTie source in this GitHub repository the following steps can be taken:
58

69
```
710
git clone https://github.com/mpertea/stringtie2
811
cd stringtie2
912
make release
1013
```
1114

12-
Note that simply running `make` will produce an executable
13-
which is more suitable for debugging and runtime checking but which can be
14-
significantly slower than the optimized version which is obtained by using
15-
`make release`.
15+
If the compilation is successful, the resulting `stringtie` binary can then be copied to
16+
a programs directory of choice.
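
As a minimal sketch of that copy step, the helper below places a built binary in a destination directory. The function name `install_binary` and the default destination `$HOME/bin` are illustrative assumptions, not part of StringTie or its Makefile.

```shell
# Hypothetical helper: copy a freshly built binary into a directory of choice.
# The default destination "$HOME/bin" is an assumption, not a StringTie convention.
install_binary() {
    src="$1"
    dest="${2:-$HOME/bin}"
    if [ ! -f "$src" ]; then
        echo "error: $src not found; did the build succeed?" >&2
        return 1
    fi
    mkdir -p "$dest" || return 1
    cp "$src" "$dest/" || return 1
    # Make sure the copied file is executable.
    chmod +x "$dest/$(basename "$src")"
}

# Example (run from the stringtie2 directory after `make release`):
#   install_binary ./stringtie "$HOME/bin"
#   export PATH="$HOME/bin:$PATH"
```

Adding the destination directory to `PATH`, as in the commented example, lets `stringtie` be invoked without a full path.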
1617

18+
Installation of StringTie this way should take less than a minute on a regular Linux or Apple macOS
19+
desktop.
1720

18-
### Installation of the super-reads module (optional)
21+
Note that simply running `make` would produce an executable which is more suitable for debugging
22+
and runtime checking but which can be significantly slower than the optimized version which
23+
is obtained by using `make release` as instructed above.
1924

20-
This optional module can be used to de-novo assemble, align and pre-process
21-
RNA-Seq reads, preparing them to be used as "super-reads" by Stringtie.
25+
### Using pre-compiled (binary) releases
26+
Instead of compiling from source, some users may prefer to download an already compiled binary for Linux
27+
and Apple OS X, ready to run. These binary package releases are compiled on older versions of these
28+
operating systems (RedHat Enterprise Linux 5.0 and OS X 10.7) in order to provide compatibility with
29+
a wide range of (older) OS versions, not just the most recent versions. These precompiled packages are
30+
made available on the <a href="https://github.com/mpertea/stringtie2/releases">Releases</a> page for this repository.
31+
Please note that these binary packages do not include the optional [super-reads module](#the-super-reads-module),
32+
which currently can only be built on Linux machines, from the source made available in this repository.
2233

23-
Mode detailed information is provided in the SuperReads_RNA/README.md
24-
Quick installation instructions for this module (assuming the above Stringtie installation
25-
was completed):
26-
27-
```
28-
cd SuperReads_RNA
29-
./install.sh
30-
```
31-
32-
#### Using super-reads with Stringtie
33-
34-
After running the super-reads module (see the SuperRead_RNA/README.md file for usage details), there
35-
is a BAM file which contains sorted alignment for both short reads and super-reads, called *`sr_merge.bam`*,
36-
created in the selected output directory. This file can be directly given as the main input file
37-
to StringTie as described in the _Running Stringtie_ section below.
3834

3935
## Running StringTie
4036

@@ -46,6 +42,56 @@ The main input of the program is a SAMTools BAM file with RNA-Seq mappings
4642
sorted by genomic location (for example the accepted_hits.bam file produced
4743
by TopHat).
4844

45+
The main output of the program is a GTF file containing the structural definitions of the transcripts assembled by StringTie from the read alignment data. The name of the output file should be specified with the `-o` option.
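
A quick way to inspect such an output file is to count its transcript records. The sketch below assumes the standard tab-separated GTF layout, where the third column holds the feature type, and skips any header lines starting with `#`; the function name `count_transcripts` is hypothetical.

```shell
# Count transcript records in a GTF file (column 3 is the feature type).
# Lines starting with '#' are treated as header comments and skipped.
count_transcripts() {
    awk -F'\t' '!/^#/ && $3 == "transcript" { n++ } END { print n + 0 }' "$1"
}

# Example:
#   count_transcripts short_reads.out.gtf
```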
46+
47+
### Running StringTie on the provided test/demo data
48+
When building from this source repository, after the program has been compiled with `make release` as instructed above, the generated binary can be tested on a small data set with a command like this:
49+
```
50+
make test
51+
```
52+
This will run the included `run_tests.sh` script which downloads a small test data set
53+
and runs a few simple tests to ensure that the program works and generates the expected output.
54+
55+
If a pre-compiled package is used instead of compiling the program from source, the `run_tests.sh` script is included in the binary package as well and it can be run immediately after unpacking the binary package:
56+
57+
```
58+
tar -xvzf stringtie-2.0.Linux_x86_64.tar.gz
59+
cd stringtie-2.0.Linux_x86_64
60+
./run_tests.sh
61+
```
62+
63+
These small test/demo data sets can also be downloaded separately as <a href="https://github.com/mpertea/stringtie2/releases/download/v2.0/test_data.tar.gz">test_data.tar.gz</a> along with the source package and pre-compiled packages on the <a href="https://github.com/mpertea/stringtie2/releases">Releases</a> page for this repository.
64+
65+
The tests can also be run manually as shown below (after changing to the _test_data_ directory, `cd test_data`):
66+
67+
#### Run 1: Input consists of only alignments of short reads
68+
```
69+
stringtie -o short_reads.out.gtf short_reads.bam
70+
```
71+
72+
#### Run 2: Input consists of alignments of short reads and superreads
73+
```
74+
stringtie -o short_reads_and_superreads.out.gtf short_reads_and_superreads.bam
75+
```
76+
77+
#### Run 3: Input consists of alignments of long reads
78+
```
79+
stringtie -L -o long_reads.out.gtf long_reads.bam
80+
```
81+
82+
#### Run 4: Input consists of alignments of long reads and reference annotation (guides)
83+
```
84+
stringtie -L -G human-chr19_P.gff -o long_reads_guided.out.gtf long_reads.bam
85+
```
86+
87+
The above runs should take around one second each on a regular Linux or macOS desktop
88+
(see also <a href="https://github.com/mpertea/stringtie2/blob/master/test_data/README.md">test_data/README.md</a>).
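
The four manual runs above can also be chained in one small wrapper, sketched here. The function name `run_demo` and the `STRINGTIE` override variable are assumptions made for illustration; the commands themselves are the ones listed above and must be run from the _test_data_ directory.

```shell
# Path to the executable; override with STRINGTIE=/path/to/stringtie if needed.
# (The STRINGTIE variable is a convention of this sketch, not of StringTie.)
STRINGTIE="${STRINGTIE:-stringtie}"

# Run the four demo commands in order, stopping at the first failure.
run_demo() {
    # Run 1: short reads only.
    "$STRINGTIE" -o short_reads.out.gtf short_reads.bam || return 1
    # Run 2: short reads and super-reads.
    "$STRINGTIE" -o short_reads_and_superreads.out.gtf short_reads_and_superreads.bam || return 1
    # Run 3: long reads (-L).
    "$STRINGTIE" -L -o long_reads.out.gtf long_reads.bam || return 1
    # Run 4: long reads with reference annotation as guides (-G).
    "$STRINGTIE" -L -G human-chr19_P.gff -o long_reads_guided.out.gtf long_reads.bam
}

# Example:
#   cd test_data && run_demo
```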
89+
90+
For very large data sets one can expect up to one hour of processing time. A minimum of 8 GB of RAM is recommended for running StringTie on regular-size RNA-Seq samples, with 16 GB or more being strongly advised for larger data sets.
91+
92+
93+
### StringTie options
94+
4995
The following optional parameters can be specified (use -h/--help to get the
5096
usage message):
5197
```
@@ -134,3 +180,28 @@ _`reference_id`_ GTF attribute in the output file . Other transcripts assembled
134180
the input alignment data by StringTie and not present in the reference file will be
135181
printed as well ("novel" transcripts).
136182

183+
## The super-reads module
184+
185+
This optional module can be used to de-novo assemble, align and pre-process
186+
RNA-Seq reads, preparing them to be used as "super-reads" by StringTie.
187+
188+
More detailed information is provided in the
189+
<a href="https://github.com/mpertea/stringtie2/blob/master/SuperReads_RNA/README.md">SuperReads_RNA/README.md</a>.
190+
Quick installation instructions for this module from the source available on this repository
191+
(assuming the above StringTie installation was completed):
192+
193+
```
194+
cd SuperReads_RNA
195+
./install.sh
196+
```
197+
198+
### Using super-reads with Stringtie
199+
200+
After running the super-reads module (see the <a href="https://github.com/mpertea/stringtie2/blob/master/SuperReads_RNA/README.md">SuperReads_RNA</a> module documentation for usage details), there
201+
is a BAM file which contains sorted alignments for both short reads and super-reads, called *`sr_merge.bam`*,
202+
created in the selected output directory. This file can be directly given as the main input file
203+
to StringTie as described in the [Running StringTie](#running-stringtie) section above.
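
As a rough sketch of that last step, the helper below checks that `sr_merge.bam` exists in the chosen output directory and passes it to StringTie. The function name `assemble_super_reads`, the `STRINGTIE` override variable, and the output file name `sr_merge.out.gtf` are all hypothetical choices made for this example.

```shell
# Hypothetical helper: assemble transcripts from the merged BAM file
# produced by the super-reads module in the given output directory.
assemble_super_reads() {
    outdir="$1"
    bam="$outdir/sr_merge.bam"
    if [ ! -f "$bam" ]; then
        echo "error: $bam not found; run the super-reads module first" >&2
        return 1
    fi
    # The output name sr_merge.out.gtf is an arbitrary choice for this sketch.
    "${STRINGTIE:-stringtie}" -o "$outdir/sr_merge.out.gtf" "$bam"
}

# Example:
#   assemble_super_reads /path/to/super_reads_output
```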
204+
205+
206+
## License
207+
StringTie is free, open source software released under an <a href="https://opensource.org/licenses/MIT">MIT License</a>.
