docs/detailsInput.rst: 8 additions & 15 deletions
@@ -30,24 +30,23 @@ The p-value threshold for D\ :sub: `max` is per default set to 0.01. In our benc
Flag -k: dbSNP database (dbSNPs_sorted.txt.gz)
===============================================
-
To identify TFs which are more often affected than expected by chance in the given input SNP set, SNEEP can perform a statistical assessment to compare the result against proper random controls. To do so, the pipeline randomly samples SNPs from the `dbSNP database <??>`_ and rerun the analysis on these SNPs.
+
To identify TFs which are more often affected by the given data than one would expect on random data, SNEEP can perform a statistical assessment to compare the result against proper random controls. To do so, the pipeline randomly samples SNPs from the `dbSNP database <https://www.ncbi.nlm.nih.gov/snp/>`_ and reruns the analysis on these SNPs.
In order to sample the SNPs in a fast and efficient manner, we provide a file (in our `Zenodo repository <https://zenodo.org/record/4892591>`_) containing the SNPs of the dbSNP database. The file is a slightly modified version of the `publicly available one <https://ftp.ncbi.nlm.nih.gov/snp/latest_release/VCF/>`_ (file GCF_000001405.38). In detail, we
- removed all SNPs overlapping with a protein-coding region (annotation of the `human genome (GRCh38), version 36 (Ensembl 102) <https://www.gencodegenes.org/human/release_36.html>`_), (TODO: remove this sentence when zenodo dir is updated!)
- removed all information not important for SNEEP,
- removed mutations longer than 1 bp,
- and sorted SNPs according to their MAF distribution in ascending order.
-
Flag -r and -g: Epigenetic interactions
===============================================
-
We provide three files (in our `Zenodo repository <??>`_) containing epigenetic interactions associated to target genes:
+
We provide three files (in our `Zenodo repository <https://zenodo.org/record/4892591>`_) containing epigenetic interactions associated with target genes:
- interactionsREMs.txt provides regulatory elements (REMs) linked to their target genes. The data was derived with the STITCHIT algorithm, which is a peak-calling free approach to identify gene-specific REMs by analyzing epigenetic signal of diverse human cell types with regard to gene expression of a certain gene. For more information, you can also have a look at our public `EpiRegio database <https://epiregio.de>`_ holding all REMs stored in the interactionsREMs.txt file.
- interactionsREM_PRO.txt: In addition to the REMs, the promoters (+/- 500 bp around the TSS) of the genes are included as regions linked to their target genes.
- interactionsREMs_PRO_HiC.txt: This file further includes enhancer-gene links predicted with the ABC algorithm on human heart data from a `published paper from Anene-Nzelu *et al.* <https://www.ahajournals.org/doi/10.1161/CIRCULATIONAHA.120.046040?url_ver=Z39.88-2003&rfr_id=ori:rid:crossref.org&rfr_dat=cr_pub%20%200pubmed>`_.
-
It is also possible to use your own epigenetic interactions file or extend on of ours with for instance cell type specific data. Please stick to our tab-separated format:
+
It is also possible to use your own epigenetic interactions file (for instance, generated with STARE's gABC score computation) or extend one of ours with, for instance, cell-type-specific data. Please stick to our tab-separated format:
- chr of the linked region
- start of the linked region (0-based)
@@ -57,15 +56,9 @@ It is also possible to use your own epigenetic interactions file or extend on of
- 7 tab-separated dots (or additional information which you wish to keep -> displayed in the result.txt file but not in the summary pdf).
Further, a file which provides a mapping from ensemblID to gene name must be given. This file comes along with our GitHub repository.
-
-
Flag -s: Estimated scale parameters for the TFs used
Our modified Laplace distribution is dependent on two parameters: n, which is two times the length of the TF model and b, which needs to be estimated.
-
For the TF set we provide within our GitHub repository, we also estimated the scale parameter listed in XX.
-
In case a customized TF motif set is used, one also needs to estimate the scale parameter for each TF. Therefore we provide a script XXX (TODO: provide more details here).
-
Flag -a: Store Dmax values for all considered shifts
+
Flag -a: Store D\ :sub:`max` values for all considered shifts
If this flag is set, the resulting D-max value and the corresponding p-value are stored in <outputDir>/AllDiffBindAffinity.txt for all shifts that exceed the TF binding affinity p-value threshold.
@@ -74,7 +67,7 @@ Flag -f: Include open chromatin data
To consider only the SNPs which overlap with cell-type-specific open chromatin data, a peak file in BED format can be specified with this flag.
-
Flag -m: Get all Dmax values
+
Flag -m: Get all D\ :sub:`max` values
===============================
If this flag is set, all absolute maximal differential TF binding scores are printed (to the console) even if they do not exceed the specified p-value threshold. This flag is useful for estimating the scale parameter
@@ -86,12 +79,12 @@ In order to only consider the TFs which are expressed in your analysed cell type
Flag -j: Number of sampled background SNP sets
=================================================
-
With this flag the number of background rounds can be specified, default 0.
+
With this flag the number of background rounds can be specified. Default: -j 0.
Flag -l: Reproducible results for random background analysis