README.md: 8 additions & 1 deletion
@@ -1,4 +1,4 @@
-#TEPIC (version 2.1)
+#TEPIC (version 2.2)
 -------
 TEPIC offers workflows for the prediction and analysis of Transcription Factor (TF) binding sites including:
 * TF affinity computation in user provided regions
@@ -10,6 +10,8 @@ A graphical overview on the workflows of TEPIC is shown below. Blue font indicat
 ## News
+08.10.2019: We present a novel feature to include TFBS in regulatory sites determined by chromatin conformation capture data. Using an extended feature space representation, the INVOKE model can investigate the regulatory influence of TFs bound to promoters and enhancers separately.
+
 10.10.2018: TEPIC 2.0 is now published in [Bioinformatics](https://doi.org/10.1093/bioinformatics/bty856).

 13.08.2018: In addition to the gene-centric annotation, the functionality for transcript based annotation has been added.
@@ -156,6 +158,11 @@ Here, thresholded TF affinities are used for the computation.
+The *Prefix_Conformation_Data_Affinity_Three_Peak_Based_Features_Gene_View.txt* files are based on the previous structure but extend it by including the same features, that is, TF gene scores and peak features, determined for DHSs residing in chromatin loops:
+
+GENEID TF1 TF2 ... TFn peak length peak count peak signal LR_TF1 ... LR_TFn LR_peak length LR_peak count LR_peak signal
 The *Prefix_Thresholded_Sparse_Affinity_Gene_View.txt* files are tab-separated files listing the Ensembl gene ID in the first column, and the name of the TF associated with this gene in the second column.
 Here, thresholded TF affinities are used for the computation. The third column of this file is required by DREM and does not carry any specific meaning.
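
The extended Gene_View layout described in this hunk can be illustrated with a minimal Python sketch that separates the promoter-window columns from the loop-derived `LR_` columns. It assumes the file is tab-separated and that loop features are marked only by the `LR_` prefix shown in the header above. The function name `split_feature_columns` and the example header are hypothetical and not part of TEPIC.

```python
def split_feature_columns(header_line, sep="\t"):
    """Split a Gene_View header into window-based and loop-based (LR_) columns.

    Assumption: the first column holds the gene ID and every loop-derived
    feature name starts with the prefix "LR_".
    """
    columns = header_line.rstrip("\n").split(sep)
    gene_id_column, features = columns[0], columns[1:]
    loop_columns = [c for c in features if c.startswith("LR_")]
    window_columns = [c for c in features if not c.startswith("LR_")]
    return gene_id_column, window_columns, loop_columns


# Hypothetical header with two TFs, following the layout shown above.
example_header = "\t".join([
    "GENEID", "TF1", "TF2", "peak length", "peak count", "peak signal",
    "LR_TF1", "LR_TF2", "LR_peak length", "LR_peak count", "LR_peak signal",
])
gene_id_column, window_columns, loop_columns = split_feature_columns(example_header)
```

Keeping the two groups apart mirrors the separation of promoter-bound and loop-bound features that the extended feature space is meant to expose.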
 Furthermore, TEPIC can compute a TF-specific affinity cut-off derived from either user-defined or randomly generated sequences to distinguish likely bound sites from unbound sites. These scores
 can be used to come up with a binary TF-gene assignment. Further details on this mode are provided in Section \ref{EPIC-DREM}.
+With version $2.2$ of TEPIC, we introduced support for the inclusion of long-range chromatin conformation capture data. In addition to the promoter-centric windows used before, we
+calculate TF affinities $a_{g,i}^*$ and peak scores $pl_g^*, pc_g^*, ps_g^*$ for all DHSs residing in genomic loci looping into the promoter region of a gene, summarized in $P_{g,V_g}$, where $V_g$ is the set of all regions looped into the promoter region of gene $g$:
+\begin{align}
+a_{g,i}^*&=\sum_{p \in P_{g,V_g}} a_{p,i},\\
+pl_g^*&=\sum_{p \in P_{g,V_g}}|p|, \\
+pc_g^*&=\sum_{p \in P_{g,V_g}} 1, \\
+ps_g^*&=\sum_{p \in P_{g,V_g}}s_{p}.
+\end{align}
+Note that scores computed for $p \in P_{g,V_g}$ never include the exponential decay, because a direct interaction of the respective sites with the promoter region of gene $g$ has been determined by chromatin conformation capture experiments.
+
 \subsection{Required input}
-To compute TF gene scores a user needs to specify:
+To compute TF gene scores, a user needs to specify:
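
The loop-based scores introduced in this hunk are plain sums over the DHSs in $P_{g,V_g}$, without any distance decay. A minimal Python sketch of that aggregation follows. It is an illustration under stated assumptions, not TEPIC's implementation: the `Peak` class and `loop_scores` function are hypothetical names, and the peak count is taken to be the number of DHSs in the set.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Peak:
    """One DHS p located in a region looped into the promoter of gene g."""
    length: int                   # |p|
    signal: float                 # s_p
    affinities: Dict[str, float]  # a_{p,i}, keyed by TF name


def loop_scores(peaks: List[Peak]) -> dict:
    """Aggregate per-gene loop scores as undecayed sums over all peaks."""
    tf_affinity: Dict[str, float] = {}
    for peak in peaks:
        for tf, affinity in peak.affinities.items():
            tf_affinity[tf] = tf_affinity.get(tf, 0.0) + affinity
    return {
        "affinity": tf_affinity,                        # a*_{g,i}
        "peak_length": sum(p.length for p in peaks),    # pl*_g
        "peak_count": len(peaks),                       # pc*_g (assumed: number of DHSs)
        "peak_signal": sum(p.signal for p in peaks),    # ps*_g
    }


# Hypothetical input: two DHSs looped into the promoter of one gene.
example_peaks = [
    Peak(length=100, signal=1.5, affinities={"TF1": 0.5, "TF2": 0.25}),
    Peak(length=250, signal=2.5, affinities={"TF1": 1.0}),
]
scores = loop_scores(example_peaks)
```

No weighting by distance appears anywhere in the sketch, which matches the note that looped sites are treated as direct contacts of the promoter.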