Commit 8407809

update doc
1 parent cc2c8f1 commit 8407809

2 files changed

+44
-32
lines changed

docs/references/index.rst

Lines changed: 6 additions & 0 deletions
@@ -21,6 +21,12 @@ Cheng, S., Pei, Y., He, L., Peng, G., Reinius, B., Tam, P.P., Jing, N. and Deng,
2121
reveals cellular heterogeneity of pluripotency transition and X chromosome dynamics during early mouse development.**
2222
*Cell reports*, **26(10)**. `https://doi.org/10.1016/j.celrep.2019.02.031 <https://doi.org/10.1016/j.celrep.2019.02.031>`_
2323

24+
Cosentino, S., Sriswasdi, S., and Iwasaki, W. (2024). **SonicParanoid2: fast, accurate, and comprehensive orthology
25+
inference with machine learning and language models.** *Genome Biology*, **25(1)**. `https://doi.org/10.1186/s13059-024-03298-4 <https://doi.org/10.1186/s13059-024-03298-4>`_
26+
27+
Derelle, R., Philippe, H., and Colbourne, J. K. (2020). **Broccoli: combining phylogenetic and network analyses for
28+
orthology assignment.** *Molecular Biology and Evolution*, **37(11)**. `https://doi.org/10.1093/molbev/msaa159 <https://doi.org/10.1093/molbev/msaa159>`_
29+
2430
Domazet-Loso, T., Brajkovic J. and Tautz D. (2007). **A phylostratigraphy approach to uncover the genomic history of
2531
major adaptations in metazoan lineages.** *Trends in Genetics*, **23(11)**. `https://doi.org/10.1016/j.tig.2007.08.014 <https://doi.org/10.1016/j.tig.2007.08.014>`_
2632

docs/tutorials/orthofinder.rst

Lines changed: 38 additions & 32 deletions
@@ -3,6 +3,13 @@
33
Step 0 - run OrthoFinder
44
========================
55

6+
.. note::
7+
As of version 0.0.2, `oggmap` can extract an orthomap directly from the results of the following tools:
8+
9+
- `OrthoFinder <https://github.com/davidemms/OrthoFinder>`_
10+
- `SonicParanoid2 <https://gitlab.com/salvo981/sonicparanoid2>`_
11+
- `Broccoli <https://github.com/rderelle/Broccoli>`_
12+
613
To extract an orthomap from `OrthoFinder <https://github.com/davidemms/OrthoFinder>`_ results, one first needs to run `OrthoFinder <https://github.com/davidemms/OrthoFinder>`_.
714

815
Mandatory OrthoFinder results files
@@ -43,9 +50,9 @@ Install OrthoFinder
4350
OrthoFinder installation using conda
4451
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
4552

46-
::
53+
.. code-block:: bash
4754
48-
conda install -c bioconda orthofinder
55+
conda install -c bioconda orthofinder
4956
5057
Run OrthoFinder
5158
---------------
@@ -56,11 +63,11 @@ The species peptide file can be pre-processed e.g. to just contain the longest i
5663

5764
To extract the longest isoform with `oggmap`:
5865

59-
::
66+
.. code-block:: bash
6067
61-
wget https://ftp.ensembl.org/pub/release-105/fasta/danio_rerio/cds/Danio_rerio.GRCz11.cds.all.fa.gz
62-
gunzip Danio_rerio.GRCz11.cds.all.fa.gz
63-
oggmap cds2aa -i Danio_rerio.GRCz11.cds.all.fa -r ENSEMBL -o Danio_rerio.GRCz11.aa.all.longest.fa
68+
wget https://ftp.ensembl.org/pub/release-113/fasta/danio_rerio/cds/Danio_rerio.GRCz11.cds.all.fa.gz
69+
gunzip Danio_rerio.GRCz11.cds.all.fa.gz
70+
oggmap cds2aa -i Danio_rerio.GRCz11.cds.all.fa -r ENSEMBL -o Danio_rerio.GRCz11.aa.all.longest.fa
6471
6572
.. warning::
6673
**By default, OrthoFinder uses diamond as the sequence search engine.** To increase sequence search sensitivity, use at least the '-S diamond_ultra_sens' option.
@@ -73,13 +80,13 @@ To extract the longest isoform `oggmap`
7380

7481
To change the 'diamond_ultra_sens' option of OrthoFinder, edit its `'config.json' <https://raw.githubusercontent.com/davidemms/OrthoFinder/master/scripts_of/config.json>`_ as follows:
7582

76-
::
83+
.. code-block:: console
7784
78-
"diamond_ultra_sens":{
79-
"program_type": "search",
80-
"db_cmd": "diamond makedb --ignore-warnings --in INPUT -d OUTPUT",
81-
"search_cmd": "diamond blastp --ignore-warnings -k0 -d DATABASE -q INPUT -o OUTPUT --ultra-sensitive -p 1 --quiet -e 0.001 --compress 1"
82-
},
85+
"diamond_ultra_sens":{
86+
"program_type": "search",
87+
"db_cmd": "diamond makedb --ignore-warnings --in INPUT -d OUTPUT",
88+
"search_cmd": "diamond blastp --ignore-warnings -k0 -d DATABASE -q INPUT -o OUTPUT --ultra-sensitive -p 1 --quiet -e 0.001 --compress 1"
89+
},
8390
8491
8592
Use LAST with OrthoFinder
@@ -98,30 +105,30 @@ or you might want to install with bioconda:
98105
To use `last <https://gitlab.com/mcfrith/last>`_ as a new sequence search engine,
99106
please change the 'config.json' as follows:
100107

101-
::
108+
.. code-block:: console
102109
103-
"last":{
104-
"program_type": "search",
105-
"db_cmd": "lastdb -p -cR01 OUTPUT INPUT",
106-
"search_cmd": "lastal -f BlastTab+ -D 1e6 DATABASE INPUT | sed -n '/^#/!p' > OUTPUT"
107-
},
110+
"last":{
111+
"program_type": "search",
112+
"db_cmd": "lastdb -p -cR01 OUTPUT INPUT",
113+
"search_cmd": "lastal -f BlastTab+ -D 1e6 DATABASE INPUT | sed -n '/^#/!p' > OUTPUT"
114+
},
108115
109116
110117
Typical run command
111118
-------------------
112119

113120
- using diamond
114121

115-
::
122+
.. code-block:: bash
116123
117-
orthofinder -t 32 -a 8 -og -o diamond_output/ -S diamond_ultra_sens -f folder_with_peptides/
124+
orthofinder -t 32 -a 8 -og -o diamond_output/ -S diamond_ultra_sens -f folder_with_peptides/
118125
119126
120127
- using last
121128

122-
::
129+
.. code-block:: bash
123130
124-
orthofinder -t 32 -a 8 -og -o last_output/ -S last -f folder_with_peptides/
131+
orthofinder -t 32 -a 8 -og -o last_output/ -S last -f folder_with_peptides/
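Before starting a long run, it can help to check that every species file in the peptide folder actually contains sequences. The following is only a minimal sketch: the temporary folder, the file names, the ``.fa`` extension and the helper ``count_records`` are hypothetical stand-ins for ``folder_with_peptides/``, not part of OrthoFinder or `oggmap`.

```shell
# Hypothetical toy folder standing in for folder_with_peptides/
pep_dir=$(mktemp -d)
printf '>geneA\nMKT\n>geneB\nMLL\n' > "$pep_dir/species1.fa"
printf '>geneC\nMAA\n' > "$pep_dir/species2.fa"

# Count FASTA records (header lines starting with ">") in one file
count_records() {
    grep -c '^>' "$1"
}

# Print one line per species file: file name, tab, number of records
for f in "$pep_dir"/*.fa; do
    printf '%s\t%s\n' "$(basename "$f")" "$(count_records "$f")"
done
```

A file reporting zero records would point to an empty or malformed input that is worth fixing before the run.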
125132
126133
127134
Adding a new species to an existing OrthoFinder result
@@ -151,28 +158,27 @@ ORF/CDS extraction can be done with e.g. `TransDecoder <https://github.com/Trans
151158
using `miniprot <https://github.com/lh3/miniprot>`_ with the "newer" annotated peptides followed by `miniprothint <https://github.com/tomasbruna/miniprothint>`_ or
152159
using `GALBA <https://github.com/Gaius-Augustus/GALBA>`_
153160

154-
::
155-
156-
miniprot dd_Smed_v6.pcf.contigs.fasta schmidtea_mediterranea.PRJNA12585.WBPS18.protein.fa --aln > miniprot.aln
157-
miniprot_boundary_scorer -o miniprot_parsed.gff -s blosum62.csv < miniprot.aln
158-
miniprothint.py miniprot_parsed.gff --workdir miniprothint
161+
.. code-block:: bash
159162
163+
miniprot dd_Smed_v6.pcf.contigs.fasta schmidtea_mediterranea.PRJNA12585.WBPS18.protein.fa --aln > miniprot.aln
164+
miniprot_boundary_scorer -o miniprot_parsed.gff -s blosum62.csv < miniprot.aln
165+
miniprothint.py miniprot_parsed.gff --workdir miniprothint
160166
161167
- extract and convert CDS into peptides from the given transcriptome
162168

163169
Extraction and direct conversion into peptides can be done with e.g. `gffread <https://github.com/gpertea/gffread>`_
164170

165171
Here, first the original contig IDs are added to the gene IDs so that later a mapping against the scRNA data is possible.
166172

167-
::
173+
.. code-block:: bash
168174
169-
awk -F '\t' -vOFS='\t' '{if($3=="mRNA"){gsub("ID=","ID="$1"::",$9)}; if($3!="mRNA"){gsub("Parent=", "Parent="$1"::", $9)}; print $0}' miniprot_parsed.gff > miniprot_parsed_IDs.gff
170-
gffread -x dd_Smed_v6_miniprot_parsed.x.fasta -y dd_Smed_v6_miniprot_parsed.pep.fasta -g dd_Smed_v6.pcf.contigs.fasta miniprot_parsed_IDs.gff
175+
awk -F '\t' -vOFS='\t' '{if($3=="mRNA"){gsub("ID=","ID="$1"::",$9)}; if($3!="mRNA"){gsub("Parent=", "Parent="$1"::", $9)}; print $0}' miniprot_parsed.gff > miniprot_parsed_IDs.gff
176+
gffread -x dd_Smed_v6_miniprot_parsed.x.fasta -y dd_Smed_v6_miniprot_parsed.pep.fasta -g dd_Smed_v6.pcf.contigs.fasta miniprot_parsed_IDs.gff
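To illustrate what the ``awk`` step does, the sketch below applies the same program to a two-line toy GFF. The contig name, feature IDs and coordinates are made up for illustration, and ``-v OFS`` is written with a space (the POSIX form) instead of ``-vOFS``.

```shell
# Toy two-line GFF with hypothetical IDs, tab-separated
toy_gff=$(printf 'contig1\tminiprot\tmRNA\t1\t90\t.\t+\t.\tID=MP000001;Rank=1\ncontig1\tminiprot\tCDS\t1\t90\t.\t+\t0\tParent=MP000001\n')

# Same awk program as above: prefix the contig name (column 1) to the
# ID of mRNA features and to the Parent of all other features
prefixed=$(printf '%s\n' "$toy_gff" | awk -F '\t' -v OFS='\t' '{if($3=="mRNA"){gsub("ID=","ID="$1"::",$9)}; if($3!="mRNA"){gsub("Parent=", "Parent="$1"::", $9)}; print $0}')

# Show only the attribute column (column 9)
printf '%s\n' "$prefixed" | cut -f 9
# ID=contig1::MP000001;Rank=1
# Parent=contig1::MP000001
```

Because the contig name becomes part of each ID, the peptide names written by ``gffread`` keep a handle back to the original contig.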
171177
172178
Now one can use the extracted peptides with `OrthoFinder <https://github.com/davidemms/OrthoFinder>`_ to add them to an existing `OrthoFinder <https://github.com/davidemms/OrthoFinder>`_ run.
173179

174180
- Place the new species peptide files in a separate folder
175181

176-
::
182+
.. code-block:: bash
177183
178-
orthofinder -t 32 -a 8 -og -S last -b last_output/Results_Sep13/WorkingDirectory/ -f new_species/
184+
orthofinder -t 32 -a 8 -og -S last -b last_output/Results_Sep13/WorkingDirectory/ -f new_species/
