SISTR v1.1.3 release #59

Merged
merged 40 commits into from
Nov 26, 2024
40 commits
d6aa19d
first SISTR v1.1.3 commit with serovar around O24/25 antigen update
kbessonov1984 Aug 29, 2024
aa2f217
fixed empty subspecies value for reverse antigen to serovar lookup fe…
kbessonov1984 Aug 30, 2024
77f0896
added Database section to README
kbessonov1984 Aug 30, 2024
f76294e
CI tests, list of serovars -l option
kbessonov1984 Sep 4, 2024
fd6a077
CI tests running on pytest
kbessonov1984 Sep 4, 2024
dc44034
pycurl added in CI and fixed regex for O24 and O25
kbessonov1984 Sep 4, 2024
329a6ba
pycurl added in CI and fixed regex for O24 and O25
kbessonov1984 Sep 4, 2024
745949a
pycurl added in CI and fixed regex for O24 and O25
kbessonov1984 Sep 4, 2024
58e98bb
python 3.10 for CI
kbessonov1984 Sep 4, 2024
b575d67
python 3.10 for CI
kbessonov1984 Sep 4, 2024
34af179
python 3.10 for CI
kbessonov1984 Sep 4, 2024
acc0e77
python 3.10 for CI
kbessonov1984 Sep 4, 2024
8836d5c
numpy no upper limit requirement
kbessonov1984 Sep 4, 2024
a5fed55
pycurl more relaxed conditions
kbessonov1984 Sep 4, 2024
43925cd
building
kbessonov1984 Sep 4, 2024
819baff
building
kbessonov1984 Sep 4, 2024
9e62be5
numpy <2
kbessonov1984 Sep 4, 2024
0f84165
numpy <2
kbessonov1984 Sep 4, 2024
635f1d0
Databases init more transparent
kbessonov1984 Sep 4, 2024
b451c3c
Databases init more transparent
kbessonov1984 Sep 4, 2024
e43911a
Databases init more transparent
kbessonov1984 Sep 4, 2024
0d31550
pre-release v1.1.3 commit tested against test dataset with O24 and O2…
kbessonov1984 Sep 4, 2024
1a28209
Updated README.md with urls to Databases on Zenodo.org and constants.…
kbessonov1984 Sep 5, 2024
7b6223f
Updated the CHANGELOG.md and README.md
kbessonov1984 Sep 6, 2024
7aeb28b
The O24 and O25 collapsed serovars antigenic formula now corrected in…
kbessonov1984 Sep 23, 2024
f1f5c33
serovar list feature unit tested and new tests written, changelog upd…
kbessonov1984 Sep 25, 2024
1bb4d55
fixed 'sistr_cmd/sistr/data' directory initalization for database ini…
kbessonov1984 Sep 25, 2024
df32c12
Updaetd the list-of-serovars field and fixed some log level messages
kbessonov1984 Sep 26, 2024
0956002
updated README with updated usage data
kbessonov1984 Sep 26, 2024
c78e1ae
updated README with updated usage data
kbessonov1984 Sep 26, 2024
3d4f56e
Updated README.md and CHANGELOG.md
kbessonov1984 Oct 1, 2024
5d9744f
Updated README.md and CHANGELOG.md
kbessonov1984 Oct 1, 2024
929bbbe
Updated README.md and CHANGELOG.md
kbessonov1984 Oct 1, 2024
aab046f
Updated README.md and CHANGELOG.md
kbessonov1984 Oct 1, 2024
3673f88
Updated serovar list log message to indicate which list is used for a…
kbessonov1984 Nov 22, 2024
dc88a91
improved IIIb O:61:k:1,5,7 reporting by removing O: from overall sero…
kbessonov1984 Nov 22, 2024
0eacf75
Added d-tartrate qc message for Paratyphi B, Paratyphi B var. Java an…
kbessonov1984 Nov 26, 2024
a02725e
fixed test_serotyping - KeyError: 'serovar_in_serovar_list' error
kbessonov1984 Nov 26, 2024
8430801
Update CHANGELOG.md
kbessonov1984 Nov 26, 2024
1a02167
Updated CHANGELOG.md
kbessonov1984 Nov 26, 2024
37 changes: 37 additions & 0 deletions .github/workflows/github-actions.yaml
@@ -0,0 +1,37 @@
# This workflow will install Python dependencies, run tests and lint with a single version of Python
# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python

name: Python application

on:
  push:
    branches: [ "master", "v1.1.3" ]
  pull_request:
    branches: [ "master", "v1.1.3" ]

permissions:
  contents: read

jobs:
  build:

    runs-on: ubuntu-22.04

    steps:
    - uses: actions/checkout@v4
    - name: Set up Python 3.10
      uses: actions/setup-python@v4
      with:
        python-version: "3.10"
    - name: Install dependencies
      run: |
        sudo apt-get update
        sudo apt-get install mash ncbi-blast+ libssl-dev libcurl4-openssl-dev mafft libssl-dev ca-certificates -y
        sudo apt-get install python3-pip python3-dev python3-biopython -y
        python3 -m pip install --upgrade pip setuptools
        pip3 install pytest fastcluster openpyxl pycurl pandas scipy "numpy<2"
        python setup.py install
        sistr_init
    - name: Test with pytest
      run: |
        pytest -o log_cli=true --basetemp=tmp-pytest
124 changes: 124 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,127 @@
# 1.1.3

Serovar nomenclature update following the USA cantaloupe outbreaks in November 2023. The O24 and O25 antigens could not be typed reliably in the wet lab, so certain serovar pairs were collapsed as detailed below (Table 1). For each pair, SISTR now reports only the selected serovar and drops the other. Neither O24 nor O25 is reported in the antigenic formula (Table 2).

<h3>Table 1 - serovar pairs that were collapsed</h3>

|Serovar pair | Serovar selected in v1.1.3 |
|------------------------|----------------------------|
|Soahanina - Sundsvall | Sundsvall |
|Martonos - Finkenwerder | Finkenwerder |
|Midway - Florida | Florida |
|Lindern - Charity | Charity |
|Bahrenfeld - Onderstepoort | Onderstepoort |
|Schalkwijk - Moussoro | Schalkwijk |
|Amberg - Boecker | Boecker |
| Carrau - Madelia | Carrau |
| Chichiri - Uzaramo | Uzaramo |
| Poano - Stafford | Poano |

### Changes of serovar assignments in `sistr/data/genomes-to-serovar.txt` file

|genome accession | serovar previous | serovar current |
|-----------------|------------------|-----------------|
| SAL_DA9822AA | Soahanina |Sundsvall
| SRR1815423 | Soahanina | Sundsvall
| SRR2889947 | Soahanina | Sundsvall
| SRR2889992 | Soahanina | Sundsvall
| SRR3996854 |Soahanina | Sundsvall
| SRR3669910 |Soahanina | Sundsvall
| SRR3732330 |Soahanina | Sundsvall
|SRR3713652 | Soahanina | Sundsvall
| SRR3713653 |Soahanina | Sundsvall
|SRR3978444 |Soahanina | Sundsvall
| SRR2011392 |Soahanina | Sundsvall
| SRR1068363 |Soahanina | Sundsvall
|ERR161888 |Soahanina | Sundsvall
|SAL_BA5034AA |Soahanina | Sundsvall
|SAL_EA3233AA |Soahanina | Sundsvall
|SAL_GA9094AA |Soahanina | Sundsvall
|SRR1158155 |Soahanina | Sundsvall
|SRR2751907 |Soahanina | Sundsvall
|SRR4237685 |Soahanina | Sundsvall
|SRR5010548 |Soahanina | Sundsvall
|09_6055 |Madelia | Carrau
|11_0879 |Madelia | Carrau
|SAL_BA1830AA |Madelia | Carrau
|SAL_CA7979AA |Madelia | Carrau
|SAL_DA4289AA |Madelia | Carrau
|SAL_DA7475AA |Madelia | Carrau
|SAL_EA4948AA |Madelia | Carrau
|SAL_FA5821AA |Madelia | Carrau
|SAL_HA4780AA |Madelia | Carrau
|SAL_HA4886AA |Madelia | Carrau
|SRR1269415 |Madelia | Carrau
|SRR1548430 |Madelia | Carrau
|SRR1805645| Madelia | Carrau
|SRR2104612 |Madelia | Carrau
|SRR2911800 |Madelia | Carrau
|SRR3933147 |Madelia | Carrau
|SRR4098716 |Madelia | Carrau
|SRR1258654 |Madelia | Carrau
|SRR1582141 |Madelia | Carrau
|SRR4019409 |Madelia | Carrau
|SRR4244476 |Madelia | Carrau
|SRR2075023 |Madelia | Carrau
|SRR5132365 |Madelia | Carrau
|SRR5051381 |Madelia | Carrau
|SRR3743984 |Madelia | Carrau
|SRR5054238 |Madelia | Carrau
|SRR3928735 |Madelia | Carrau
|SRR1586586 |Madelia | Carrau
|SRR2976043 |Madelia | Carrau
|SRR2962333 |Madelia | Carrau
|SRR3928732 |Madelia | Carrau
|SRR3928736 |Madelia | Carrau
|SRR2962332 |Madelia | Carrau
|SAL_EA2874AA |Bahrenfeld | Onderstepoort
|SAL_FA0525AA |Bahrenfeld | Onderstepoort
|SRR3173783 |Bahrenfeld | Onderstepoort
|SAL_DA7014AA |Martonos | Finkenwerder
|SRR1300569 |Martonos | Finkenwerder
|SRR1973814 |Martonos | Finkenwerder

### Changes to `Salmonella-serotype_serogroup_antigen_table-WHO_2007.csv` antigen to serovar lookup database
Removed the following entries
1. Martonos,"6,14,24",d,"1,5",,H,FALSE,enterica
2. Midway,"6,14,24",d,"1,7",,H,FALSE,enterica
3. Lindern,"6,14,[24]",d,"e,n,x",,H,FALSE,enterica
4. Bahrenfeld,"6,14,[24]","e,h","1,5",,H,FALSE,enterica
5. Moussoro,"1,6,14,25",i,"e,n,z15",,H,FALSE,enterica
6. Amberg,"6,14,24","l,v","1,7",,H,FALSE,enterica
7. Madelia,"1,6,14,25",y,"1,7",,H,FALSE,enterica
8. Soahanina,"6,14,24",z,"e,n,x",,H,FALSE,enterica
9. Chichiri,"6,14,24","z4,z24",-,,H,TRUE,enterica
10. II 4:a:z39,"1,4,12,[27]",a,z39,,B,FALSE,salamae

The `O_antigen` field of the following entries was modified as shown in Table 2.

<h3>Table 2 - updated antigenic formulas for the O24-25 serovars</h3>

| Before | After (SISTR v1.1.3)|
|--------|-------|
|Sundsvall,"[1],6,14,[<b>25</b>]",z,"e,n,x",,H,FALSE,enterica| Sundsvall,"6,14",z,"e,n,x",,H,FALSE,enterica |
|Finkenwerder,"[1],6,14,[<b>25</b>]",d,"1,5",,H,FALSE,enterica | Finkenwerder,"6,14",d,"1,5",,H,FALSE,enterica |
|Florida,"[1],6,14,[<b>25</b>]",d,"1,7",,H,FALSE,enterica | Florida,"6,14",d,"1,7",,H,FALSE,enterica |
| Charity,"[1],6,14,[<b>25</b>]",d,"e,n,x",,H,FALSE,enterica | Charity,"6,14",d,"e,n,x",,H,FALSE,enterica |
| Onderstepoort,"1,6,14,[<b>25</b>]","e,h","1,5",,H,FALSE,enterica | Onderstepoort,"6,14","e,h","1,5",,H,FALSE,enterica |
| Schalkwijk,"6,14,[<b>24</b>]",i,"e,n,z15",,H,FALSE,enterica | Schalkwijk,"6,14",i,"e,n,z15",,H,FALSE,enterica |
| Boecker,"[1],6,14,[<b>25</b>]","l,v","1,7",,H,FALSE,enterica |Boecker,"6,14","l,v","1,7",,H,FALSE,enterica |
| Carrau,"6,14,[<b>24</b>]",y,"1,7",,H,FALSE,enterica | Carrau,"6,14",y,"1,7",,H,FALSE,enterica |
| Uzaramo,"1,6,14,<b>25</b>","z4,z24",-,,H,TRUE,enterica | Uzaramo,"6,14","z4,z24",-,,H,TRUE,enterica |
| Poano,"[1],6,14,[<b>25</b>]",z,"l,z13,z28",,H,FALSE,enterica | Poano,"6,14",z,"l,z13,z28",,H,FALSE,enterica |
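The Before/After pairs in Table 2 can be reproduced with a small sketch. This is not the code SISTR uses (the release commits mention a regex fix for O24 and O25). The function name `collapse_o_antigen` and the set of dropped factors are assumptions chosen so that the rows above come out as listed, and the function should not be applied to serogroups where factor 1 is meaningful.

```python
import re

# Hypothetical: O factors removed from the collapsed O:6,14 serovars in Table 2.
DROPPED_FACTORS = {"1", "24", "25"}


def collapse_o_antigen(o_antigen: str) -> str:
    """Drop factors 1, 24 and 25 (bracketed or not) from an O antigen string."""
    kept = []
    for token in o_antigen.split(","):
        token = token.strip()
        # Compare on the bare factor, ignoring the optional-factor brackets.
        bare = re.sub(r"[\[\]]", "", token)
        if bare not in DROPPED_FACTORS:
            kept.append(token)
    return ",".join(kept)


print(collapse_o_antigen("[1],6,14,[25]"))
print(collapse_o_antigen("6,14,[24]"))
```

Splitting on commas and comparing bare factors keeps the sketch independent of whether a factor was written as optional (`[25]`) or required (`25`).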

### New output field `antigenic_formula`
- Added `antigenic_formula` field that aggregates the O, H1 and H2 antigen values in a single location for convenience
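As a rough illustration of what such an aggregated field could contain, the sketch below joins the three antigen values with colons. The colon-separated layout, the `-` placeholder for a missing value and the function name `build_antigenic_formula` are assumptions, not the documented output format.

```python
def build_antigenic_formula(o_antigen: str, h1: str, h2: str) -> str:
    """Join O, H1 and H2 into one string (assumed O:H1:H2 layout)."""
    # An empty value is shown as "-" (assumption).
    parts = [value if value else "-" for value in (o_antigen, h1, h2)]
    return ":".join(parts)


print(build_antigenic_formula("1,4,[5],12", "i", "1,2"))
```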

### New argument `--list-of-serovars`
- Added the `--list-of-serovars` option, which lets the user provide a single-column text file listing all serovars of interest to match against the SISTR prediction. The result is reported in the `predicted_serovar_in_list` field as `Y` if there is a match and `N` otherwise. This is useful when only a certain list of serovars may be reported.
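The lookup described above amounts to a set membership test. The following is a minimal sketch under assumptions: the helper names `parse_serovar_list` and `serovar_in_list` are hypothetical, and exact, case-sensitive matching is assumed rather than confirmed.

```python
def parse_serovar_list(lines):
    """Build a set of serovar names from the lines of a single-column file."""
    # Blank lines are skipped and surrounding whitespace is removed.
    return {line.strip() for line in lines if line.strip()}


def serovar_in_list(predicted: str, allowed: set) -> str:
    """Return "Y" if the predicted serovar is in the list, otherwise "N"."""
    return "Y" if predicted.strip() in allowed else "N"


allowed = parse_serovar_list(["Typhimurium\n", "\n", " Enteritidis \n"])
print(serovar_in_list("Typhimurium", allowed))
print(serovar_in_list("Carrau", allowed))
```

With a real file, the same helper can be fed an open file handle, for example `with open("serovars.txt") as fh: allowed = parse_serovar_list(fh)`, where `serovars.txt` is a placeholder name.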

### New d-tartrate message for `Paratyphi B`, `Paratyphi B var. Java` and `I 1,4,[5],12:i:-` serovars
- If the Paratyphi B or Paratyphi B var. Java serovar is predicted and `--qc` is selected, the following message appears in the `qc_messages` field: `Perform d-tartrate test (dT) to differentiate between Paratyphi B and Paratyphi B var. Java. The dT+ result is indicative of variant Java.`
- If the monophasic `I 1,4,[5],12:i:-` serovar is predicted, the `qc_messages` field suggests a d-tartrate test via this message: `Perform d-tartrate test (dT) as both dT+ and dT- I 1,4,[5],12:i:- subtypes exist.`
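The serovar-to-message mapping can be pictured as a simple dictionary lookup. In the sketch below the message texts are copied from the entries above, while the function name `d_tartrate_message`, the exact serovar keys and the assumption that all three messages depend on `--qc` are hypothetical.

```python
DT_PARATYPHI = (
    "Perform d-tartrate test (dT) to differentiate between Paratyphi B and "
    "Paratyphi B var. Java. The dT+ result is indicative of variant Java."
)
DT_MONOPHASIC = (
    "Perform d-tartrate test (dT) as both dT+ and dT- "
    "I 1,4,[5],12:i:- subtypes exist."
)

# Assumed serovar strings; the real tool may spell or match them differently.
DT_MESSAGES = {
    "Paratyphi B": DT_PARATYPHI,
    "Paratyphi B var. Java": DT_PARATYPHI,
    "I 1,4,[5],12:i:-": DT_MONOPHASIC,
}


def d_tartrate_message(serovar: str, qc_enabled: bool = True):
    """Return the d-tartrate QC message for a serovar, or None if none applies."""
    if not qc_enabled:
        return None
    return DT_MESSAGES.get(serovar)


print(d_tartrate_message("Paratyphi B var. Java"))
```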

# 1.1.1

* Fixed issue with sorting of BLAST results (causing cgMLST types to be different between BLAST versions). Pull request #43.
103 changes: 61 additions & 42 deletions README.rst
@@ -16,15 +16,15 @@
:target: https://sistr-app.onrender.com/
:alt: web app deployed on Render.com Cloud Hosting

Serovar predictions from whole-genome sequence assemblies by determination of antigen gene and cgMLST gene alleles using BLAST.
*Salmonella* serovar predictions from whole-genome sequence assemblies by determination of antigen gene and cgMLST gene alleles using BLAST.
`Mash MinHash <https://mash.readthedocs.io/en/latest/>`_ can also be used for serovar prediction.

.. epigraph::

`Latest stable version <https://github.com/phac-nml/sistr_cmd/releases/latest>`_


*Don't want to use a command-line app?* Try the `SISTR web app <https://github.com/phac-nml/sistr_cmd#web-application>`_ deployed on Galaxy and Render.com platforms
*Don't want to use a command-line app?* Try SISTR with interface deployed on Galaxy and Render.com online platforms (see the `Web application`_ section)


Citation
@@ -64,7 +64,7 @@ Installation
============

Using Conda [Recommended]
-----------
---------------------------

You can install ``sistr_cmd`` using `Conda <https://conda.io/miniconda.html>`_ from the `BioConda channel <https://bioconda.github.io/>`_:

@@ -115,10 +115,19 @@ SISTR can be publically accessed as a web application via:

- Galaxy EU instance at https://usegalaxy.eu/root?tool_id=sistr_cmd |galaxy|
- Render.com Cloud Hosting Platform-as-a-Service (PaaS) hosts a **DEMO** SISTR web application https://sistr-app.onrender.com/ |render|
**NOTE:** The SISTR web application hosted on Render.com might take up to 20 seconds to load on the first run and will shutdown after 15 min of inactivity

SISTR web application source code is available at https://github.com/phac-nml/sistr-web-app allowing easy web interface deployment on any infrastructure types (on-premises, cloud/remote).
**NOTE 1:** The SISTR web application hosted on Render.com might take up to 20 seconds to load on the first run and will shutdown after 15 min of inactivity

**NOTE 2:** SISTR web application source code is available at https://github.com/phac-nml/sistr-web-app allowing easy web interface deployment on any infrastructure types (on-premises, cloud/remote).


Database
=========
SISTR will automatically initialize a database of *Salmonella* serovar determination antigens, cgMLST profiles and a MASH sketch of reference genomes by downloading it from a remote location.
The SISTR database v1.3 received minor updates that collapse some of the serovars with O24/O25 antigens, as detailed in the `CHANGELOG.md <CHANGELOG.md>`_ file.

- SISTR v1.1 database is available at https://zenodo.org/records/13618515 or via a direct URL https://zenodo.org/records/13618515/files/SISTR_V_1.1_db.tar.gz?download=1 (used with SISTR < 1.1.3)
- SISTR v1.3 database is available at https://zenodo.org/records/13693495 or via a direct URL https://zenodo.org/records/13693495/files/SISTR_V_1.1.3_db.tar.gz?download=1 (used with SISTR >= 1.1.3)
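
The version rule in the list above can be written as a small helper. This is a sketch only: the names `parse_version` and `database_url` are hypothetical and do not describe how ``sistr_init`` selects its download location. The two URLs are the ones listed above.

```python
DB_V1_1_URL = "https://zenodo.org/records/13618515/files/SISTR_V_1.1_db.tar.gz?download=1"
DB_V1_1_3_URL = "https://zenodo.org/records/13693495/files/SISTR_V_1.1.3_db.tar.gz?download=1"


def parse_version(version: str) -> tuple:
    """Turn a dotted version such as "1.1.3" into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def database_url(sistr_version: str) -> str:
    """Pick the database archive matching the SISTR version."""
    # SISTR >= 1.1.3 uses the newer archive; older releases use the v1.1 one.
    if parse_version(sistr_version) >= (1, 1, 3):
        return DB_V1_1_3_URL
    return DB_V1_1_URL


print(database_url("1.1.1"))
print(database_url("1.1.3"))
```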


Dependencies
@@ -129,7 +138,7 @@ These are the external dependencies required for ``sistr_cmd``:
- Python (>= v2.7 OR >= v3.4)
- BLAST+ (>= v2.2.30)
- MAFFT (>=v7.271 (2016/1/6))
- `Mash v1.0+ <https://github.com/marbl/Mash/releases>`_ [optional]
- `Mash v2.0+ <https://github.com/marbl/Mash/releases>`_ [optional]

Python Dependencies
-------------------
@@ -167,7 +176,7 @@ If you run ``sistr -h``, you should see the following usage info:
Serovar predictions from whole-genome sequence assemblies by determination of antigen gene and cgMLST gene alleles using BLAST.

Note about using the "--use-full-cgmlst-db" flag:
The "centroid" allele database is ~10% the size of the full set so analysis is much quicker with the "centroid" vs "full" set of alleles. Results between 2 cgMLST allele sets should not differ.
The "centroid" allele database is ~10% the size of the full set so analysis is much quicker with the "centroid" vs "full" set of alleles. Results between 2 cgMLST allele sets should not differ.

If you find this program useful in your research, please cite as:

@@ -210,12 +219,16 @@ If you run ``sistr -h``, you should see the following usage info:
serovar prediction results.
-t THREADS, --threads THREADS
Number of parallel threads to run sistr_cmd analysis.
-l LIST_OF_SEROVARS, --list-of-serovars LIST_OF_SEROVARS
A path to a single column text file containing list of
serovar(s) to check serovar prediction against. Result
reported in the "predicted_serovar_in_list"
field as Y (present) or N (absent) value.
-v, --verbose Logging verbosity level (-v == show warnings; -vvv ==
show debug info)
-V, --version show program's version number and exit



Example Usage
-------------

@@ -279,32 +292,11 @@ Summary of output options:


Primary results output (``-o sistr-results``)
------------------------------------------

Tab-delimited results output (``-f tab``):

.. code-block:: tab

cgmlst_ST cgmlst_distance cgmlst_genome_match cgmlst_matching_alleles cgmlst_subspecies fasta_filepath genome h1 h2 o_antigen qc_messages qc_status serogroup serovar serovar_antigen serovar_cgmlst
660408169 0.00909090909091 LT2 327 enterica /home/peter/Downloads/sistr-LT2-example/LT2.fasta LT2 i 1,2 1,4,[5],12 PASS B Typhimurium Typhimurium Typhimurium

CSV results output (``-f csv``):

.. code-block:: csv

cgmlst_ST,cgmlst_distance,cgmlst_genome_match,cgmlst_matching_alleles,cgmlst_subspecies,fasta_filepath,genome,h1,h2,o_antigen,qc_messages,qc_status,serogroup,serovar,serovar_antigen,serovar_cgmlst
660408169,0.00909090909091,LT2,327,enterica,/home/peter/Downloads/sistr-LT2-example/LT2.fasta,LT2,i,"1,2","1,4,[5],12",,PASS,B,Typhimurium,Typhimurium,Typhimurium

How the results should look in a table:

.. csv-table::

cgmlst_ST,cgmlst_distance,cgmlst_genome_match,cgmlst_matching_alleles,cgmlst_subspecies,fasta_filepath,genome,h1,h2,o_antigen,qc_messages,qc_status,serogroup,serovar,serovar_antigen,serovar_cgmlst
660408169,0.00909090909091,LT2,327,enterica,/home/peter/Downloads/sistr-LT2-example/LT2.fasta,LT2,i,"1,2","1,4,[5],12",,PASS,B,Typhimurium,Typhimurium,Typhimurium


JSON results output:
---------------------------------------------
SISTR supports various text output formats specified by the ``-f`` option with ``json`` being the default.

JSON results output (``-f json``):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. code-block:: json

[
@@ -328,6 +320,32 @@ JSON results output:
}
]


Tab-delimited results output (``-f tab``):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: text

cgmlst_ST cgmlst_distance cgmlst_genome_match cgmlst_matching_alleles cgmlst_subspecies fasta_filepath genome h1 h2 o_antigen qc_messages qc_status serogroup serovar serovar_antigen serovar_cgmlst
660408169 0.00909090909091 LT2 327 enterica /home/peter/Downloads/sistr-LT2-example/LT2.fasta LT2 i 1,2 1,4,[5],12 PASS B Typhimurium Typhimurium Typhimurium

CSV results output (``-f csv``):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Raw ``csv`` output results opened in a text editor

.. code-block:: csv

cgmlst_ST,cgmlst_distance,cgmlst_genome_match,cgmlst_matching_alleles,cgmlst_subspecies,fasta_filepath,genome,h1,h2,o_antigen,qc_messages,qc_status,serogroup,serovar,serovar_antigen,serovar_cgmlst
660408169,0.00909090909091,LT2,327,enterica,/home/peter/Downloads/sistr-LT2-example/LT2.fasta,LT2,i,"1,2","1,4,[5],12",,PASS,B,Typhimurium,Typhimurium,Typhimurium

The same ``csv`` results rendered as a table

.. csv-table::

cgmlst_ST,cgmlst_distance,cgmlst_genome_match,cgmlst_matching_alleles,cgmlst_subspecies,fasta_filepath,genome,h1,h2,o_antigen,qc_messages,qc_status,serogroup,serovar,serovar_antigen,serovar_cgmlst
660408169,0.00909090909091,LT2,327,enterica,/home/peter/Downloads/sistr-LT2-example/LT2.fasta,LT2,i,"1,2","1,4,[5],12",,PASS,B,Typhimurium,Typhimurium,Typhimurium


cgMLST allele search results
-------------------------------------

@@ -337,7 +355,7 @@ These results may be useful for understanding unexpected or low confidence serov
Schema:
~~~~~~~

.. code-block:: json
.. code-block:: text

{
<genome name>: {
@@ -414,14 +432,15 @@ Schema:
"seq": string
}

}}
}
}

Example:
~~~~~~~~

Here's some truncated example allele search results output:
Here's some truncated example allele search results output in JSON format for ``LT2`` sample:

.. code-block:: json
.. code-block:: text

{
"LT2": {
@@ -472,7 +491,7 @@ cgMLST allelic profiles output (``--cgmlst-profiles cgmlst-profiles.csv``)
--------------------------------------------------------------------------

With the ``-p``/``--cgmlst-profiles`` commandline argument, you can output the 330 loci cgMLST allelic profiles for your input genomes (i.e. the allele designation for each cgMLST locus for each input genome).
You can use this information to construct phylogenetic trees from this data using a tool such as `Phyloviz Online <https://online.phyloviz.net/index>`_.
You can use this information to construct phylogenetic trees from this data using a tool such as `Phyloviz Online <https://online.phyloviz.net/index>`_ by uploading cgMLST profiles data.
This type of analysis may be useful to explore why unexpected serovar prediction results were generated (e.g. your genomes are genetically very different from each other).

Example truncated cgMLST profiles output:
@@ -485,13 +504,13 @@ Example truncated cgMLST profiles output:


QC by ``sistr_cmd`` (``--qc``)
-------------------
------------------------------

If you are running ``sistr_cmd`` with the ``--qc`` commandline argument, ``sistr_cmd`` will run some basic QC to determine the level of confidence in the serovar prediction.

The ``qc_status`` field should contain a value of ``PASS`` if your genome passes all QC checks, otherwise, it will be ``WARNING`` or ``FAIL`` if there are issues with your results and/or input genome sequence.

The ``qc_messages`` field will contain useful information about why you may have a low confidence serovar prediction result. The QC messages will be delimited by `` | ``.
The ``qc_messages`` field will contain useful information about why you may have a low confidence serovar prediction result. The QC messages will be delimited by `` | `` symbol.

For example, here are the QC messages for an unusually small *Salmonella* assembly where the predicted serovar was "-:-:-":

@@ -507,10 +526,10 @@ The QC messages produced by ``sistr_cmd`` should help you understand your serova

Galaxy workflows
================
The `galaxy <https://github.com/phac-nml/sistr_cmd/tree/master/galaxy>`_ folder contains Galaxy Project SISTR workflows that allow to process samples in large batches.
The `galaxy <./galaxy/>`_ folder contains Galaxy SISTR workflows that can be readily imported into an existing Galaxy server instance and allow processing WGS samples in large batches, starting from raw reads and finishing with serovar results.


- `Galaxy-Workflow-Assembly-Serotyping-withReport-for-SISTR_v1.1.1+galaxy1-recipe.ga <https://github.com/phac-nml/sistr_cmd/tree/master/galaxy/Galaxy-Workflow-Assembly-Serotyping-withReport-for-SISTR_v1.1.1+galaxy1-recipe.ga>`_
- `Galaxy-Workflow-Assembly-Serotyping-withReport-for-SISTR_v1.1.1+galaxy1-recipe.ga <./galaxy/Galaxy-Workflow-Assembly-Serotyping-withReport-for-SISTR_v1.1.1+galaxy1-recipe.ga>`_
+ Summary: Assembles genomes from raw reads, performs serotyping and generates overall report
+ Uses tool dependencies: ``sistr 1.1.1+galaxy1``, ``shovill 1.0.4+galaxy1`` and ``tp_cat 0.1.0``

2 changes: 1 addition & 1 deletion setup.py
@@ -47,7 +47,7 @@ def run(self):
'install': CustomInstallCommand
},
install_requires=[
'numpy>=1.11.1,<1.23.5',
'numpy>=1.11.1,<2',
'tables>=3.3.0,<4',
'pandas>=0.22.0,<3',
'pycurl>=7.43.0,<8',