Merge pull request #17 from phac-nml/release-0.1.0
Release 0.1.0
apetkau authored Aug 20, 2024
2 parents a782799 + c20deab commit f2250fa
Showing 3 changed files with 19 additions and 5 deletions.
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -3,7 +3,7 @@
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/)
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

-## [0.1.0]
+## [0.1.0] - 2024-08-19

Initial release of the arboratornf pipeline to be used for running [Arborator](https://github.com/phac-nml/arborator) under Nextflow.

20 changes: 17 additions & 3 deletions README.md
@@ -21,8 +21,8 @@ The columns of the samplesheet are defined as follows:

The names of each metadata column (metadata_partition, and metadata_1..metadata_8) are provided using the following parameters:

-- metadata_partition_name: The name of the metadata_partition column (for example: "outbreak").
-- metadata_1_header..metadata_8_header: The name of each individual metadata column (for example: "organism" or "source").
+- `--metadata_partition_name`: The name of the metadata_partition column (for example: "outbreak").
+- `--metadata_1_header..metadata_8_header`: The name of each individual metadata column (for example: "organism" or "source").

Entries in the `metadata_partition` column in the sample sheet, as well as the name provided by the `metadata_partition_name` parameter, must contain only alphanumeric, `_`, `.`, and `-` characters.
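
The character restriction above can be checked before a samplesheet is submitted. The following is a minimal sketch only, not the pipeline's own validation: the function name `is_valid_partition_value` is hypothetical, and it assumes "alphanumeric" means ASCII letters and digits.

```python
import re

# Assumption: "alphanumeric" is read as ASCII letters and digits only.
# The pipeline's actual schema may define this differently.
ALLOWED_CHARACTERS = re.compile(r"[A-Za-z0-9_.-]+")


def is_valid_partition_value(value: str) -> bool:
    """Return True if value is non-empty and uses only the allowed characters.

    Hypothetical helper for illustration; it is not part of arboratornf.
    """
    return ALLOWED_CHARACTERS.fullmatch(value) is not None
```

An empty string is rejected by this check because the pattern requires at least one character.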

@@ -36,6 +36,15 @@ Furthermore, the structure of the sample sheet is programmatically defined in [a

The mandatory parameters are `--input`, which specifies the samplesheet as described above, and `--output`, which specifies the output results directory. You may wish to provide `-profile singularity` to specify the use of singularity containers and `-r [branch]` to specify which GitHub branch you would like to run. Metadata-related parameters are described above in [Input](#input).

+## Optional
+
+The optional parameters are as follows:
+
+### Arborator
+
+- `--ar_config`: The Arborator-specific config file for specifying the operations used when summarizing metadata and how such metadata should be displayed in the output.
+- `--ar_thresholds`: The clustering thresholds used by Arborator. These thresholds must be provided as a list of integers.
+
Further parameters (defaults from nf-core) are defined in [nextflow_schema.json](nextflow_schema.json).

# Running
@@ -48,6 +57,8 @@ nextflow run phac-nml/arboratornf -profile singularity -r main -latest --input a

Where the `samplesheet.csv` is structured as specified in the [Input](#input) section and `parameters.yaml` provides parameters for renaming metadata column headers, which may either be specified individually on the command line or collectively in a parameters file.

+Additional details on usage of the pipeline are found in [docs/usage.md](docs/usage.md).
+
# Output

A JSON-formatted file for loading metadata into IRIDA Next is output by this pipeline. The format of this JSON-formatted file is specified in our [Pipeline Standards for the IRIDA Next JSON](https://github.com/phac-nml/pipeline-standards#32-irida-next-json). This JSON-formatted file is written directly within the `--outdir` provided to the pipeline with the name `iridanext.output.json.gz` (ex: `[outdir]/iridanext.output.json.gz`).
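
Because the file is gzip-compressed JSON, it can be inspected with standard tooling. A minimal sketch, assuming only the file name and location stated above; the function name `load_iridanext_json` is hypothetical and is not provided by the pipeline:

```python
import gzip
import json
from pathlib import Path


def load_iridanext_json(outdir: str) -> dict:
    """Load [outdir]/iridanext.output.json.gz and return the parsed JSON.

    Hypothetical helper for illustration; the file name comes from the README.
    """
    path = Path(outdir) / "iridanext.output.json.gz"
    # "rt" opens the gzip stream in text mode so json.load can read it directly.
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return json.load(handle)
```

The returned dictionary can then be explored interactively, for example by listing its top-level keys.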
@@ -66,7 +77,8 @@ An example of the what the contents of the IRIDA Next JSON-formatted file looks
},
{
"path": "arborator/cluster_summary.tsv"
-}
+},
+// ...
],
"samples": {
@@ -84,6 +96,8 @@ Within the `files` section of this JSON-formatted file, all of the output paths

The `arborator/metadata.included.tsv` and `arborator/metadata.excluded.tsv` output files summarize which samples were analyzed and which were not. Samples that contain missing data for the `metadata_partition` column will not be included in analysis and will be reported in the `arborator/metadata.excluded.tsv` output file.
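
The included/excluded split described above can be illustrated with a short sketch. This is not the pipeline's implementation: the function `split_by_partition` is hypothetical, and it assumes that "missing" means an absent, empty, or whitespace-only value.

```python
def split_by_partition(rows, partition_key="metadata_partition"):
    """Split sample rows into (included, excluded) by the partition column.

    Assumption: a value counts as missing when it is None, empty, or
    whitespace-only. The pipeline may define missing data differently.
    """
    included, excluded = [], []
    for row in rows:
        value = row.get(partition_key)
        if value is None or str(value).strip() == "":
            excluded.append(row)
        else:
            included.append(row)
    return included, excluded
```

Under these assumptions, a row with an empty `metadata_partition` value lands in the excluded list, mirroring what the README says about `arborator/metadata.excluded.tsv`.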

+Additional details on output files are found in [docs/output.md](docs/output.md).
+
## Test profile

To run with the test profile, please do:
2 changes: 1 addition & 1 deletion tests/pipelines/main.nf.test
@@ -1,5 +1,5 @@
nextflow_pipeline {
-name "Integration Tests for Cluster Splitting"
+name "Integration Tests for Arborator pipeline"
script "main.nf"

test("Small-scale test of full pipeline"){
