Add merge HOW TO
jfy133 committed Mar 28, 2024
1 parent d535f0f commit 77c2919
Showing 2 changed files with 55 additions and 22 deletions.
44 changes: 22 additions & 22 deletions docs/source/how_to/convert.md
@@ -37,7 +37,7 @@ AMDirT convert --libraries ancientmetagenome-hostassociated_libraries_warinnerli

See [Output](#output) for descriptions of all output files.

> ⚠️ _When using a **pipeline input samplesheet**, you should always double-check that the sheet is correctly configured. We cannot guarantee accuracy between metadata and sequencing files._

Once you have validated it, you can directly supply it to the appropriate pipeline as follows (using nf-core/eager as an example):

@@ -51,33 +51,33 @@ The **citations BibTex** file contains all the citation information of your sele
## Output

> ⚠️ _We highly recommend generating and reviewing `AncientMetagenomeDir_filtered_libraries.tsv` **before** downloading data or running any pipelines. This ensures the download scripts and/or pipeline input sheets contain only the library types you actually wish to use (e.g. you may only want paired-end data, or non-UDG treated data)._

> ⚠️ _Before using a **pipeline input samplesheet**, you should always double-check that the sheet is correctly configured. We cannot guarantee accuracy between metadata and sequencing files._

All possible output is as follows:

- `<outdir>`: where all the pipeline samplesheets are placed (by default `.`)
- `AncientMetagenomeDir_bibliography.bib`:
  - A BibTeX format citation information file with all references (where available) present in the filtered sample table.
- `AncientMetagenomeDir_filtered_libraries.tsv`:
  - The associated AncientMetagenomeDir curated metadata for all _libraries_ of the samples in the input table.
- `AncientMetagenomeDir_curl_download_script.sh`:
  - A bash script containing curl commands for all libraries in the input samples list.
- `AncientMetagenomeDir_aspera_download_script.sh`:
  - A bash script containing Aspera commands for all libraries in the input samples list. See [How Tos](/how_to/miscellaneous) for Aspera configuration information.
- `AncientMetagenomeDir_nf_core_fetchngs_input_table.tsv`:
  - An input sheet containing ERS/SRS accession numbers in a format compatible with the [nf-core/fetchngs](https://nf-co.re/fetchngs) input samplesheet.
- `AncientMetagenomeDir_nf_core_eager_input_table.tsv`:
  - An input sheet with metadata in a format compatible with the [nf-core/eager](https://nf-co.re/eager) input samplesheet.
  - Contained paths are relative to the directory output when using the `curl` and `aspera` download scripts (i.e., input sheet assumes files are in the same directory as the input sheet itself).
- `AncientMetagenomeDir_nf_core_taxprofiler_input_table.csv`:
  - An input sheet with metadata in a format compatible with the [nf-core/taxprofiler](https://nf-co.re/taxprofiler) input samplesheet.
  - Contained paths are relative to the directory output when using the `curl` and `aspera` download scripts (i.e., input sheet assumes files are in the same directory as the input sheet itself).
- `AncientMetagenomeDir_aMeta_input_table.tsv`:
  - An input sheet with metadata in a format compatible with the [aMeta](https://github.com/NBISweden/aMeta) input samplesheet.
  - Contained paths are relative to the directory output when using the `curl` and `aspera` download scripts (i.e., input sheet assumes files are in the same directory as the input sheet itself).
- `AncientMetagenomeDir_nf_core_mag_input_{single,paired}_table.csv`:
  - An input sheet with metadata in a format compatible with the [nf-core/mag](https://nf-co.re/mag) input samplesheet.
  - Contained paths are relative to the directory output when using the `curl` and `aspera` download scripts (i.e., input sheet assumes files are in the same directory as the input sheet itself).
  - nf-core/mag does not support paired- and single-end data in the same run, therefore two sheets will be generated if your selected samples contain both types of libraries.
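
One way to act on the recommendation to review `AncientMetagenomeDir_filtered_libraries.tsv` before downloading is to tally the values of a column of interest. The following is a minimal sketch, not part of AMDirT: the helper name `count_column_values` is hypothetical, and the column name `library_layout` in the usage comment is an assumption about the libraries table header that you should check against your own file.

```bash
# Hypothetical helper (not an AMDirT command): tally the values found in one
# named column of a tab-separated file. The column is looked up by name in the
# header row, so its position in the table does not matter.
#   $1: path to the TSV file
#   $2: column name to summarise
count_column_values() {
  awk -F'\t' -v col="$2" '
    NR == 1 {
      # Find the index of the requested column in the header row
      for (i = 1; i <= NF; i++) if ($i == col) idx = i
      if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
      next
    }
    { counts[$idx]++ }
    END { for (value in counts) print counts[value] "\t" value }
  ' "$1" | sort -k2
}

# Example call (assumes a column named library_layout exists in the table):
#   count_column_values AncientMetagenomeDir_filtered_libraries.tsv library_layout
```

If the tally shows library types you do not want, you can filter the input table and re-run `convert` before using any of the download scripts or samplesheets.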
33 changes: 33 additions & 0 deletions docs/source/how_to/merge.md
@@ -0,0 +1,33 @@
# merge

## What

Merges a user-supplied metadata table with the latest AncientMetagenomeDir master metadata tables, with on-the-fly [validation](validation.md).

## When

Use this command when you have a local AncientMetagenomeDir table (samples or libraries) containing only the new samples or libraries to add, and want to append it to the current master table before submitting a pull request.

You would typically only do this when preparing a pull request to the AncientMetagenomeDir repository entirely locally.

## How

The following description assumes you have already prepared an AncientMetagenomeDir **samples** or **libraries** table that contains only the header and the new rows to be added.

> ⚠️ _The header and the columns present should exactly match those of the corresponding AncientMetagenomeDir table._

Given a new samples table `samples_for_new_pr.tsv` to be added to the single genome samples table `ancientsinglegenome-hostassociated`, you can run the following command:

```bash
AMDirT merge -n ancientsinglegenome-hostassociated -t samples samples_for_new_pr.tsv
```

Note that `merge` will also perform schema validation to ensure the contents of the new rows are valid against the AncientMetagenomeDir schema.
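
Because the header of the new table must match the corresponding AncientMetagenomeDir table exactly, a quick check before running `merge` can save a failed run. The following is a minimal sketch under assumptions: `headers_match` is a hypothetical helper (not an AMDirT command), and it assumes you have a local copy of the relevant master table to compare against.

```bash
# Hypothetical helper (not an AMDirT command): succeed only if the first line
# (header row) of two tab-separated files is identical.
#   $1: path to the new table
#   $2: path to a local copy of the master table
headers_match() {
  new_header=$(head -n 1 "$1")
  master_header=$(head -n 1 "$2")
  if [ "$new_header" = "$master_header" ]; then
    echo "headers match"
  else
    echo "headers differ" >&2
    return 1
  fi
}

# Example call (the second path is an assumed local copy of the master table):
#   headers_match samples_for_new_pr.tsv ancientsinglegenome-hostassociated_samples.tsv
```

This only compares header rows. It does not replace the schema validation that `merge` performs on the contents of the new rows.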

## Output

The `merge` command outputs a new table containing the merged rows. The file is named after the table you merged the new rows into, and is placed by default in the directory you ran the command from (customisable with `-o`).

In the example above, the resulting file would be: `ancientsinglegenome-hostassociated_samples.tsv`.

The contents of this file can then, in principle, be used to submit a pull request to the AncientMetagenomeDir repository.
