
Commit 9f1d5d0

Merge pull request #54 from harvardinformatics/snakemake-plugin

Snakemake plugin

2 parents: ef8074a + 90472de

6 files changed: +77 −5 lines

docs/faq/index.md

Lines changed: 7 additions & 0 deletions
@@ -22,6 +22,7 @@
[Research Computing](https://www.rc.fas.harvard.edu/) manages the Cannon cluster, among other things, and provides advice and support on HPC and related hardware and software questions. The Informatics Group supports specific software and analysis needs, including providing support for core facility software via the Software Operations group, and providing training, consultation, and collaborative project work for bioinformatics needs through the Bioinformatics group.

You can contact Research Computing via their [contact page](https://www.rc.fas.harvard.edu/about/contact/) for any questions related to HPC hardware or software environments. You can contact FAS Informatics for questions related to bioinformatics support via our [contact page](../contact.md).
+
??? question "How can I know about future workshops?"

    ##### How can I know about future workshops?
@@ -76,6 +77,12 @@
    For all questions, you can use the [contact form](../contact.md). For possibly quicker answers, you can try our public Slack channel (FAS Bioinformatics Public). For hands-on help, come to our office hours in Northwest Labs B227 (see [Events](../events.md) for times).

+??? question "How can I run a Snakemake workflow on the Cannon cluster?"
+
+    ##### Snakemake on the Cannon cluster
+
+    We have developed a [Snakemake plugin for the Cannon cluster](https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/cannon.html), based on the [generic SLURM plugin](https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/slurm.html). See [the documentation](https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/cannon.html) for information on how to install and use it, and feel free to report [issues or questions on the GitHub repo](https://github.com/harvardinformatics/snakemake-executor-plugin-cannon).
+
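As a minimal sketch of that flow (the job count, workflow file, and config file names below are placeholders, not values from the plugin docs):

```bash
# Install the Cannon executor plugin into your Snakemake environment
mamba install bioconda::snakemake-executor-plugin-cannon

# Run a workflow with the Cannon executor; -j caps simultaneously submitted jobs
snakemake -j 10 -e cannon -s my_workflow.smk --configfile my-config.yml --dryrun
```

Dropping `--dryrun` submits the jobs for real.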
## Bauer Core Sequencing
docs/resources/Tutorials/add-outgroup-to-whole-genome-alignment-cactus.md

Lines changed: 14 additions & 1 deletion
@@ -65,6 +65,18 @@ If the help menu displays, you already have Singularity installed. If not, you w
mamba install conda-forge::singularity
```

+!!! tip "Cannon cluster Snakemake plugin"
+
+    If you are on the Harvard Cannon cluster, you can use our Cannon-specific plugin, [snakemake-executor-plugin-cannon](https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/cannon.html), instead of the generic snakemake-executor-plugin-slurm. It enables *automatic partition selection* based on the requested resources. Install it in your environment with:
+
+    ```bash
+    mamba install bioconda::snakemake-executor-plugin-cannon
+    ```
+
+    Then, when running the workflow, specify the Cannon executor with `-e cannon` instead of `-e slurm`.
+
+    If you are not on the Harvard Cannon cluster, stick with the generic SLURM plugin. You will just need to specify the partitions for each rule directly in the config file ([see below](#specifying-resources-for-each-rule)).
+
### Downloading the cactus-snakemake pipeline

The [pipeline](https://github.com/harvardinformatics/cactus-snakemake/) is currently available on GitHub. You can install it on the Harvard cluster or any computer that has `git` installed by navigating to the directory in which you want to download it and doing one of the following:
@@ -245,6 +257,7 @@ rule_resources:
* **Allocate the proper partitions based on `use_gpu`.** If you want to use the GPU version of cactus (*i.e.* you have set `use_gpu: True` in the config file), the partition for the rule **blast** must be GPU-enabled. If not, the pipeline will fail to run.
* The `blast: gpus:` option will be ignored if `use_gpu: False` is set.
* **mem is in MB** and **time is in minutes**.
+* **If using the [snakemake-executor-plugin-cannon](https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/cannon.html) on the Harvard Cannon cluster, you can leave the `partition:` fields blank and a partition will be selected automatically based on the other requested resources.**

You will have to determine the proper resource usage for your dataset. Generally, the larger the genomes, the more time and memory each job will need, and the more you will benefit from providing more CPUs and GPUs.
@@ -270,7 +283,7 @@ snakemake -j <# of jobs to submit simultaneously> -e slurm -s </path/to/cactus_a
| ------------------------------------------------- | ----------- |
| `snakemake` | The call to the snakemake workflow program to execute the workflow. |
| `-j <# of jobs to submit simultaneously>` | The maximum number of jobs that will be submitted to your SLURM cluster at one time. |
-| `-e slurm` | Specify to use the SLURM executor plugin. See: [Getting started](#getting-started). |
+| `-e slurm` | Specify the SLURM executor plugin; use `-e cannon` instead if using the Cannon-specific plugin. See: [Getting started](#getting-started). |
| `-s </path/to/cactus_add_outgroup.smk>` | The path to the workflow file. |
| `--configfile <path/to/your/snakemake-config.yml>` | The path to your config file. See: [Preparing the Snakemake config file](#preparing-the-snakemake-config-file). |
| `--dryrun` | Do not execute anything, just display what would be done. |
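Putting the table together, a hypothetical dry run on the Cannon cluster might look like this (the job count and file paths are placeholders; use `-e slurm` instead if you are not using the Cannon plugin):

```bash
# Preview the jobs that would be submitted, without running anything
snakemake -j 25 -e cannon -s cactus_add_outgroup.smk --configfile snakemake-config.yml --dryrun
```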

docs/resources/Tutorials/add-to-whole-genome-alignment-cactus.md

Lines changed: 14 additions & 1 deletion
@@ -65,6 +65,18 @@ If the help menu displays, you already have Singularity installed. If not, you w
mamba install conda-forge::singularity
```

+!!! tip "Cannon cluster Snakemake plugin"
+
+    If you are on the Harvard Cannon cluster, you can use our Cannon-specific plugin, [snakemake-executor-plugin-cannon](https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/cannon.html), instead of the generic snakemake-executor-plugin-slurm. It enables *automatic partition selection* based on the requested resources. Install it in your environment with:
+
+    ```bash
+    mamba install bioconda::snakemake-executor-plugin-cannon
+    ```
+
+    Then, when running the workflow, specify the Cannon executor with `-e cannon` instead of `-e slurm`.
+
+    If you are not on the Harvard Cannon cluster, stick with the generic SLURM plugin. You will just need to specify the partitions for each rule directly in the config file ([see below](#specifying-resources-for-each-rule)).
+
### Downloading the cactus-snakemake pipeline

The [pipeline](https://github.com/harvardinformatics/cactus-snakemake/) is currently available on GitHub. You can install it on the Harvard cluster or any computer that has `git` installed by navigating to the directory in which you want to download it and doing one of the following:
@@ -257,6 +269,7 @@ rule_resources:
* **Allocate the proper partitions based on `use_gpu`.** If you want to use the GPU version of cactus (*i.e.* you have set `use_gpu: True` in the config file), the partition for the rule **blast** must be GPU-enabled. If not, the pipeline will fail to run.
* The `blast: gpus:` option will be ignored if `use_gpu: False` is set.
* **mem is in MB** and **time is in minutes**.
+* **If using the [snakemake-executor-plugin-cannon](https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/cannon.html) on the Harvard Cannon cluster, you can leave the `partition:` fields blank and a partition will be selected automatically based on the other requested resources.**

You will have to determine the proper resource usage for your dataset. Generally, the larger the genomes, the more time and memory each job will need, and the more you will benefit from providing more CPUs and GPUs.
@@ -282,7 +295,7 @@ snakemake -j <# of jobs to submit simultaneously> -e slurm -s </path/to/cactus_u
| ------------------------------------------------- | ----------- |
| `snakemake` | The call to the snakemake workflow program to execute the workflow. |
| `-j <# of jobs to submit simultaneously>` | The maximum number of jobs that will be submitted to your SLURM cluster at one time. |
-| `-e slurm` | Specify to use the SLURM executor plugin. See: [Getting started](#getting-started). |
+| `-e slurm` | Specify the SLURM executor plugin; use `-e cannon` instead if using the Cannon-specific plugin. See: [Getting started](#getting-started). |
| `-s </path/to/cactus_update.smk>` | The path to the workflow file. |
| `--configfile <path/to/your/snakemake-config.yml>` | The path to your config file. See: [Preparing the Snakemake config file](#preparing-the-snakemake-config-file). |
| `--dryrun` | Do not execute anything, just display what would be done. |
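For example, with placeholder values for the job count and config path, the two executor choices described above would look like:

```bash
# On the Harvard Cannon cluster, with the Cannon plugin installed
snakemake -j 20 -e cannon -s cactus_update.smk --configfile snakemake-config.yml --dryrun

# On any other SLURM cluster, with partitions set per rule in the config
snakemake -j 20 -e slurm -s cactus_update.smk --configfile snakemake-config.yml --dryrun
```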

docs/resources/Tutorials/pangenome-cactus-minigraph.md

Lines changed: 14 additions & 1 deletion
@@ -58,6 +58,18 @@ If the help menu displays, you already have Singularity installed. If not, you w
mamba install conda-forge::singularity
```

+!!! tip "Cannon cluster Snakemake plugin"
+
+    If you are on the Harvard Cannon cluster, you can use our Cannon-specific plugin, [snakemake-executor-plugin-cannon](https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/cannon.html), instead of the generic snakemake-executor-plugin-slurm. It enables *automatic partition selection* based on the requested resources. Install it in your environment with:
+
+    ```bash
+    mamba install bioconda::snakemake-executor-plugin-cannon
+    ```
+
+    Then, when running the workflow, specify the Cannon executor with `-e cannon` instead of `-e slurm`.
+
+    If you are not on the Harvard Cannon cluster, stick with the generic SLURM plugin. You will just need to specify the partitions for each rule directly in the config file ([see below](#specifying-resources-for-each-rule)).
+
### Downloading the cactus-snakemake pipeline

The [pipeline](https://github.com/harvardinformatics/cactus-snakemake/) is currently available on GitHub. You can install it on the Harvard cluster or any computer that has `git` installed by navigating to the directory in which you want to download it and doing one of the following:
@@ -175,6 +187,7 @@ rule_resources:
* Be sure to use partition names appropriate for your cluster. Several examples in this tutorial have partition names that are specific to the Harvard cluster, so be sure to change them.
* The steps in the cactus-minigraph pipeline are not GPU compatible, so there are no GPU options in this pipeline.
* **mem_mb is in MB** and **time is in minutes**.
+* **If using the [snakemake-executor-plugin-cannon](https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/cannon.html) on the Harvard Cannon cluster, you can leave the `partition:` fields blank and a partition will be selected automatically based on the other requested resources.**

You will have to determine the proper resource usage for your dataset. Generally, the larger the genomes, the more time and memory each job will need, and the more you will benefit from providing more CPUs.
@@ -198,7 +211,7 @@ snakemake -j <# of jobs to submit simultaneously> -e slurm -s </path/to/cactus_m
| ------------------------------------------------- | ----------- |
| `snakemake` | The call to the snakemake workflow program to execute the workflow. |
| `-j <# of jobs to submit simultaneously>` | The maximum number of jobs that will be submitted to your SLURM cluster at one time. |
-| `-e slurm` | Specify to use the SLURM executor plugin. See: [Getting started](#getting-started). |
+| `-e slurm` | Specify the SLURM executor plugin; use `-e cannon` instead if using the Cannon-specific plugin. See: [Getting started](#getting-started). |
| `-s </path/to/cactus_minigraph.smk>` | The path to the workflow file. |
| `--configfile <path/to/your/snakemake-config.yml>` | The path to your config file. See: [Preparing the Snakemake config file](#preparing-the-snakemake-config-file). |
| `--dryrun` | Do not execute anything, just display what would be done. |
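For instance (the job count and paths below are placeholders), a dry run of the pangenome pipeline with the Cannon plugin would be:

```bash
# No GPU flags here: the cactus-minigraph steps are not GPU compatible
snakemake -j 15 -e cannon -s cactus_minigraph.smk --configfile snakemake-config.yml --dryrun
```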

docs/resources/Tutorials/replace-genome-whole-genome-alignment-cactus.md

Lines changed: 14 additions & 1 deletion
@@ -65,6 +65,18 @@ If the help menu displays, you already have Singularity installed. If not, you w
mamba install conda-forge::singularity
```

+!!! tip "Cannon cluster Snakemake plugin"
+
+    If you are on the Harvard Cannon cluster, you can use our Cannon-specific plugin, [snakemake-executor-plugin-cannon](https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/cannon.html), instead of the generic snakemake-executor-plugin-slurm. It enables *automatic partition selection* based on the requested resources. Install it in your environment with:
+
+    ```bash
+    mamba install bioconda::snakemake-executor-plugin-cannon
+    ```
+
+    Then, when running the workflow, specify the Cannon executor with `-e cannon` instead of `-e slurm`.
+
+    If you are not on the Harvard Cannon cluster, stick with the generic SLURM plugin. You will just need to specify the partitions for each rule directly in the config file ([see below](#specifying-resources-for-each-rule)).
+
### Downloading the cactus-snakemake pipeline

The [pipeline](https://github.com/harvardinformatics/cactus-snakemake/) is currently available on GitHub. You can install it on the Harvard cluster or any computer that has `git` installed by navigating to the directory in which you want to download it and doing one of the following:
@@ -226,6 +238,7 @@ rule_resources:
* **Allocate the proper partitions based on `use_gpu`.** If you want to use the GPU version of cactus (*i.e.* you have set `use_gpu: True` in the config file), the partition for the rule **blast** must be GPU-enabled. If not, the pipeline will fail to run.
* The `blast: gpus:` option will be ignored if `use_gpu: False` is set.
* **mem is in MB** and **time is in minutes**.
+* **If using the [snakemake-executor-plugin-cannon](https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/cannon.html) on the Harvard Cannon cluster, you can leave the `partition:` fields blank and a partition will be selected automatically based on the other requested resources.**

You will have to determine the proper resource usage for your dataset. Generally, the larger the genomes, the more time and memory each job will need, and the more you will benefit from providing more CPUs and GPUs.
@@ -251,7 +264,7 @@ snakemake -j <# of jobs to submit simultaneously> -e slurm -s </path/to/cactus_r
| ------------------------------------------------- | ----------- |
| `snakemake` | The call to the snakemake workflow program to execute the workflow. |
| `-j <# of jobs to submit simultaneously>` | The maximum number of jobs that will be submitted to your SLURM cluster at one time. |
-| `-e slurm` | Specify to use the SLURM executor plugin. See: [Getting started](#getting-started). |
+| `-e slurm` | Specify the SLURM executor plugin; use `-e cannon` instead if using the Cannon-specific plugin. See: [Getting started](#getting-started). |
| `-s </path/to/cactus_update.smk>` | The path to the workflow file. |
| `--configfile <path/to/your/snakemake-config.yml>` | The path to your config file. See: [Preparing the Snakemake config file](#preparing-the-snakemake-config-file). |
| `--dryrun` | Do not execute anything, just display what would be done. |
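A hypothetical invocation with placeholder values, assuming the Cannon plugin from the tip above is installed:

```bash
# Preview the submission plan before running anything
snakemake -j 20 -e cannon -s cactus_update.smk --configfile snakemake-config.yml --dryrun
```

Rerun without `--dryrun` once the plan looks right.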

docs/resources/Tutorials/whole-genome-alignment-cactus.md

Lines changed: 14 additions & 1 deletion
@@ -57,6 +57,18 @@ If the help menu displays, you already have Singularity installed. If not, you w
mamba install conda-forge::singularity
```

+!!! tip "Cannon cluster Snakemake plugin"
+
+    If you are on the Harvard Cannon cluster, you can use our Cannon-specific plugin, [snakemake-executor-plugin-cannon](https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/cannon.html), instead of the generic snakemake-executor-plugin-slurm. It enables *automatic partition selection* based on the requested resources. Install it in your environment with:
+
+    ```bash
+    mamba install bioconda::snakemake-executor-plugin-cannon
+    ```
+
+    Then, when running the workflow, specify the Cannon executor with `-e cannon` instead of `-e slurm`.
+
+    If you are not on the Harvard Cannon cluster, stick with the generic SLURM plugin. You will just need to specify the partitions for each rule directly in the config file ([see below](#specifying-resources-for-each-rule)).
+
### Downloading the cactus-snakemake pipeline

The [pipeline](https://github.com/harvardinformatics/cactus-snakemake/) is currently available on GitHub. You can install it on the Harvard cluster or any computer that has `git` installed by navigating to the directory in which you want to download it and doing one of the following:
@@ -194,6 +206,7 @@ rule_resources:
* **Allocate the proper partitions based on `use_gpu`.** If you want to use the GPU version of cactus (*i.e.* you have set `use_gpu: True` in the config file), the partition for the rule **blast** must be GPU-enabled. If not, the pipeline will fail to run.
* The `blast: gpus:` option will be ignored if `use_gpu: False` is set.
* **mem_mb is in MB** and **time is in minutes**.
+* **If using the [snakemake-executor-plugin-cannon](https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/cannon.html) on the Harvard Cannon cluster, you can leave the `partition:` fields blank and a partition will be selected automatically based on the other requested resources.**

You will have to determine the proper resource usage for your dataset. Generally, the larger the genomes, the more time and memory each job will need, and the more you will benefit from providing more CPUs and GPUs.
@@ -257,7 +270,7 @@ snakemake -j <# of jobs to submit simultaneously> -e slurm -s </path/to/cactus.s
| ------------------------------------------------- | ----------- |
| `snakemake` | The call to the snakemake workflow program to execute the workflow. |
| `-j <# of jobs to submit simultaneously>` | The maximum number of jobs that will be submitted to your SLURM cluster at one time. |
-| `-e slurm` | Specify to use the SLURM executor plugin. See: [Getting started](#getting-started). |
+| `-e slurm` | Specify the SLURM executor plugin; use `-e cannon` instead if using the Cannon-specific plugin. See: [Getting started](#getting-started). |
| `-s </path/to/cactus.smk>` | The path to the workflow file. |
| `--configfile <path/to/your/snakemake-config.yml>` | The path to your config file. See: [Preparing the Snakemake config file](#preparing-the-snakemake-config-file). |
| `--dryrun` | Do not execute anything, just display what would be done. |
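For example (placeholder job count and config path; substitute `-e slurm` on non-Cannon clusters):

```bash
# Dry run first, then rerun without --dryrun to actually submit the jobs
snakemake -j 25 -e cannon -s cactus.smk --configfile snakemake-config.yml --dryrun
```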
