Commit 06a1555

docs: added paragraphs about dynamic resource allocation (#79)
1 parent c9bc75c commit 06a1555

1 file changed, +52 -1 lines changed

docs/further.md

Lines changed: 52 additions & 1 deletion
@@ -221,6 +221,57 @@ export SNAKEMAKE_PROFILE="$HOME/.config/snakemake"

Note also that development is ongoing to enable differentiation of file access patterns.

## Retries - Or Trying Again When a Job Failed

Some cluster jobs may fail. In this case, Snakemake can be instructed to resubmit a failed job before the entire workflow fails, in this example up to 3 times:

```console
snakemake --retries=3
```

If a workflow fails entirely (e.g. when there are cluster failures), it can be resumed like any other Snakemake workflow:

```console
snakemake --rerun-incomplete
```

To prevent failures due to faulty parameterization, we can adjust the runtime behaviour dynamically.

## Dynamic Parameterization

Using dynamic parameterization, we can react to different inputs and prevent our HPC jobs from failing.

### Adjusting Memory Requirements

Input file sizes may vary. [If we have an estimate of how the RAM requirement depends on the input file size, we can use it to adjust the resources of our jobs dynamically.](https://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#dynamic-resources)
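
A minimal sketch of such a memory estimate, written as a plain Python function so that it can be tried outside of a workflow. The function name `get_mem_mb`, the factor of 2 and the 1000 MB floor are hypothetical values chosen for illustration, not recommendations. The sketch assumes that Snakemake passes an `input` object exposing `size_mb` and the current `attempt` number, as described in the linked documentation; `SimpleNamespace` only stands in for that object here.

```python
from types import SimpleNamespace


def get_mem_mb(wildcards, input, attempt):
    """Estimate memory in MB from the input size, scaled up on every retry."""
    # Hypothetical rule of thumb: twice the input size, but at least 1000 MB.
    base = max(2 * input.size_mb, 1000)
    return int(base * attempt)


# Stand-in for the input object Snakemake would pass (assumption for this demo).
fake_input = SimpleNamespace(size_mb=1500)

for attempt in (1, 2, 3):
    print(f"attempt {attempt}: mem_mb={get_mem_mb(None, fake_input, attempt)}")
# attempt 1: mem_mb=3000
# attempt 2: mem_mb=6000
# attempt 3: mem_mb=9000
```

In a rule, such a function would be referenced under `resources:` as `mem_mb=get_mem_mb`.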

### Adjusting Runtime

Runtime adjustments can be made in a Snakefile:

```Python
def get_time(wildcards, attempt):
    # Request one additional hour of runtime with every attempt.
    return f"{1 * attempt}h"

rule foo:
    input: ...
    output: ...
    resources:
        runtime=get_time
    ...
```
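
To see what `get_time` requests over successive attempts, the function can be called on its own. This is only an illustrative sketch: it assumes that Snakemake starts counting `attempt` at 1 and increments it on each retry, so that `--retries=3` allows up to four attempts in total. The `retries` variable is a stand-in for the command line setting.

```python
def get_time(wildcards, attempt):
    # Same function as in the Snakefile: one hour per attempt.
    return f"{1 * attempt}h"


retries = 3  # stand-in for `snakemake --retries=3`

# One initial attempt plus up to `retries` resubmissions.
for attempt in range(1, retries + 2):
    print(f"attempt {attempt}: runtime={get_time(None, attempt)}")
# attempt 1: runtime=1h
# attempt 2: runtime=2h
# attempt 3: runtime=3h
# attempt 4: runtime=4h
```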

or in a workflow profile:

```YAML
set-resources:
    foo:
        runtime: f"{1 * attempt}h"
```

Be sure to use sensible settings for your cluster and make use of parallel execution (e.g. threads) and [global profiles](#using-profiles) to avoid I/O contention.


## Summary:

When put together, a frequent command line looks like:
@@ -231,4 +282,4 @@ $ snakemake --workflow-profile <path> \
> --default-resources slurm_account=<account> slurm_partition=<default partition> \
> --configfile config/config.yaml \
> --directory <path> # assuming a data path not relative to the workflow
```
