
Commit fa7877f

fix: #97 preventing node confinement (#98)
fixes the bug described in #97
1 parent 2e5d308 commit fa7877f

File tree: 2 files changed, +18 -8 lines changed

docs/further.md

Lines changed: 5 additions & 4 deletions
@@ -2,7 +2,7 @@
 
 ## The general Idea
 
-To use this plugin, log in to your cluster's head node (sometimes called the "login" node), activate your environment as usual and start Snakemake. Snakemake will then submit your jobs as cluster jobs.
+To use this plugin, log in to your cluster's head node (sometimes called the "login" node), activate your environment as usual, and start Snakemake. Snakemake will then submit your jobs as cluster jobs.
 
 ## Specifying Account and Partition
 
@@ -86,6 +86,8 @@ other systems, e.g. by replacing `srun` with `mpiexec`:
 $ snakemake --set-resources calc_pi:mpi="mpiexec" ...
 ```
 
+To submit "ordinary" MPI jobs, submitting with `tasks` (the MPI ranks) is sufficient. Alternatively, on some clusters, it might be convenient to just configure `nodes`. Consider using a combination of `tasks` and `cpus_per_task` for hybrid applications (those that use ranks (multiprocessing) and threads). A detailed topology layout can be achieved using the `slurm_extra` parameter (see below) using further flags like `--distribution`.
+
 ## Running Jobs locally
 
 Not all Snakemake workflows are adapted for heterogeneous environments, particularly clusters. Users might want to avoid the submission of _all_ rules as cluster jobs. Non-cluster jobs should usually include _short_ jobs, e.g. internet downloads or plotting rules.
@@ -158,8 +160,7 @@ set-resources:
 ## Additional Custom Job Configuration
 
 SLURM installations can support custom plugins, which may add support
-for additional flags to `sbatch`. In addition, there are various
-`sbatch` options not directly supported via the resource definitions
+for additional flags to `sbatch`. In addition, there are various batch options not directly supported via the resource definitions
 shown above. You may use the `slurm_extra` resource to specify
 additional flags to `sbatch`:
 
@@ -210,7 +211,7 @@ shared-fs-usage:
 local-storage-prefix: "<your node local storage prefix>"
 ```
 
-It will set the executor to be this SLURM executor, ensure sufficient file system latency and allow automatic stage-in of files using the [file system storage plugin](https://github.com/snakemake/snakemake-storage-plugin-fs).
+It will set the executor to be this SLURM executor, ensure sufficient file system latency, and allow automatic stage-in of files using the [file system storage plugin](https://github.com/snakemake/snakemake-storage-plugin-fs).
 
 Note, that you need to set the `SNAKEMAKE_PROFILE` environment variable in your `~/.bashrc` file, e.g.:
 
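The paragraph added above describes combining `tasks`, `cpus_per_task`, and `slurm_extra` for hybrid MPI jobs. As a minimal sketch (not part of this commit), such a rule could be declared roughly as follows, reusing the `calc_pi` example from the docs; the `calc-pi-mpi` binary, the rank/thread counts, and the `--distribution` value are illustrative assumptions:

```python
# Snakefile sketch: a hybrid MPI rule with 16 ranks and 4 threads per rank.
# Binary name, counts, and the --distribution layout are assumptions,
# not taken from this repository.
rule calc_pi:
    output:
        "pi.calc"
    resources:
        mpi="srun",              # launcher prefix used by the shell command
        tasks=16,                # MPI ranks, mapped to sbatch --ntasks
        cpus_per_task=4,         # threads per rank, mapped to --cpus-per-task
        slurm_extra="'--distribution=block:cyclic'"  # extra sbatch flags
    shell:
        "{resources.mpi} -n {resources.tasks} calc-pi-mpi > {output}"
```

With this executor, `tasks` becomes `--ntasks`, `cpus_per_task` becomes `--cpus-per-task`, and the content of `slurm_extra` is appended to the `sbatch` call as additional flags.
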
snakemake_executor_plugin_slurm/__init__.py

Lines changed: 13 additions & 4 deletions
@@ -123,14 +123,23 @@ def run_job(self, job: JobExecutorInterface):
                 "- submitting without. This might or might not work on your cluster."
             )
 
-        # MPI job
-        if job.resources.get("mpi", False):
-            if job.resources.get("nodes", False):
-                call += f" --nodes={job.resources.get('nodes', 1)}"
+        if job.resources.get("nodes", False):
+            call += f" --nodes={job.resources.get('nodes', 1)}"
 
         # fixes #40 - set ntasks regarlless of mpi, because
         # SLURM v22.05 will require it for all jobs
         call += f" --ntasks={job.resources.get('tasks', 1)}"
+        # MPI job
+        if job.resources.get("mpi", False):
+            if not job.resources.get("tasks_per_node") and not job.resources.get(
+                "nodes"
+            ):
+                self.logger.warning(
+                    "MPI job detected, but no 'tasks_per_node' or 'nodes' "
+                    "specified. Assuming 'tasks_per_node=1'."
+                    "Probably not what you want."
+                )
+
         call += f" --cpus-per-task={get_cpus_per_task(job)}"
 
         if job.resources.get("slurm_extra"):
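For illustration only, here is a simplified standalone sketch (not the plugin's actual class) of how the resource-to-flag mapping behaves after this change: `--nodes` is emitted whenever `nodes` is requested, independent of MPI, and `--ntasks` is always emitted. The `resources` dict stands in for `job.resources`:

```python
# Simplified sketch of the flag assembly after this commit; names mirror the diff above.
def build_flags(resources: dict) -> str:
    call = ""
    # --nodes is no longer tied to MPI jobs; it is set whenever requested
    if resources.get("nodes", False):
        call += f" --nodes={resources.get('nodes', 1)}"
    # --ntasks is always set (SLURM v22.05 requires it for all jobs)
    call += f" --ntasks={resources.get('tasks', 1)}"
    return call


# A non-MPI job asking for two whole nodes now gets --nodes as well:
print(build_flags({"nodes": 2, "tasks": 4}))  # " --nodes=2 --ntasks=4"
```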
