
Commit 0368197

cmeesters, fgvieira, and coderabbitai[bot] authored
fix: sbatch stderr parsing (#161)
Will hopefully fix #157.

The issue is that the submission code joined `stderr` and `stdout` of the `sbatch` call. Without add-ons, `sbatch` writes to `stdout` and writes to `stderr` only in case of an error. However, admins can add informative messages to `stderr`; when this happened, parsing the output for the job ID failed. Now `stderr` and `stdout` are handled separately.

## Summary by CodeRabbit

- **New Features**
  - Enhanced error handling during SLURM job submission, providing clearer feedback on failures.
  - Improved job ID retrieval by stripping whitespace from the output.
- **Bug Fixes**
  - Addressed issues with job submission failures by capturing both standard output and error messages.
- **Chores**
  - Minor adjustments to logging for better clarity during job submission and error reporting.

Co-authored-by: Filipe G. Vieira <1151762+fgvieira@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
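The failure mode described in the commit message can be reproduced without a cluster. The following is a minimal sketch, not the plugin's code: a small Python one-liner stands in for `sbatch --parsable`, and the notice text, job ID, and cluster name are made up for illustration. It compares parsing merged output with parsing `stdout` alone.

```python
import subprocess
import sys

# Hypothetical stand-in for `sbatch --parsable` on a cluster where admins
# print an informative notice to stderr. All values are invented.
fake_sbatch = [
    sys.executable,
    "-c",
    "import sys; "
    "print('NOTICE: maintenance on Friday', file=sys.stderr, flush=True); "
    "print('12345;clusterA')",
]

# Old approach: stderr is merged into stdout, so the notice pollutes the
# text that is later split on ';' to find the job ID.
merged = subprocess.check_output(
    fake_sbatch, text=True, stderr=subprocess.STDOUT
).strip()
jobid_merged = merged.split(";")[0]

# New approach: capture both streams separately and parse stdout only.
proc = subprocess.run(fake_sbatch, text=True, capture_output=True)
jobid_separate = proc.stdout.strip().split(";")[0]

print(repr(jobid_merged))
print(repr(jobid_separate))
```

With the streams merged, the first field contains the notice as well as the ID; with the streams separated, it contains only the ID.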
1 parent d98e4ac commit 0368197

1 file changed: +19 -4 lines changed


snakemake_executor_plugin_slurm/__init__.py

Lines changed: 19 additions & 4 deletions
@@ -217,20 +217,35 @@ def run_job(self, job: JobExecutorInterface):
 
         self.logger.debug(f"sbatch call: {call}")
         try:
-            out = subprocess.check_output(
-                call, shell=True, text=True, stderr=subprocess.STDOUT
-            ).strip()
+            process = subprocess.Popen(
+                call,
+                shell=True,
+                text=True,
+                stdout=subprocess.PIPE,
+                stderr=subprocess.PIPE,
+            )
+            out, err = process.communicate()
+            if process.returncode != 0:
+                raise subprocess.CalledProcessError(
+                    process.returncode, call, output=err
+                )
         except subprocess.CalledProcessError as e:
             raise WorkflowError(
                 f"SLURM job submission failed. The error message was {e.output}"
             )
+        if err:  # any other error message?
+            raise WorkflowError(
+                f"SLURM job submission failed. The error message was {err}"
+            )
 
         # multicluster submissions yield submission infos like
         # "Submitted batch job <id> on cluster <name>" by default, but with the
         # --parsable option it simply yields "<id>;<name>".
         # To extract the job id we split by semicolon and take the first element
         # (this also works if no cluster name was provided)
-        slurm_jobid = out.split(";")[0]
+        slurm_jobid = out.strip().split(";")[0]
+        if not slurm_jobid:
+            raise WorkflowError("Failed to retrieve SLURM job ID from sbatch output.")
         slurm_logfile = slurm_logfile.replace("%j", slurm_jobid)
         self.logger.info(
             f"Job {job.jobid} has been submitted with SLURM jobid {slurm_jobid} "
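The job ID extraction in the diff can be tried in isolation. This is a sketch under stated assumptions: `parse_sbatch_jobid` is a hypothetical helper name that does not exist in the plugin, and it raises `ValueError` where the plugin raises `WorkflowError`, so that the snippet runs without Snakemake installed. The sample outputs are invented.

```python
def parse_sbatch_jobid(stdout: str) -> str:
    """Hypothetical helper mirroring the parsing step shown in the diff."""
    # `sbatch --parsable` prints "<id>" or "<id>;<cluster>"; the ID is
    # always the first ';'-separated field once whitespace is stripped.
    jobid = stdout.strip().split(";")[0]
    if not jobid:
        # The plugin raises WorkflowError here; ValueError keeps this
        # sketch free of a Snakemake dependency.
        raise ValueError("Failed to retrieve SLURM job ID from sbatch output.")
    return jobid


print(parse_sbatch_jobid("12345;clusterA\n"))  # multicluster form
print(parse_sbatch_jobid("12345\n"))           # single-cluster form

try:
    parse_sbatch_jobid("   \n")                # empty output is rejected
except ValueError as exc:
    print(f"rejected: {exc}")
```

Stripping before splitting matters because the trailing newline would otherwise stay attached to the ID whenever no cluster name follows it.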

0 commit comments
