From serialisation to parallelisation #6

TCLamnidis · 2022-04-19T08:11:26Z

In the current setup, run_Eager.sh uses nextflow to parallelise jobs within each batch, but batches are handled in series: each run starts only after the previous one finishes or fails. To speed up processing, I want to parallelise batches so that up to 3 batches can run at once.

Multiple instances of nextflow cannot be launched from the same directory, as the .nextflow.log files will collide. A potential solution would be to bind each sequencing batch to a specific instance of run_Eager.sh, with each instance running every third batch:

2020-05-03-batch1.eager_input.txt  ## instance 1
2020-05-03-batch2.eager_input.txt  ## instance 2
2020-06-26-batch3.eager_input.txt  ## instance 3
2020-06-26-batch4.eager_input.txt  ## instance 1
2020-06-26-batch5.eager_input.txt  ## instance 2
2020-06-26-batch6.eager_input.txt  ## instance 3

Since batch file names contain the initial creation date and are sorted alphabetically, the run_Eager.sh instance assigned to each batch will be stable, allowing resuming without issue (🤞).
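The assignment listed above can be sketched as a round-robin over the sorted file names. This is a minimal illustration, not the logic in run_Eager.sh: the function name `assign_instances` and the `n_instances` parameter are hypothetical.

```python
def assign_instances(batch_files, n_instances=3):
    """Map each batch file to a 1-based instance number.

    Files are sorted alphabetically first, then dealt out round-robin,
    so the mapping depends only on the set of file names.
    """
    ordered = sorted(batch_files)
    return {name: (i % n_instances) + 1 for i, name in enumerate(ordered)}


# File names taken from the listing above.
batches = [
    "2020-05-03-batch1.eager_input.txt",
    "2020-05-03-batch2.eager_input.txt",
    "2020-06-26-batch3.eager_input.txt",
    "2020-06-26-batch4.eager_input.txt",
    "2020-06-26-batch5.eager_input.txt",
    "2020-06-26-batch6.eager_input.txt",
]

assignment = assign_instances(batches)
for name, instance in assignment.items():
    print(f"{name}  ## instance {instance}")
```

The mapping only stays stable if new batches always sort after the existing ones. A file that sorts into the middle of the list (an earlier date, or a name like batch10, which sorts before batch3 as plain text) would shift the instance of every batch after it.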


stschiff · 2022-04-19T11:15:43Z

But wouldn't they still be in the same directory? How does this solve the issue that you can't fire them off from the same dir?

TCLamnidis · 2022-04-19T11:40:23Z

Problem 1: Each run needs its own directory.
Solution: Initialise each run from a different directory. This is what I have been doing for the last few weeks to speed up processing, but it raises problem 2.

Problem 2: A run must be resumed from the same directory it was originally launched from, otherwise it restarts from scratch (i.e. past progress is ignored).
Solution: If the directory that nextflow is launched from is fixed for each batch, then resuming will also work as intended.
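One way to fix the launch directory per batch is to derive it from the batch file name alone, so the same batch always maps to the same path. The sketch below assumes a hypothetical root folder `eager_runs` and a hypothetical helper `launch_dir_for`; neither is part of run_Eager.sh.

```python
from pathlib import Path

SUFFIX = ".eager_input.txt"


def launch_dir_for(batch_file, root="eager_runs"):
    """Return a launch directory that depends only on the batch file name."""
    name = Path(batch_file).name
    if name.endswith(SUFFIX):
        name = name[: -len(SUFFIX)]
    return Path(root) / name


first_launch = launch_dir_for("2020-05-03-batch1.eager_input.txt")
resumed_launch = launch_dir_for("some/other/path/2020-05-03-batch1.eager_input.txt")

# Both calls give the same directory, so a resumed run would start
# from the place the original run was launched from.
print(first_launch)
print(first_launch == resumed_launch)

# nextflow would then be started with this directory as its working
# directory, e.g. by passing cwd=first_launch to subprocess.run.
```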

The extreme case for this would be to start all runs at the same time, launching nextflow for each batch in its own eager output directory. That would be fastest but would also block the cluster for everyone. So I'm leaning towards having a set number of "active" runs at a time.

TCLamnidis mentioned this issue Sep 2, 2024

ToDo list for next update #13

