
BUG: Recording cellular data from MPI simulations is significantly slower than non-MPI #999

Open
@asoplata

Description


If one runs the code in the “Plot Firing Patterns” example, https://jonescompneurolab.github.io/hnn-core/stable/auto_examples/howto/plot_firing_pattern.html#sphx-glr-auto-examples-howto-plot-firing-pattern-py , the time to record either the vsec or, if one slightly modifies the code, the isec data is effectively instant.

However, if one replaces the run-simulation line,

dpls = simulate_dipole(net, tstop=170., record_vsec='soma')

with the same thing except for using MPI:

with MPIBackend(n_procs=2):
    dpls = simulate_dipole(net, tstop=170., record_vsec='soma')

then recording the data takes significantly longer (about a minute on my machine).
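To quantify the slowdown, one can wrap each run in a small timing helper. This is a sketch, not part of hnn-core; the helper itself is generic, and the commented usage assumes the `net` object from the linked example and that `MPIBackend` is importable from `hnn_core` as in the snippets above:

```python
import time

def timed(label, fn, *args, **kwargs):
    """Run fn(*args, **kwargs) and report the wall-clock time taken."""
    t0 = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - t0
    print(f"{label}: {elapsed:.1f} s")
    return result, elapsed

# Hypothetical usage against hnn-core (assumes `net` is built as in the
# linked "Plot Firing Patterns" example):
#
# from hnn_core import simulate_dipole, MPIBackend
#
# timed("serial + record_vsec",
#       simulate_dipole, net, tstop=170., record_vsec='soma')
#
# with MPIBackend(n_procs=2):
#     timed("MPI + record_vsec",
#           simulate_dipole, net, tstop=170., record_vsec='soma')
```

Comparing the two labels directly would make the "effectively instant" vs. "about a minute" difference concrete on any given machine.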

Furthermore, if one tries to record the isec while running the sim using MPI, like with

with MPIBackend(n_procs=2):
    dpls = simulate_dipole(net, tstop=170., record_isec='soma')

then it takes so long that I eventually get an “MPI Simulation failed” error, and the data fails to be recorded. In this case, the likely reasons for the failure are that 1. the recording takes a long time, and 2. we have a hard-coded timeout on MPI subprocesses. I believe this is the case because, if one increases the run_subprocess timeout in hnn_core/parallel_backends.py::MPIBackend.simulate, the data does get recorded successfully, but it takes multiple minutes to do so (longer than in the vsec case).
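This failure mode is consistent with a fixed subprocess timeout: once the post-simulation work exceeds it, the parent kills the child and surfaces a generic failure even though the run itself was correct. A stdlib-only sketch of that mechanism (this is not hnn-core's actual implementation; the command, timeout values, and error string are placeholders):

```python
import subprocess
import sys

def run_with_timeout(cmd, timeout):
    """Run cmd; if it exceeds `timeout` seconds, kill it and report a
    generic failure, mirroring how a hard-coded timeout can mask a
    slow-but-correct subprocess."""
    try:
        subprocess.run(cmd, check=True, timeout=timeout)
        return "ok"
    except subprocess.TimeoutExpired:
        return "MPI Simulation failed"  # what the user sees, regardless of cause

# A child process that is merely slow (a stand-in for slow isec aggregation):
slow_child = [sys.executable, "-c", "import time; time.sleep(1)"]

print(run_with_timeout(slow_child, timeout=0.2))  # killed before finishing
print(run_with_timeout(slow_child, timeout=5.0))  # same child, enough time
```

The same child succeeds or "fails" depending only on the timeout, which matches the observed behavior when the run_subprocess timeout is increased.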

I've marked this as a bug because it implies that in certain cases (e.g. the MPI example above when recording isec), it is effectively impossible for a user to both use MPI and record some cellular properties, particularly for simulations of any substantial length. It is also not obvious to the user that dropping MPI would solve their problem, which it does in this case.

I will also note that the simulation time itself does not seem to vary with the recording: the slowdown clearly occurs only after the simulation has run, presumably while the recorded data is being aggregated.
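If the cost is indeed in post-simulation aggregation, one plausible contributor is serializing the per-section recordings when they are passed from the MPI child process back to the parent. The sketch below only illustrates how one might probe that cost; the payload shape and sizes are assumptions for illustration, not hnn-core internals:

```python
import pickle
import time

# Assumed payload shape: {cell_gid: {section_name: [voltage samples]}}.
# Cell count and samples-per-section are illustrative numbers only.
n_cells, n_samples = 270, 6800
recordings = {
    gid: {"soma": [0.0] * n_samples}
    for gid in range(n_cells)
}

t0 = time.perf_counter()
blob = pickle.dumps(recordings)        # serialize, as an IPC transfer would
restored = pickle.loads(blob)          # deserialize on the receiving side
elapsed = time.perf_counter() - t0

print(f"round-tripped {len(blob) / 1e6:.1f} MB in {elapsed:.2f} s")
```

Timing this round trip against the actual recorded structures would help separate serialization/transfer cost from the cost of the aggregation logic itself.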

Metadata

Labels: bug (Something isn't working), parallelization (Parallel processing)
