Commit 26cd8ce

gtdang and asoplata authored
[MRG] Fix GUI MPI available cores (#871)
* feat: updated run_button_clicked to take a number of cores argument

* feat: created a static method to count the number of available cores based on the operating system

* feat: added number of available cores as an attribute and applied the attribute where relevant.

* feat: added number of cores selection for the mpi backend

* fix: updated the gui compose so that backend sub-options are displayed on start-up

  Previously the sub-options were hidden by default and only displayed when the backend dropdown was changed. This hid the number of cores option for the default joblib backend on start-up.

* chore: removed unused argument n_cores from run_button_clicked function

* docs: doc string for _available_cores method

* feat: changes the default backend based on if mpi4py and psutil are installed

* fix: force joblib for test_gui_run_simulations

* chore: resolve merge conflict

* feat: refactor core/thread logic for mpibackend

  This takes George's old GUI-specific `_available_cores()` method, moves it, and greatly expands it to include updates to the logic about cores and hardware-threading which was previously inside `MPIBackend.__init__()`. This was necessary due to the number of common but different outcomes based on platform, architecture, hardware-threading support, and user choice.

  These changes do not involve very many lines of code, but a good amount of thought and testing has gone into them. Importantly, these `MPIBackend` API changes are backwards-compatible, and no changes to current usage code are needed. I suggest you read the long comments in `parallel_backends.py::_determine_cores_hwthreading()` outlining how each variation is handled.

  Previously, if the user did not provide the number of MPI Processes they wanted to use, `MPIBackend` assumed that the number of detected "logical" cores would suffice. As George previously showed, this does not work for HPC environments like on OSCAR, where the only true number of cores that we are allowed to use is found by `psutil.Process().cpu_affinity()`, the "affinity" core number. There is a third type of number of cores besides "logical" and "affinity" which is important: "physical".

  However, there was an additional problem here that was still unaddressed: hardware-threading. Different platforms and situations report different numbers of logical, affinity, and physical CPU cores. One of the factors that affects this is whether hardware-threading is present on the machine, such as Intel Hyperthreading. In the case of an example Linux laptop having an Intel chip with Hyperthreading, the logical and physical core numbers will report different values: logical includes Hyperthreads (e.g. `psutil.cpu_count(logical=True)` reports 8 cores), but physical does not (e.g. `psutil.cpu_count(logical=False)` reports 4 cores). If we tell MPI to use 8 cores ("logical"), then we ALSO need to tell it to enable the hardware-threading option. However, if the user does not want to enable hardware-threading, then we need to make this an option, tell MPI to use 4 cores ("physical"), and tell MPI to not use the hardware-threading option.

  The "affinity" core number makes things even more complicated, since in the Linux laptop example, it is equal to the logical core number. However, on OSCAR, it is very different from the logical core number, and on macOS, it is not present at all.

  In `_determine_cores_hwthreading()`, if you read the lengthy comments, I have thought through each common scenario, and I believe resolved what to do for each, with respect to the number of cores to use and whether or not to use hardware-threading. These scenarios include: the user choosing to use hardware-threading (default) or not, across macOS variations with and without hardware-threading, Linux local computer variations with and without hardware-threading, and Linux HPC (e.g. OSCAR) variations which appear to never support hardware-threading. In the Windows case, due to both #589 and the currently-untested MPI integration on Windows, I always report the machine as not having hardware-threading.

  Additionally, previously, if the user did provide a number for MPI Processes, `MPIBackend` used some "heuristics" to decide whether to use MPI oversubscription and/or hardware-threading, but the user could not override these heuristics. Now, when a user instantiates an `MPIBackend` with `__init__()` and uses the defaults, hardware-threading is detected more robustly and enabled by default, and oversubscription is enabled based on its own heuristics; this is the case when the new arguments `hwthreading` and `oversubscribe` are set to their default value of `None`. However, if the user knows what they're doing, they can also pass either `True` or `False` to either of these options to force them on or off. Furthermore, in the case of `hwthreading`, if the user indicates they do not want to use it, then `_determine_cores_hwthreading()` correctly returns the number of NON-hardware-threaded cores for MPI's use, instead of the core number including hardware-threads.

  I have also modified and expanded the appropriate testing to account for these changes.

  Note that this does NOT change the default number of jobs to use for the GUI if MPI is detected. Such a change breaks the current `test_gui.py` testing: see #960

* chore: docstring update

* doc: attempt to clarify core detection docs

* maint: address ntolley comments

* maint: remove unused warning wrapper

* Change sensible defaults to true

* Add dylan warning, separate oversub/hwthread tests

* Remove problematic macos oversubscription sim

* Fix incorrect doc comment

---------

Co-authored-by: Austin E. Soplata <me@asoplata.com>
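
To make the logical / physical / affinity distinction from the commit message concrete, here is a minimal sketch. It is not the actual `_determine_cores_hwthreading()` implementation: `detect_core_counts` and `choose_cores` are hypothetical names, and the decision rule is a simplified assumption based only on the scenarios described above (a restricted affinity mask wins; otherwise use logical cores with hardware-threading, or physical cores without it). The `psutil` calls are the ones named in the message, and the sketch falls back to `os.cpu_count()` when `psutil` is not installed.

```python
import os

try:
    import psutil  # optional; needed for the physical and affinity counts
except ImportError:
    psutil = None


def detect_core_counts():
    """Return (logical, physical, affinity); unknown values are None."""
    logical = os.cpu_count() or 1
    physical = None
    affinity = None
    if psutil is not None:
        logical = psutil.cpu_count(logical=True) or logical
        physical = psutil.cpu_count(logical=False)
        proc = psutil.Process()
        # cpu_affinity() is not provided on every platform (e.g. macOS).
        if hasattr(proc, "cpu_affinity"):
            affinity = len(proc.cpu_affinity())
    return logical, physical, affinity


def choose_cores(logical, physical=None, affinity=None, use_hwthreading=True):
    """Hypothetical, simplified rule: returns (n_procs, use_hwthreading)."""
    # Assumption: an affinity mask smaller than the logical count means a
    # restricted allocation (e.g. an HPC job), so it caps the usable cores.
    if affinity is not None and affinity < logical:
        return affinity, False
    # Hardware-threading is present when logical exceeds physical.
    hw_present = physical is not None and logical > physical
    if hw_present and use_hwthreading:
        return logical, True
    if hw_present:
        return physical, False
    return logical, False
```

With the Linux laptop numbers from the message (8 logical, 4 physical, affinity 8), `choose_cores(8, 4, 8)` picks 8 processes with hardware-threading, and `choose_cores(8, 4, 8, use_hwthreading=False)` picks 4 without it.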
1 parent e1df662 commit 26cd8ce

4 files changed: +530 -81 lines changed

hnn_core/gui/gui.py

Lines changed: 42 additions & 11 deletions
@@ -7,7 +7,6 @@
 import io
 import logging
 import mimetypes
-import multiprocessing
 import numpy as np
 import sys
 import json
@@ -35,6 +34,9 @@
                                      get_L5Pyr_params_default)
 from hnn_core.hnn_io import dict_to_network, write_network_configuration
 from hnn_core.cells_default import _exp_g_at_dist
+from hnn_core.parallel_backends import (_determine_cores_hwthreading,
+                                        _has_mpi4py,
+                                        _has_psutil)
 
 hnn_core_root = Path(hnn_core.__file__).parent
 default_network_configuration = (hnn_core_root / 'param' /
@@ -344,6 +346,12 @@ def __init__(self, theme_color="#802989",
         # load default parameters
         self.params = self.load_parameters(network_configuration)
 
+        # Number of available cores
+        [self.n_cores, _] = _determine_cores_hwthreading(
+            use_hwthreading_if_found=False,
+            sensible_default_cores=True,
+        )
+
         # In-memory storage of all simulation and visualization related data
         self.simulation_data = defaultdict(lambda: dict(net=None, dpls=list()))
 
@@ -394,15 +402,17 @@ def __init__(self, theme_color="#802989",
             placeholder='ID of your simulation',
             description='Name:',
             disabled=False)
-        self.widget_backend_selection = Dropdown(options=[('Joblib', 'Joblib'),
-                                                          ('MPI', 'MPI')],
-                                                 value='Joblib',
-                                                 description='Backend:')
+        self.widget_backend_selection = (
+            Dropdown(options=[('Joblib', 'Joblib'),
+                              ('MPI', 'MPI')],
+                     value=self._check_backend(),
+                     description='Backend:'))
         self.widget_mpi_cmd = Text(value='mpiexec',
                                    placeholder='Fill if applies',
                                    description='MPI cmd:', disabled=False)
-        self.widget_n_jobs = BoundedIntText(value=1, min=1,
-                                            max=multiprocessing.cpu_count(),
+        self.widget_n_jobs = BoundedIntText(value=1,
+                                            min=1,
+                                            max=self.n_cores,
                                             description='Cores:',
                                             disabled=False)
         self.load_data_button = FileUpload(
@@ -490,6 +500,14 @@ def __init__(self, theme_color="#802989",
         self._init_ui_components()
         self.add_logging_window_logger()
 
+    @staticmethod
+    def _check_backend():
+        """Checks for MPI and returns the default backend name"""
+        default_backend = 'Joblib'
+        if _has_mpi4py() and _has_psutil():
+            default_backend = 'MPI'
+        return default_backend
+
     def get_cell_parameters_dict(self):
         """Returns the number of elements in the
         cell_parameters_dict dictionary.
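
The `_check_backend()` hunk above picks the default backend from whether the optional dependencies are available. A minimal standalone sketch of that idea follows, assuming availability can be checked with `importlib.util.find_spec`. `has_module` and `default_backend` are hypothetical names, and the real `_has_mpi4py()` and `_has_psutil()` helpers in `hnn_core.parallel_backends` may work differently. Note that finding a module only shows it is importable, not that a working MPI installation is present.

```python
from importlib.util import find_spec


def has_module(name):
    """Return True if a top-level module can be located (hypothetical helper)."""
    return find_spec(name) is not None


def default_backend(required=("mpi4py", "psutil")):
    """Return 'MPI' only when every required module is found, else 'Joblib'."""
    if all(has_module(name) for name in required):
        return "MPI"
    return "Joblib"
```

The `required` parameter exists only to make the sketch easy to exercise; the GUI code in this commit takes no arguments.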
@@ -632,8 +650,9 @@ def _run_button_clicked(b):
                 self.fig_default_params, self.widget_default_smoothing,
                 self.widget_min_frequency, self.widget_max_frequency,
                 self.widget_ntrials, self.widget_backend_selection,
-                self.widget_mpi_cmd, self.widget_n_jobs, self.params,
-                self._simulation_status_bar, self._simulation_status_contents,
+                self.widget_mpi_cmd, self.widget_n_jobs,
+                self.params, self._simulation_status_bar,
+                self._simulation_status_contents,
                 self.connectivity_widgets, self.viz_manager,
                 self.simulation_list_widget, self.cell_pameters_widgets)
 
@@ -758,6 +777,10 @@ def compose(self, return_layout=True):
                 self.widget_max_frequency,
             ])
         ], layout=self.layout['config_box'])
+        # Displays the default backend options
+        handle_backend_change(self.widget_backend_selection.value,
+                              self._backend_config_out, self.widget_mpi_cmd,
+                              self.widget_n_jobs)
 
         connectivity_configuration = Tab()
 
@@ -2071,8 +2094,16 @@ def run_button_clicked(widget_simulation_name, log_out, drive_widgets,
 
         print("start simulation")
         if backend_selection.value == "MPI":
+            # 'use_hwthreading_if_found' and 'sensible_default_cores' have
+            # already been set elsewhere, and do not need to be re-set here.
+            # Hardware-threading and oversubscription will always be disabled
+            # to prevent edge cases in the GUI.
             backend = MPIBackend(
-                n_procs=multiprocessing.cpu_count() - 1, mpi_cmd=mpi_cmd.value)
+                n_procs=n_jobs.value,
+                mpi_cmd=mpi_cmd.value,
+                override_hwthreading_option=False,
+                override_oversubscribe_option=False,
+            )
         else:
             backend = JoblibBackend(n_jobs=n_jobs.value)
             print(f"Using Joblib with {n_jobs.value} core(s).")
@@ -2379,7 +2410,7 @@ def handle_backend_change(backend_type, backend_config, mpi_cmd, n_jobs):
     backend_config.clear_output()
     with backend_config:
         if backend_type == "MPI":
-            display(mpi_cmd)
+            display(VBox(children=[n_jobs, mpi_cmd]))
         elif backend_type == "Joblib":
             display(n_jobs)
 