Put input parameters for MPI in a separate category
AlexanderSinn committed Mar 13, 2024
1 parent 89e7976 commit f7c9b1a
Showing 10 changed files with 37 additions and 42 deletions.
2 changes: 1 addition & 1 deletion docs/source/building/platforms/booster_jsc.rst
@@ -65,7 +65,7 @@ and use it to submit a simulation.

.. tip::
Parallel simulations can be largely accelerated by using GPU-aware MPI.
To utilize GPU-aware MPI, the input parameter ``hipace.comms_buffer_on_gpu = 1`` must be set.
To utilize GPU-aware MPI, the input parameter ``comms_buffer.on_gpu = 1`` must be set.

Note that using GPU-aware MPI may require more GPU memory.
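
For illustration, a minimal sketch of the two ways this parameter can be set (the executable path, input-file name, and ``srun`` options are placeholders, not taken from this repository):

.. code-block:: bash

    # either put the line  comms_buffer.on_gpu = 1  into the input file, or
    # append it to the run command as an AMReX-style command-line override:
    srun ./hipace inputs comms_buffer.on_gpu=1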

2 changes: 1 addition & 1 deletion docs/source/building/platforms/lumi_csc.rst
@@ -101,7 +101,7 @@ and use it to submit a simulation.
.. tip::
Parallel simulations can be largely accelerated by using GPU-aware MPI.
To utilize GPU-aware MPI, the input parameter ``hipace.comms_buffer_on_gpu = 1`` must be set and the following flag must be passed in the job script:
To utilize GPU-aware MPI, the input parameter ``comms_buffer.on_gpu = 1`` must be set and the following flag must be passed in the job script:
.. code-block:: bash
2 changes: 1 addition & 1 deletion docs/source/building/platforms/maxwell_desy.rst
@@ -70,7 +70,7 @@ for more details and the required constraints). Please set the value accordingly
.. tip::
Parallel simulations can be largely accelerated by using GPU-aware MPI.
To utilize GPU-aware MPI, the input parameter ``hipace.comms_buffer_on_gpu = 1`` must be set and the following flag must be passed in the job script:
To utilize GPU-aware MPI, the input parameter ``comms_buffer.on_gpu = 1`` must be set and the following flag must be passed in the job script:

.. code-block:: bash
4 changes: 2 additions & 2 deletions docs/source/building/platforms/perlmutter_nersc.rst
@@ -79,7 +79,7 @@ You can then create your directory in your ``$PSCRATCH``, where you can put your
export MPICH_OFI_NIC_POLICY=GPU
# for GPU-aware MPI use the first line
#HIPACE_GPU_AWARE_MPI="hipace.comms_buffer_on_gpu=1"
#HIPACE_GPU_AWARE_MPI="comms_buffer.on_gpu=1"
HIPACE_GPU_AWARE_MPI=""
# CUDA visible devices are ordered inverse to local task IDs
@@ -94,6 +94,6 @@ and use it to submit a simulation. Note, that this example simulation runs on 8

.. tip::
Parallel simulations can be largely accelerated by using GPU-aware MPI.
To utilize GPU-aware MPI, the input parameter ``hipace.comms_buffer_on_gpu = 1`` must be set (see the job script above).
To utilize GPU-aware MPI, the input parameter ``comms_buffer.on_gpu = 1`` must be set (see the job script above).

Note that using GPU-aware MPI may require more GPU memory.
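
A minimal sketch of how the ``HIPACE_GPU_AWARE_MPI`` helper variable from the job script above is typically consumed (the actual ``srun`` line is not shown in this excerpt; the executable path and ``srun`` options below are placeholders):

.. code-block:: bash

    # uncomment the first assignment in the job script to enable GPU-aware MPI ...
    #HIPACE_GPU_AWARE_MPI="comms_buffer.on_gpu=1"
    HIPACE_GPU_AWARE_MPI=""
    # ... and append the variable to the run command, where it acts as a
    # command-line override of the corresponding input parameter
    srun ./hipace inputs ${HIPACE_GPU_AWARE_MPI}
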
12 changes: 6 additions & 6 deletions docs/source/run/parameters.rst
@@ -80,24 +80,24 @@ General parameters
By default, we use the ``nosmt`` option, which overwrites the OpenMP default of spawning one thread per logical CPU core, and instead only spawns a number of threads equal to the number of physical CPU cores on the machine.
If set, the environment variable ``OMP_NUM_THREADS`` takes precedence over ``system`` and ``nosmt``, but not over integer numbers set in this option.

* ``hipace.comms_buffer_on_gpu`` (`bool`) optional (default `0`)
* ``comms_buffer.on_gpu`` (`bool`) optional (default `0`)
Whether the buffers that hold the beam and the 3D laser envelope should be allocated on the GPU (device memory).
By default they will be allocated on the CPU (pinned memory).
Setting this option to `1` is necessary to take advantage of GPU-aware MPI; for this,
additional environment variables need to be set depending on the system.

* ``hipace.comms_buffer_max_leading_slices`` (`int`) optional (default `inf`)
* ``comms_buffer.max_leading_slices`` (`int`) optional (default `inf`)
How many slices of beam particles can be received and stored in advance.

* ``hipace.comms_buffer_max_trailing_slices`` (`int`) optional (default `inf`)
* ``comms_buffer.max_trailing_slices`` (`int`) optional (default `inf`)
How many slices of beam particles can be stored before being sent. Using
``comms_buffer_max_leading_slices`` and ``comms_buffer_max_trailing_slices`` will in principle
``comms_buffer.max_leading_slices`` and ``comms_buffer.max_trailing_slices`` will in principle
limit the degree of asynchrony in the parallel communication and may thus reduce performance.
However, it may be necessary to set these parameters to avoid all slices accumulating on a single
rank that would run out of memory (out of CPU or GPU memory depending on ``hipace.comms_buffer_on_gpu``).
rank that would run out of memory (out of CPU or GPU memory depending on ``comms_buffer.on_gpu``).
If there are more time steps than ranks, these parameters must be chosen such that between all
ranks there is enough capacity to store every slice to avoid a deadlock, i.e.
``comms_buffer_max_trailing_slices * nranks > nslices``.
``comms_buffer.max_trailing_slices * nranks > nslices``.

* ``hipace.do_tiling`` (`bool`) optional (default `true`)
Whether to use tiling, when running on CPU.
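
For illustration of the renamed ``comms_buffer`` category, a hypothetical input-file excerpt could read as follows (the values are placeholders, not recommendations). With, for example, 1024 longitudinal slices and 8 ranks, ``comms_buffer.max_trailing_slices * nranks > nslices`` requires ``max_trailing_slices`` to exceed 1024 / 8 = 128:

.. code-block:: text

    comms_buffer.on_gpu = 1                 # allocate the MPI buffers in device memory
    comms_buffer.max_leading_slices = 256   # slices that may be received in advance
    comms_buffer.max_trailing_slices = 256  # slices that may be stored before being sent
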
6 changes: 0 additions & 6 deletions src/Hipace.H
@@ -258,12 +258,6 @@ public:
amrex::Parser m_salame_parser;
/** Function to get the target Ez field for SALAME */
amrex::ParserExecutor<3> m_salame_target_func;
/** Whether MPI communication buffers should be allocated in device memory */
bool m_comms_buffer_on_gpu = false;
/** How many slices of beam particles can be received in advance */
int m_comms_buffer_max_leading_slices = std::numeric_limits<int>::max();
/** How many slices of beam particles can be stored before being sent */
int m_comms_buffer_max_trailing_slices = std::numeric_limits<int>::max();

private:

20 changes: 6 additions & 14 deletions src/Hipace.cpp
@@ -140,19 +140,14 @@ Hipace::Hipace () :
#endif

queryWithParser(pph, "background_density_SI", m_background_density_SI);
queryWithParser(pph, "comms_buffer_on_gpu", m_comms_buffer_on_gpu);
queryWithParser(pph, "comms_buffer_max_leading_slices", m_comms_buffer_max_leading_slices);
queryWithParser(pph, "comms_buffer_max_trailing_slices", m_comms_buffer_max_trailing_slices);
DeprecatedInput("hipace", "comms_buffer_on_gpu", "comms_buffer.on_gpu", "", true);
DeprecatedInput("hipace", "comms_buffer_max_leading_slices",
"comms_buffer.max_leading_slices", "", true);
DeprecatedInput("hipace", "comms_buffer_max_trailing_slices",
"comms_buffer.max_trailing_slices)", "", true);

MakeGeometry();

AMREX_ALWAYS_ASSERT_WITH_MESSAGE(
((double(m_comms_buffer_max_trailing_slices)
* amrex::ParallelDescriptor::NProcs()) > m_3D_geom[0].Domain().length(2))
|| (m_max_step < amrex::ParallelDescriptor::NProcs()),
"comms_buffer_max_trailing_slices must be large enough"
" to distribute all slices between all ranks if there are more timesteps than ranks");

m_use_laser = m_multi_laser.m_use_laser;

queryWithParser(pph, "collisions", m_collision_names);
@@ -211,11 +206,8 @@ Hipace::InitData ()

m_multi_buffer.initialize(m_3D_geom[0].Domain().length(2),
m_multi_beam.get_nbeams(),
!m_comms_buffer_on_gpu,
m_use_laser,
m_use_laser ? m_multi_laser.getSlices()[0].box() : amrex::Box{},
m_comms_buffer_max_leading_slices,
m_comms_buffer_max_trailing_slices);
m_use_laser ? m_multi_laser.getSlices()[0].box() : amrex::Box{});

amrex::ParmParse pph("hipace");
bool do_output_input = false;
2 changes: 1 addition & 1 deletion src/laser/MultiLaser.cpp
@@ -57,7 +57,7 @@ MultiLaser::ReadParameters ()
if (!m_laser_from_file) {
getWithParser(pp, "lambda0", m_lambda0);
}
DeprecatedInput("lasers", "3d_on_host", "hipace.comms_buffer_on_gpu", "", true);
DeprecatedInput("lasers", "3d_on_host", "comms_buffer.on_gpu", "", true);
queryWithParser(pp, "use_phase", m_use_phase);
queryWithParser(pp, "solver_type", m_solver_type);
AMREX_ALWAYS_ASSERT(m_solver_type == "multigrid" || m_solver_type == "fft");
8 changes: 5 additions & 3 deletions src/utils/MultiBuffer.H
@@ -18,8 +18,7 @@ class MultiBuffer
public:

// initialize MultiBuffer and open initial receive requests
void initialize (int nslices, int nbeams, bool buffer_on_host, bool use_laser,
amrex::Box laser_box, int max_leading_slices, int max_trailing_slices);
void initialize (int nslices, int nbeams, bool use_laser, amrex::Box laser_box);

// receive data from previous rank and unpack it into MultiBeam and MultiLaser
void get_data (int slice, MultiBeam& beams, MultiLaser& laser, int beam_slice);
@@ -107,13 +106,16 @@ private:
MPI_Comm m_comm = MPI_COMM_NULL;

// general parameters
bool m_buffer_on_host = true;
/** Whether MPI communication buffers should be allocated in device memory */
bool m_buffer_on_gpu = false;
int m_nslices = 0;
int m_nbeams = 0;
bool m_use_laser = false;
int m_laser_ncomp = 4;
amrex::Box m_laser_slice_box {};
/** How many slices of beam particles can be received in advance */
int m_max_leading_slices = std::numeric_limits<int>::max();
/** How many slices of beam particles can be stored before being sent */
int m_max_trailing_slices = std::numeric_limits<int>::max();

// parameters to send physical time
21 changes: 14 additions & 7 deletions src/utils/MultiBuffer.cpp
@@ -8,6 +8,7 @@
#include "MultiBuffer.H"
#include "Hipace.H"
#include "HipaceProfilerWrapper.H"
#include "Parser.H"


std::size_t MultiBuffer::get_metadata_size () {
@@ -24,7 +25,7 @@ std::size_t* MultiBuffer::get_metadata_location (int slice) {

void MultiBuffer::allocate_buffer (int slice) {
AMREX_ALWAYS_ASSERT(m_datanodes[slice].m_location == memory_location::nowhere);
if (m_buffer_on_host) {
if (!m_buffer_on_gpu) {
m_datanodes[slice].m_buffer = reinterpret_cast<char*>(amrex::The_Pinned_Arena()->alloc(
m_datanodes[slice].m_buffer_size * sizeof(storage_type)
));
@@ -49,17 +50,16 @@ void MultiBuffer::free_buffer (int slice) {
m_datanodes[slice].m_buffer_size = 0;
}

void MultiBuffer::initialize (int nslices, int nbeams, bool buffer_on_host, bool use_laser,
amrex::Box laser_box, int max_leading_slices,
int max_trailing_slices) {
void MultiBuffer::initialize (int nslices, int nbeams, bool use_laser, amrex::Box laser_box) {

amrex::ParmParse pp("comms_buffer");

m_comm = amrex::ParallelDescriptor::Communicator();
const int rank_id = amrex::ParallelDescriptor::MyProc();
const int n_ranks = amrex::ParallelDescriptor::NProcs();

m_nslices = nslices;
m_nbeams = nbeams;
m_buffer_on_host = buffer_on_host;
m_use_laser = use_laser;
m_laser_slice_box = laser_box;

@@ -73,8 +73,15 @@ void MultiBuffer::initialize (int nslices, int nbeams, bool
m_tag_buffer_start = 1;
m_tag_metadata_start = m_tag_buffer_start + m_nslices;

m_max_leading_slices = max_leading_slices;
m_max_trailing_slices = max_trailing_slices;
queryWithParser(pp, "on_gpu", m_buffer_on_gpu);
queryWithParser(pp, "max_leading_slices", m_max_leading_slices);
queryWithParser(pp, "max_trailing_slices", m_max_trailing_slices);

AMREX_ALWAYS_ASSERT_WITH_MESSAGE(
((double(m_max_trailing_slices) * n_ranks) > nslices)
|| (Hipace::m_max_step < amrex::ParallelDescriptor::NProcs()),
"comms_buffer.max_trailing_slices must be large enough"
" to distribute all slices between all ranks if there are more timesteps than ranks");

for (int p = 0; p < comm_progress::nprogress; ++p) {
m_async_metadata_slice[p] = m_nslices - 1;
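
The parameters are now read inside ``MultiBuffer::initialize`` through an ``amrex::ParmParse`` instance constructed with the ``comms_buffer`` prefix. A minimal standalone sketch of that prefix mechanism (an illustration under stated assumptions, not HiPACE code: it uses plain ``amrex::ParmParse::query`` instead of the ``queryWithParser`` helper):

#include <AMReX.H>
#include <AMReX_ParmParse.H>
#include <AMReX_Print.H>
#include <limits>

int main (int argc, char* argv[])
{
    // run as: ./a.out inputs  (or pass "comms_buffer.on_gpu=1" on the command line)
    amrex::Initialize(argc, argv);
    {
        // The "comms_buffer" prefix makes pp.query("on_gpu", ...) look up the
        // input parameter named "comms_buffer.on_gpu".
        amrex::ParmParse pp("comms_buffer");
        bool on_gpu = false;
        int max_trailing_slices = std::numeric_limits<int>::max();
        pp.query("on_gpu", on_gpu);
        pp.query("max_trailing_slices", max_trailing_slices);
        amrex::Print() << "comms_buffer.on_gpu = " << on_gpu << "\n";
    }
    amrex::Finalize();
    return 0;
}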