Enable MPI time statistics in Ember #2453
base: devel
Status Flag 'Pre-Test Inspection': This Pull Request requires inspection. The code must be inspected by a member of the Team before Testing/Merging.
This is a great fix for the MPI statistics issue, and the per-motif option is a really nice touch! Moving the setup into a configure function is OK; it brings the MPI motifs closer to the SHMEM motifs, where the job size is only known once inside the generate function. This should be reflected in the SST documentation. I built and compiled the issue-fix branch and noted two issues:
Issue 1: All motifs throw this warning when compiling on my MacBook:
Issue 2: When multiple motifs run within an endpoint, their statistics are joined even though they are separate motif entries. Consider the following:
The resulting statistics file doesn't show separate entries for the motifs; perhaps append the index of each motif to the SubStatisticId.
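The suggestion above, appending the motif's index to the sub-ID, could look roughly like the following. This is only a sketch of the idea in Python, not Ember code: `make_stat_sub_id` and the motif list are hypothetical and chosen purely for illustration.

```python
def make_stat_sub_id(motif_name: str, motif_index: int) -> str:
    """Hypothetical helper: build a sub-ID that is unique per motif entry."""
    return f"{motif_name}-{motif_index}"


# Illustrative endpoint with the same motif listed twice (not taken from the PR).
motifs = ["Init", "Allreduce", "Allreduce", "Fini"]

# Using the motif name alone would collapse the two Allreduce entries.
by_name = [f"{name}Motif" for name in motifs]

# Appending the position in the motif list keeps every entry distinct.
by_name_and_index = [
    make_stat_sub_id(f"{name}Motif", index) for index, name in enumerate(motifs)
]
```

With names alone, two entries share one sub-ID; with the index appended, each of the four entries gets its own.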
We've been planning to update the stats APIs in the core after the 15 release. One of the plans is to enable dynamic stats creation in the output types where it makes sense. We have a current outstanding stats-related PR and will add the dynamic loading capabilities to some of the output types before releasing it for testing. Let's wait to see what this enables for you before we change Ember to load all subcomponents at construct time. We should be able to have this for you within a day or so, or find out that it's not as straightforward as we had hoped, in which case we can take a deeper look at this PR.
Sure, that makes sense. Thanks for sharing the update!
This PR aims to restore Ember's ability to measure the time spent in different MPI events by different MPI motifs and to report it as statistics.
Link to the related issue: #2043
In the current design of Ember, motifs are created as anonymous subcomponents during simulation runtime. I.e., once the simulation of a particular motif completes, the next motif as specified in the config file gets loaded as an anonymous subcomponent. See the calls to initMotif() in ember/emberengine.cc.
One consequence of dynamic motif loading is that registering any statistics defined by the motifs requires the StatisticOutput SST object to support dynamic statistic registration (see supportsDynamicRegistration() in statapi/statoutput.h). It seems that none of the currently available statistic output formats supports dynamic registration.
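To make the constraint concrete, here is a toy Python model of it. It is not the SST API: the `StatOutput` class, its `started` flag, and the `register` method are assumptions made for illustration, and only the idea of a "supports dynamic registration" switch comes from the text above.

```python
class StatOutput:
    """Toy stand-in for a statistic output; not the real SST class."""

    def __init__(self, supports_dynamic: bool):
        self.supports_dynamic = supports_dynamic
        self.started = False      # flips to True once the simulation is running
        self.registered = []

    def register(self, stat_name: str) -> bool:
        # After startup, registration only succeeds on a dynamic-capable output.
        if self.started and not self.supports_dynamic:
            return False
        self.registered.append(stat_name)
        return True


static_output = StatOutput(supports_dynamic=False)

# A motif built in the constructor registers its stats before the run starts.
early_ok = static_output.register("time-Init")

static_output.started = True

# A motif loaded mid-run is too late for an output without dynamic support.
late_ok = static_output.register("time-Fini")
```

In this model the first registration is accepted and the second is rejected, which mirrors why motifs loaded at runtime could not report their statistics.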
The first commit in this PR (Ember: create all motifs during simulation startup) addresses this issue by constructing all defined motifs in advance, i.e., in the EmberEngine constructor.
Note that some Ember motifs require another motif to be executed first before they can be fully initialized. One example is the OTF2 motif, which requires the Init motif to initialize the EmberMpiLib object. To facilitate this, a virtual configure() method has been added to the base EmberGenerator class. Some of the motifs (e.g., Ember3DAMRGenerator) already had a configure() function, called in either their constructor or the generate() function. The configure() function is now called explicitly by the EmberEngine, in the places where initMotif() was previously called.
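The split between early construction and deferred configuration can be sketched as follows. This is a Python model of the pattern only; the class names echo the motifs mentioned above, but the `shared` dictionary, the `mpi_ready` key, and the `simulate` function are hypothetical stand-ins rather than Ember's actual C++ interfaces.

```python
class Generator:
    """Toy stand-in for the EmberGenerator base class."""

    def __init__(self, name):
        self.name = name          # constructed up front, before any motif runs
        self.configured = False

    def configure(self, shared):
        # Default hook: nothing extra to set up.
        self.configured = True

    def run(self, shared):
        pass


class InitMotif(Generator):
    def run(self, shared):
        # Running Init is what makes the shared MPI library usable.
        shared["mpi_ready"] = True


class Otf2Motif(Generator):
    def configure(self, shared):
        # Needs state that only exists after Init has run.
        if not shared.get("mpi_ready"):
            raise RuntimeError("Init motif has not run yet")
        super().configure(shared)


def simulate(motifs):
    shared = {}
    log = []
    for motif in motifs:
        motif.configure(shared)   # deferred setup, just before the motif starts
        motif.run(shared)
        log.append(motif.name)
    return log


# All motifs are constructed before the "simulation" starts.
motifs = [InitMotif("Init"), Otf2Motif("OTF2")]
log = simulate(motifs)
```

Because configure() runs only when a motif is about to start, the OTF2 stand-in can be constructed early and still see the state left behind by Init.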
The second commit (Ember: Enable the MPI time statistics) enables the MPI time statistics in the motifs. The SST_ELI declarations of the MPI event stats are removed from all derived motif classes and are now declared only in EmberMessagePassingGenerator. The enumeration of the MPI events has been moved from emberMpiLib.h to emberMPIEvent.h. Each motif uses its name (e.g., InitMotif) as the statistic subId, so that different motifs can create separate sets of stats.
The EmberMpiLib object has been extended to hold a copy of the pointers to statistics created by the currently simulated motif. See setEventStatistics() in ember/libs/emberMpiLib.cc. Each MPI motif calls setEventStatistics() in its configure() function. The copy of stat pointers in EmberMpiLib is cleared after completion of each motif.
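The set-then-clear lifecycle described above can be modelled in a few lines of Python. This is an illustrative sketch, not the code in ember/libs/emberMpiLib.cc: the `MpiLib` class, the `record` method, and the dictionaries standing in for statistic objects are all assumptions.

```python
class MpiLib:
    """Toy stand-in for EmberMpiLib; method names only loosely mirror the PR."""

    def __init__(self):
        self.event_stats = None   # borrowed reference, not owned by the library

    def set_event_statistics(self, stats):
        self.event_stats = stats

    def clear_event_statistics(self):
        self.event_stats = None

    def record(self, event, duration):
        # Only record while a motif has handed over its statistics.
        if self.event_stats is not None:
            self.event_stats.setdefault(event, []).append(duration)


lib = MpiLib()
init_stats, fini_stats = {}, {}

for stats, event in ((init_stats, "Init"), (fini_stats, "Fini")):
    lib.set_event_statistics(stats)     # done in the motif's configure()
    lib.record(event, 60)               # an MPI event completes
    lib.clear_event_statistics()        # motif finished

lib.record("Init", 99)                  # no motif active: silently dropped
```

Clearing the copy after each motif means a late event cannot be attributed to a motif that has already finished.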
A new Ember config parameter, 'enableMpiStatsPerMotif', has been added and is enabled by default. Specifying enableMpiStatsPerMotif=0 in the config file creates only one set of statistics (per EmberGenerator) instead of one set per motif.
For example:
ep = EmberMPIJob(0, topo.getNumNodes())
ep.addMotif("Init")
ep.addMotif("Fini")
ep.ember.enableMpiStatsPerMotif=0
With enableMpiStatsPerMotif=0:
nic0core0_EmberEP, time-Init, , Accumulator, 298688773, 0, 60, 3600, 1, 60, 60
Default:
nic0core0_EmberEP, time-Init, InitMotif, Accumulator, 298688773, 0, 60, 3600, 1, 60, 60
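To check which mode a statistics file was produced in, one can look at the third field of each row. The sketch below parses the two sample rows shown above with Python's standard csv module. Treating the first four fields as component, statistic name, sub-ID, and statistic type is an assumption based on the sample rows, and `parse_rows` is a hypothetical helper.

```python
import csv
import io

# The two sample rows from the PR description.
sample = """\
nic0core0_EmberEP, time-Init, , Accumulator, 298688773, 0, 60, 3600, 1, 60, 60
nic0core0_EmberEP, time-Init, InitMotif, Accumulator, 298688773, 0, 60, 3600, 1, 60, 60
"""


def parse_rows(text):
    """Return the leading identifying fields of each statistics row."""
    rows = []
    for fields in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not fields:
            continue  # skip blank lines
        component, stat_name, sub_id, stat_type = (f.strip() for f in fields[:4])
        rows.append(
            {"component": component, "stat": stat_name,
             "sub_id": sub_id, "type": stat_type}
        )
    return rows


rows = parse_rows(sample)
per_motif = [row["sub_id"] != "" for row in rows]
```

An empty sub-ID indicates the single shared set of statistics; a motif name in that field indicates per-motif statistics.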