Commit 1315332

rhc54 authored and hppritcha committed
Check the runtime version of PMIx
It has been reported (and confirmed) that building against one version of PMIx and then running with another version will cause PRRTE to segfault. This isn't a universal rule. For example, one can switch v5.0 and master without a problem. However, switching v5.0 and v4.2 is a definite segfault.

The root cause of the problem is a change in the layout of the base pmix_object_t definition. This renders all PMIx objects binary incompatible when crossing between the v5 and v4 (and below) series. Changing the v5 definition back to match v4 is an overly complex task. The changes were required to accommodate the new shared memory support that was introduced in v5.

So instead, we check the runtime version of PMIx against the build version. If the runtime version is incompatible with the build version, then we print an explanatory error message and error out.

Signed-off-by: Ralph Castain <rhc@pmix.org>
bot:notacherrypick
1 parent 7693b66 commit 1315332

1 file changed, +40 -0 lines changed

src/runtime/prte_init.c

Lines changed: 40 additions & 0 deletions
@@ -38,6 +38,9 @@
 #ifdef HAVE_SYS_STAT_H
 # include <sys/stat.h>
 #endif
+#ifdef HAVE_STRING_H
+#include <string.h>
+#endif
 
 #include "src/util/error.h"
 #include "src/util/error_strings.h"
@@ -127,16 +130,53 @@ static bool check_exist(char *path)
     return false;
 }
 
+static void print_error(unsigned major,
+                        unsigned minor,
+                        unsigned release)
+{
+    fprintf(stderr, "************************************************\n");
+    fprintf(stderr, "We have detected that the runtime version\n");
+    fprintf(stderr, "of the PMIx library we were given is binary\n");
+    fprintf(stderr, "incompatible with the version we were built against:\n\n");
+    fprintf(stderr, " Runtime: 0x%x%02x%02x\n", major, minor, release);
+    fprintf(stderr, " Build: 0x%0x\n\n", PMIX_NUMERIC_VERSION);
+    fprintf(stderr, "Please update your LD_LIBRARY_PATH to point\n");
+    fprintf(stderr, "us to the same PMIx version used to build PRRTE.\n");
+    fprintf(stderr, "************************************************\n");
+}
+
 int prte_init_minimum(void)
 {
     int ret;
     char *path = NULL;
+    const char *rvers;
+    char token[100];
+    unsigned int major, minor, release;
 
     if (min_initialized) {
         return PRTE_SUCCESS;
     }
     min_initialized = true;
 
+    /* check to see if the version of PMIx we were given in the
+     * library path matches the version we were built against.
+     * Because we are using PMIx internals, we cannot support
+     * cross version operations from inside of PRRTE.
+     */
+    rvers = PMIx_Get_version();
+    ret = sscanf(rvers, "%s %u.%u.%u", token, &major, &minor, &release);
+
+    /* check the version triplet - we know that version
+     * 5 and above are not runtime compatible with version
+     * 4 and below. Since PRRTE has a minimum PMIx requirement
+     * in the v4.x series, we only need to check v4 vs 5
+     * and above */
+    if ((PMIX_VERSION_MAJOR > 4 && 4 == major) ||
+        (PMIX_VERSION_MAJOR == 4 && 5 <= major)) {
+        print_error(major, minor, release);
+        return PRTE_ERR_SILENT;
+    }
+
     /* carry across the toolname */
     pmix_tool_basename = prte_tool_basename;
