Virtual id refactoring #356

aeblyve · 2023-09-01T20:00:22Z

Intro

This is a branch of the dev/gc00/improve-virtualids code intended for longer-term maintainability, review, and integration into main MANA. It is also intended to be the base for OpenMPI, ExaMPI, etc. development.

What's different from dev/gc00/improve-virtualids?

I integrated into this branch a critical bugfix in REMOVE_OLD that we had applied in our local {Exa,Open}MPI development branches on NEU Discovery. This bug did not manifest with MPICH, but it made it impossible to free anything with OpenMPI or ExaMPI. (OpenMPI still needs to run with TCP enabled to prevent the rdma_frag assertion error.)
The commit count is much more reasonable. I sincerely apologize for the inconvenience of wrestling with literally hundreds of commits and will take care not to do this again -- I was taught by others in the past that commits should be used as though they are free and are in fact a reasonable way to save changes when in a development branch. This is not true if coherent history under rebasing is a goal.
ggids are no longer granted unconditionally. Granting them unconditionally caused problems when we created an internal communicator and it was granted a ggid (PRESUSPEND bug). I had previously solved the problem with a hack, get_vcomm_internal; this is a better solution because the semantics match the old style. See grant_ggid and its usages.
dev/gc00/improve-virtualids had slightly different ADD_NEW semantics than main, because the equivalent of realIdExists(real_id) would be hacky to write. I've written up how this realIdExists functionality might look, but without a two-level table it performs very badly for big apps, because we cannot iterate through only the descriptors of a single type. Please look at this part, @xuyao0127. If we cannot develop and perfect a two-level table like this by camera-ready for SC23, we could revert to the old behavior. I don't think it caused problems, but it is different. Alternatively, as a workaround, we could maintain several maps for the different types as before (while still doing away with string comparison and the template class).
I reworked and reformatted some commentary. FIXME annotates points of concern.
I removed some documentation by @rajatpratapbisht, because it should be added in a dedicated PR (I can add this later).

Otherwise, the branches are the same.

Please read over the FIXME comments I wrote while reviewing; they contain points of interest and important questions/concerns.
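To make the two-level table idea from the list above concrete, here is a minimal sketch. It is not the code in this branch: `Kind`, `Desc`, `TwoLevelTable`, and `real_id_exists` are hypothetical names, and `std::intptr_t` stands in for a real MPI handle. The first level is indexed by object type and the second by virtual id, so a realIdExists-style scan only walks descriptors of one type.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical type tags; KIND_COUNT is the number of first-level slots.
enum Kind { KIND_COMM, KIND_GROUP, KIND_OP, KIND_DATATYPE, KIND_COUNT };

// Hypothetical descriptor; intptr_t stands in for a real MPI handle.
struct Desc {
  std::intptr_t real_id;
};

class TwoLevelTable {
 public:
  // Register a descriptor under its type (level 1) and virtual id (level 2).
  void add(Kind k, int vid, std::intptr_t real) {
    assert(k >= 0 && k < KIND_COUNT);
    levels_[k][vid] = Desc{real};
  }

  // Scan only the descriptors of type k, not the whole table.
  bool real_id_exists(Kind k, std::intptr_t real) const {
    assert(k >= 0 && k < KIND_COUNT);
    for (const auto& entry : levels_[k]) {
      if (entry.second.real_id == real) return true;
    }
    return false;
  }

  // Number of live descriptors of type k.
  std::size_t count(Kind k) const { return levels_[k].size(); }

 private:
  std::array<std::unordered_map<int, Desc>, KIND_COUNT> levels_;
};
```

With this layout, the cost of the existence check grows with the number of live objects of one type rather than with the total number of descriptors, which is the property the flat table lacks.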

Contrib

Unfortunately, this is a pretty big PR. I tried to organize the commits so it is clear which commit belongs to which phase.

This PR does three main things:

1. Replace the old virtualid machinery

The old machinery (a map from ints to ints, written as a C++ template and using dmtcp's virtualidtable.h) is replaced with a mapping from int to struct pointer, where each struct holds both the real ID for a given virtual ID and any required metadata. I tried to write this part in a C style. The ugliest part has to be the macros, but I thought this was the cleanest way to write the code that depends on the particular MPI types, because they are not defined the same way even within the same implementation. For instance, MPI_File is a pointer even in MPICH. It also integrates well with the existing macros. As @xuyao0127 says, the definition and usage of these macros are kind of strange and obtuse, or at least poorly documented (VIRTUAL_TO_REAL(real) == real). However, changing their behavior would mean changing them literally everywhere, making this PR even bigger. It could be a later project.

The struct pointer architecture enables objectives 2 and 3 of the PR. It is also the part that enables portability between MPI implementations, with the exception of one FIXME comment in init_comm_world. @xuyao0127, @JainTwinkle, I would appreciate your feedback/work here. I think it would be great if the MANA codebase were truly 100% the same across ExaMPI, OpenMPI, and MPICH. To get there, we need to integrate the lh_constants_map into every version.

If we only want to demonstrate the capability of checkpoint-restart with multiple MPI implementations (without also improving performance in points 2 and 3 and really flexing struct VIDs), this is all we have to test and enhance: just the first commit here.
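As a rough illustration of the int-to-struct-pointer mapping described in this section, the sketch below keeps one descriptor per virtual id. It is not the code in this branch (which is C-style and macro-based): `VidTable`, `Descriptor`, `DescKind`, and `RealHandle` are hypothetical names, and `std::intptr_t` is an assumed stand-in wide enough for either an int handle or a pointer handle.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>

// Assumed stand-in for a real MPI handle (int in some MPIs, pointer in others).
using RealHandle = std::intptr_t;

// Hypothetical type tag for the object a descriptor refers to.
enum class DescKind { Comm, Group, Op, Datatype };

// One descriptor per virtual id: the real id plus metadata.
struct Descriptor {
  DescKind kind;
  RealHandle real_id;
};

class VidTable {
 public:
  // Allocate a fresh virtual id for a newly created real object.
  int add(DescKind kind, RealHandle real) {
    int vid = next_vid_++;
    assert(table_.count(vid) == 0);  // fresh vids never collide
    table_[vid] = std::unique_ptr<Descriptor>(new Descriptor{kind, real});
    return vid;
  }

  // One lookup returns the real id and the metadata together.
  Descriptor* lookup(int vid) const {
    auto it = table_.find(vid);
    return it == table_.end() ? nullptr : it->second.get();
  }

  // After restart the real id changes while the virtual id stays fixed.
  bool update_real(int vid, RealHandle real) {
    Descriptor* d = lookup(vid);
    if (d == nullptr) return false;
    d->real_id = real;
    return true;
  }

  // Drop the descriptor when the real object is freed.
  bool remove(int vid) { return table_.erase(vid) > 0; }

 private:
  std::map<int, std::unique_ptr<Descriptor>> table_;
  int next_vid_ = 1;
};
```

The application only ever holds the virtual id, so a restart can swap in a new real handle through `update_real` without the application noticing.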

2. Integrate the old ggid machinery into the system in point 1.

This means that we don't have to look up the same virtual id multiple times. An optimization.
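A minimal sketch of that optimization, assuming the ggid is stored next to the real communicator in the same descriptor. The names (`CommDesc`, `CommTable`, `hash_ranks`) are hypothetical, and the hash over member world ranks is only a placeholder for however MANA actually computes a ggid. The `grant` flag mirrors the idea that internal communicators are not granted a ggid.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical communicator descriptor: real handle and ggid side by side.
struct CommDesc {
  std::intptr_t real_comm;  // stand-in for a real MPI_Comm
  bool has_ggid;            // false for internal communicators
  std::uint32_t ggid;       // only meaningful when has_ggid is true
};

// Placeholder ggid: FNV-1a-style mixing of the member world ranks.
std::uint32_t hash_ranks(const std::vector<int>& world_ranks) {
  std::uint32_t h = 2166136261u;
  for (int r : world_ranks) {
    h ^= static_cast<std::uint32_t>(r);
    h *= 16777619u;
  }
  return h;
}

class CommTable {
 public:
  // Register a communicator; compute a ggid only when one is granted.
  int add(std::intptr_t real, const std::vector<int>& ranks, bool grant) {
    int vid = next_vid_++;
    assert(table_.count(vid) == 0);  // fresh vids never collide
    table_[vid] = CommDesc{real, grant, grant ? hash_ranks(ranks) : 0u};
    return vid;
  }

  // A single lookup yields both the real communicator and its ggid.
  const CommDesc* lookup(int vid) const {
    auto it = table_.find(vid);
    return it == table_.end() ? nullptr : &it->second;
  }

 private:
  std::unordered_map<int, CommDesc> table_;
  int next_vid_ = 1;
};
```

A wrapper that needs both values calls `lookup` once and reads two fields, instead of consulting a virtual-id table and a separate ggid table.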

3. Replace the log-replay architecture with a decode-recode architecture.

This works readily for communicators, groups, and operators, because they can all be constructed in one go when given the proper arguments. It is more difficult for datatypes, because datatypes cannot be "fully deserialized" in the same way. While operators also cannot be deserialized, this is not a problem: there is only one way to create a new operator, and operators are not recursively defined, so it is trivial to just hook into operator creation. Datatypes that are "doubly-derived", i.e., user datatypes built using other user datatypes, do not immediately break down into MPI primitives.

One way to solve this problem would be to maintain a dependency tree of datatypes: every wrapper that creates a datatype can record the virtual id and mark it as a dependency. Another way may be to implement our own full deserialization, but this is probably technically difficult. Yet another is to preserve the log-replay system for datatypes exclusively. In any case, the point of this part is to enable faster runtime (we don't need to record MPI calls) and faster restart (we don't have to replay MPI calls) at the cost of making checkpoint time slightly slower (we have to record the information for everything that's alive at checkpoint time).
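The dependency-tree option could look roughly like the sketch below. It assumes each datatype descriptor records the virtual ids of the datatypes it was built from, and it computes an order in which derived types can be rebuilt at restart so that every dependency exists first. `TypeDesc`, `visit`, and `rebuild_order` are hypothetical names, and any virtual id without an entry is treated as an MPI primitive that needs no reconstruction.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Hypothetical datatype descriptor: virtual ids this type was built from.
struct TypeDesc {
  std::vector<int> deps;
};

// Post-order walk: emit a type only after everything it depends on.
void visit(int vid, const std::map<int, TypeDesc>& types,
           std::set<int>& done, std::vector<int>& order) {
  if (done.count(vid) != 0) return;
  done.insert(vid);
  auto it = types.find(vid);
  if (it == types.end()) return;  // primitive type: nothing to rebuild
  for (int dep : it->second.deps) visit(dep, types, done, order);
  order.push_back(vid);
}

// Order in which the derived datatypes can be recreated at restart.
// Assumes the dependency graph is acyclic, since a datatype can only be
// built from datatypes that already exist.
std::vector<int> rebuild_order(const std::map<int, TypeDesc>& types) {
  std::set<int> done;
  std::vector<int> order;
  for (const auto& entry : types) visit(entry.first, types, done, order);
  assert(order.size() <= types.size());  // each derived type at most once
  return order;
}
```

A doubly-derived type then appears in the order after the user datatype it was built from, regardless of how the virtual ids happen to be numbered.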

Future

Where to go from here:

Resolve the FIXME (answer the questions and/or change the code)
Implement the two-level table
Integrate lh_constants_map
Rip out all of the LOG_CALL usages once we think decode-recode is good. (Right now, restoreMpiLogState is disabled, but logging itself is not.)
Whatever else the reviewers think is fit.

aeblyve · 2023-09-01T21:09:55Z

Could we trigger more involved automated testing?

gc00 · 2023-09-01T21:18:06Z

Yes, more involved unit testing would be nice.
Unfortunately, the academic prototype that we inherited already had plenty of design flaws. We have to finish fixing the design flaws before moving on to better unit tests.

aeblyve · 2023-09-01T21:21:27Z

Okay, then the other goal is to use this branch as a better base for ExaMPI and OpenMPI support. @xuyao0127 @JainTwinkle should add lh_constants_map to this branch and make other changes as needed.

aeblyve added 7 commits August 31, 2023 17:01

initial virtual-id refactoring

1d639b3

integrate ggid machinery into vid machinery

0274504

ggid machinery integration: FIXME commentary

c681911

fix logic bug: updated_comm -> updated_comm_ggid

13c9de0

record-replay -> decode-recode

99b29df

actually remove old descriptors when RID is freed

7ae5022

eliminate one lookup in seq_num_broadcast

abd7ef7

aeblyve requested review from JainTwinkle, gc00 and xuyao0127 September 1, 2023 20:00

aeblyve marked this pull request as ready for review September 1, 2023 20:12

aeblyve added 2 commits September 1, 2023 16:29

fixup descriptor deletion

39589d8

CRIPPLE unit-test Makefile, for now.

567dbd1

aeblyve force-pushed the virtualids-v2 branch from 64562f5 to 567dbd1 Compare September 18, 2023 23:49

aeblyve added 8 commits September 18, 2023 19:52

Remove all LOG_CALL

7310355

REMOVED record-replay functions

8836656

Added fortran conditional macros, for ExaMPI. But no makefile yet.

8550644

Simplify Type_vector wrapper

9aae6fe

Added MPI-naive lh_constants_map

bd55409

Added usage of REAL_CONSTANT as required

3275bc7

libproxy.c fixup

87b8d7c

mpi_constants fixup

eeeb8a8

aeblyve force-pushed the virtualids-v2 branch 2 times, most recently from b84a6b3 to eeeb8a8 Compare September 19, 2023 04:10

init_lh_constants_map in Init, Init_thread

0358def

aeblyve force-pushed the virtualids-v2 branch from e9a56c5 to 0358def Compare September 19, 2023 06:50

REAL_CONSTANT in vids

37815cf

aeblyve added 7 commits September 19, 2023 20:18

Programmatic per-MPI virtual-ids.cpp definition

252a07a

Initial makefile conditional per-MPI

f673093

Fix a type problem a-la MPI_File

8e19721

Improved conditional makefile

a8fc316

Conditional stub generation, fixed greps

35cf6d0

REVERT conditional mpi_stub_wrappers.

6ca98ce

Conditional fortran wrappers

2dba7c4

aeblyve force-pushed the virtualids-v2 branch from 6d384c4 to 2dba7c4 Compare September 21, 2023 02:28

aeblyve added 7 commits September 20, 2023 22:32

MPI-specific wrapper files

796f5c9

MPI-specific fortran wrappers

e8dfb6d

Remove extra tokens at endif directive

e417529

Fix trailing spaces

75f4f2e

Fix grep

7566a80

Fixup multiline

24a86b9

Change error to exit

4f1a368

aeblyve force-pushed the virtualids-v2 branch from 95229dd to 4f1a368 Compare September 21, 2023 04:14

aeblyve added 2 commits September 21, 2023 00:18

Fix fortran makefile

dd2d5ed

Conditional libproxy.h using header file

5aacf1f

aeblyve force-pushed the virtualids-v2 branch from 670cbd3 to 5aacf1f Compare September 21, 2023 04:34

aeblyve added 10 commits September 21, 2023 00:42

Include mpi.h

1ef684f

Conditional cart_wrappers

bcde8ea

Condition collective_wrappers

4a3254e

Conditional comm_wrappers

7166f15

Conditional op_wrappers

a840f18

Conditional p2p_wrappers

1d88699

Conditional request wrappers

8ce2cb2

Conditional type wrappers

9017493

Conditional general wrappers

5ef4a46

Conditional mpi_plugin

71caad2

gc00 force-pushed the main branch from 83520b9 to 445210c Compare September 23, 2023 23:17
