The problem was first seen in dynamic process management, but it also extends to other communicator creation functions that have no clear spanning parent communicator. In these operations, validating the correct creation of the new communicator is difficult, because there is no readily available communicator on which to AGREE about the success of the operation.
The problem also extends (to a lesser degree) to COMM_CREATE_GROUP, because the only way to validate the new communicator today is to AGREE on the entire parent communicator, which defeats the localized semantics of this operation (making it in essence the same as a SPLIT).
In the case of dynamic process management functions, we have solved this problem by adding supplementary semantics about the consistency of return codes between the roots in CONNECT/ACCEPT and about the capability of communicating to/from the root after SPAWN. This permits agreeing between the two separate "worlds" by communicating directly on the newly created intercomm.
The remaining problem is that both disconnected worlds have to know a priori who the root is on the other side, and that is difficult to obtain from MPI constructs: it could appear in the port name (or not; this is implementation dependent), be passed as a command-line parameter in spawn, be implied by the program code, etc. Overall, this is not very satisfactory.
This issue proposes to think about a better way to validate communicator creation, so that all these cases work without a root, in a scalable way for group creations, and possibly lead to simpler user code for normal intracomm creations.
The rough pitch of a possible solution is to create all communicators as "auto-revoke on failure". Users would have a way (a local operation) to unset the autorevoke flag after they have validated the communicator. As long as the communicator remains autorevoke, communications may be more expensive, but they are expected never to deadlock, even when the communicator has not been properly created due to a process failure. Once the user has verified that the communicator is indeed in a good state, they can remove the autorevoke feature and benefit from faster communications.
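To make the intended flag semantics concrete, here is a minimal toy model in plain C. It is not MPI code and does not reflect any existing API: `toy_comm`, `toy_comm_create`, `toy_comm_clear_autorevoke`, `toy_comm_on_failure`, and `toy_comm_communicate` are hypothetical names invented for this sketch, and the model only tracks one process's local view of the communicator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of a communication attempt in this toy model. */
typedef enum { OP_OK = 0, OP_ERR_REVOKED = 1 } op_status;

/* One process's local view of a communicator (illustrative only). */
typedef struct {
    bool autorevoke; /* set at creation under the proposal */
    bool revoked;    /* once set, operations error out instead of blocking */
} toy_comm;

/* Every communicator starts in "auto-revoke on failure" mode. */
toy_comm toy_comm_create(void) {
    toy_comm c = { .autorevoke = true, .revoked = false };
    return c;
}

/* Local operation: the user has validated the communicator. */
void toy_comm_clear_autorevoke(toy_comm *c) {
    assert(c != NULL);
    c->autorevoke = false;
}

/* A process failure is noticed on this communicator. While the flag
 * is set, the communicator is revoked; once cleared, the failure is
 * left to ordinary error reporting, which this model omits. */
void toy_comm_on_failure(toy_comm *c) {
    assert(c != NULL);
    if (c->autorevoke) {
        c->revoked = true;
    }
}

/* A communication attempt: fails fast on a revoked communicator. */
op_status toy_comm_communicate(const toy_comm *c) {
    assert(c != NULL);
    return c->revoked ? OP_ERR_REVOKED : OP_OK;
}
```

In this model, a failure that hits a communicator still in autorevoke mode turns every later operation into an immediate error rather than a potential deadlock, while a validated communicator is no longer revoked automatically.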
We have eventually come to the conclusion that we will require dynamic communicator creation to perform an agreement and return uniformly. This is more expensive, but the operation is already very expensive, so the added cost might not make much difference.
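What "perform an agreement and return uniformly" means can be illustrated with a small toy model in plain C. This is a sketch under stated assumptions, not MPI code: `toy_agree` and `toy_uniform_return` are hypothetical names, the array stands in for the local outcome seen by each participating process, and combining the outcomes with a logical AND is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Combine per-process outcomes (true = local creation succeeded).
 * The agreed value is true only if every participant succeeded. */
bool toy_agree(const bool *local_ok, size_t n) {
    assert(local_ok != NULL || n == 0);
    bool all_ok = true;
    for (size_t i = 0; i < n; i++) {
        all_ok = all_ok && local_ok[i];
    }
    return all_ok;
}

/* Each participant replaces its local outcome with the agreed one,
 * so all participants report the same result to the caller. */
void toy_uniform_return(bool *local_ok, size_t n) {
    bool agreed = toy_agree(local_ok, n);
    for (size_t i = 0; i < n; i++) {
        local_ok[i] = agreed;
    }
}
```

If any participant saw the creation fail, every participant reports failure, so no process is left holding a communicator that its peers consider broken.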
History: https://bitbucket.org/bosilca/mpi3ft/issues/15/no-reliance-on-roots-to-agree-on-intercomm