-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Nonblocking Collectives #456
base: main
Are you sure you want to change the base?
Conversation
2683ab3
to
a77c345
Compare
Like data exchange semantics, the entry and completion | ||
criteria of blocking and nonblocking alltoall is similar. | ||
|
||
{\bf Entry criteria}: Before any \ac{PE} calls a \FUNC{shmem\_alltoall\_nb} routine, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This entry criteria is confusing - if we have the same entry criteria as blocking operations - then the users are supposed to make sure the src/dst buffers are available at the point of entry - meaning the users are expected to sync with themselves before launching the nb collective operation.
AFAIU - the src/dst needs to be available when everyone in the team reaches the state - which can be determined during runtime - without the need for explicit sync before launching the nb collective operation.
Please clarify the intended semantics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naveen, my intent is to keep it consistent with the semantics of blocking collectives. I wanted to improve the description. I don’t see how the current wording is violating that. If the wording doesn’t reflect that, then we can fix it.
Feedback from Jim (May Spec Meeting): Should we consider shmem_local_complete ? |
\item Collective Types: The nonblocking variants supported include barrier, alltoall, | ||
and broadcast collectives. Other collective operations such as | ||
reductions, collect, barrier, alltoalls, and sync will not have nonblocking variants. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should sync_all
be supported? My hunch is it seems possibly useful...
Feedback from Feb. 27, 2024 discussion:
|
content/shmem_collective_wait.tex
Outdated
threads but on different request objects. It is the responsibility of the | ||
\openshmem user to ensure that proper synchronization is used to prevent race | ||
conditions or deadlock. Specifically, the \FUNC{shmem\_req\_wait} operation should | ||
be called after the collective operation to ensure that the request object is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the requests are always deallocated in wait
or test
call, and if request handles are not reused, it will not run into the scenario of "premature deallocation".
content/shmem_collective_test.tex
Outdated
\apidescription{ | ||
A call to \FUNC{shmem\_req\_test} returns immediately. If the | ||
operation identified by the request is completed, it returns | ||
zero, and the request object is deallocated and cannot be reused. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could request objects have value SHMEM_REQUEST_INVALID
after they are deallocated? In which case this routine and shmem_req_wait
would take an argument of type shmem_req_h *
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davidozog Just updated. Please let me know if it's missing something.
Notes from today's discussion -- 06-27-2024 -- NBI Collectives pSync Management.pptx |
Co-authored-by: James Dinan <[email protected]>
Co-authored-by: James Dinan <[email protected]>
Co-authored-by: James Dinan <[email protected]>
Co-authored-by: James Dinan <[email protected]>
Co-authored-by: James Dinan <[email protected]>
80d70be
to
e40b263
Compare
Add SHMEM_REQ_INVALID library constant Address WG Feedback Clarify collective start/execution Remove barrier_all_nb Address Feedback Minor update to nb_collective_intro.tex Clarify completion of broadcast_nb
\item Completion semantics: \openshmem programs can learn the status of the collective operations | ||
using the \FUNC{shmem\_req\_test} routine. The operation is completed after | ||
a call to \FUNC{shmem\_req\_test} or a call to \FUNC{shmem\_req\_wait}. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to what @kwaters4 mentioned. Add something like "Completion of the operation can be observed through one or more calls to \FUNC{shmem_req_test} or a single call to \FUNC{shmem_req_wait}."
@@ -0,0 +1,123 @@ | |||
\apisummary{ | |||
Exchanges a fixed amount of contiguous data blocks between all pairs | |||
of \acp{PE} participating in the collective routine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we mention later that "All PEs in the provided team must participate in the collective.", can we just say that here, instead of keeping it in the abstract?
i.e. "Exchanges a fixed amount of contiguous data blocks between all PEs on the specified team."
\item The \source{} data object may be safely reused. | ||
\item The \dest{} data object is updated. | ||
\item If the local \ac{PE} is \VAR{PE\_root}, the data has been copied | ||
out of the \source{} data object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this change to a separate PR and ask section committee to add it. @davidozog
This proposal was brought forward for a vote on August 23, 2024 and the ballot did not pass. The proposal will need to be re-read before it can be voted on again. |
@@ -0,0 +1,38 @@ | |||
\apisummary{ | |||
The routine waits until a operation identified by a request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "an operation" instead of "a operation"
Capturing feedback from specification meeting:
|
No description provided.