Threading Model by PhilMiller · Pull Request #388 · kokkos/kokkos-core-wiki

PhilMiller · 2023-05-24T00:33:31Z

Document what semantics we actually have around use of multiple threads calling Kokkos

The foundational principles I think we have are:

- We should build on the C++ memory model, since we're directly interacting with it at points (particularly, View::operator() from host, and equivalent memory access in buffers that we deep_copy to/from)
- An execution space instance is very similar to a C++ thread in terms of relevant semantics
- All the stuff Kokkos does can be decomposed in terms of a few fundamental operations that play key semantic roles
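
To make the "execution space instance is like a C++ thread" analogy concrete, here is a minimal sketch using only the C++ standard library. It is not Kokkos code: `ToyInstance`, `enqueue`, `fence`, and `run_three_ops` are hypothetical names invented for illustration. The assumption being modeled is that one instance behaves like one worker thread draining a FIFO queue, and that a fence makes the completed work visible to the caller.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical toy model, NOT Kokkos API: one worker thread drains a FIFO
// queue, so operations submitted to the same instance run in submission order.
class ToyInstance {
 public:
  ToyInstance() : worker_([this] { run(); }) {}

  ~ToyInstance() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_ = true;
    }
    wake_.notify_all();
    worker_.join();
  }

  // Stand-in for an asynchronous dispatch: returns without waiting.
  void enqueue(std::function<void()> op) {
    assert(op && "enqueue needs a callable operation");
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push_back(std::move(op));
      ++pending_;
    }
    wake_.notify_all();
  }

  // Stand-in for a local synchronization: blocks until no enqueued operation
  // remains unfinished. The mutex hand-off orders the worker's writes before
  // whatever the caller does after fence() returns.
  void fence() {
    std::unique_lock<std::mutex> lock(mutex_);
    idle_.wait(lock, [this] { return pending_ == 0; });
  }

 private:
  void run() {
    std::unique_lock<std::mutex> lock(mutex_);
    for (;;) {
      wake_.wait(lock, [this] { return stop_ || !queue_.empty(); });
      if (queue_.empty()) return;  // stop requested and nothing left to run
      std::function<void()> op = std::move(queue_.front());
      queue_.pop_front();
      lock.unlock();
      op();  // run the operation without holding the lock
      lock.lock();
      if (--pending_ == 0) idle_.notify_all();
    }
  }

  std::mutex mutex_;
  std::condition_variable wake_, idle_;
  std::deque<std::function<void()>> queue_;
  std::size_t pending_ = 0;
  bool stop_ = false;
  std::thread worker_;  // declared last so the other members exist before it starts
};

// Usage: three operations on one instance, then a fence before reading.
std::vector<int> run_three_ops() {
  std::vector<int> log;
  ToyInstance instance;
  for (int i = 0; i < 3; ++i) {
    instance.enqueue([&log, i] { log.push_back(i); });
  }
  instance.fence();  // after this, the calling thread may read `log`
  return log;
}
```

Whether real backends behave like this single-queue model is exactly what the rest of this thread is trying to pin down, so treat the sketch as a vocabulary aid rather than a description of any backend.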

PhilMiller · 2023-05-24T00:42:41Z

Tagged @msimberg because the HPX backend exposes fairly unique behaviors that I definitely don't understand, so it will probably require its own treatment here.

Tagged @keitaTN and @nmm0 because of their work (as I recall it) on describing an operational semantics for Kokkos

masterleinad

I think the most interesting question still to answer is whether we serialize kernels on all backends for the same execution space instance.

masterleinad · 2023-05-24T12:52:48Z

+
+A multi-threaded program structured such that there is a *happens-before* relationship between each call to perform a *Fundamental Operation* will behave equivalently to a single-threaded program that performs the same sequence of *Fundamental Operations*. (Note: This is analogous to ``MPI_THREAD_SERIALIZED``)
+
+.. Do we actually want to guarantee that every Fundamental Operation is serializing? Should that just mean that we don't require call sites to have *happens-before* relationships, or should they also internally create such *happens-before* relationships? I.e. that the calling threads *synchronize-with* each other at those points?


That's a key question. My understanding is that we want to serialize parallel dispatch to the same execution space instance, but I don't think we want to promise anything with respect to data access outside of kernels.
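
For readers less familiar with the `MPI_THREAD_SERIALIZED` analogy, here is a rough sketch of what "the caller provides the happens-before relationship" could look like in plain C++. Nothing here is Kokkos API: `ToyLibrary`, `fundamental_op`, and `serialized_calls` are made-up names, and the library stand-in is deliberately not thread-safe on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for library state that is NOT internally thread-safe.
struct ToyLibrary {
  std::vector<int> history;
  void fundamental_op(int id) { history.push_back(id); }  // hypothetical name
};

// Two threads call into the library, but never concurrently: the caller's
// mutex orders each call after the previous one (unlock happens-before the
// next lock), which is the discipline MPI_THREAD_SERIALIZED asks for.
std::size_t serialized_calls(int calls_per_thread) {
  assert(calls_per_thread >= 0);
  ToyLibrary lib;
  std::mutex gate;
  auto caller = [&](int id) {
    for (int i = 0; i < calls_per_thread; ++i) {
      std::lock_guard<std::mutex> lock(gate);
      lib.fundamental_op(id);
    }
  };
  std::thread a(caller, 1);
  std::thread b(caller, 2);
  a.join();
  b.join();
  return lib.history.size();  // safe to read: join() orders it after both threads
}
```

The interleaving of ids in `history` is not deterministic, but every call is totally ordered, so the result matches some single-threaded sequence of the same calls. The open question in the draft is whether the library should create that ordering itself instead of requiring the mutex.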

masterleinad · 2023-05-24T14:06:40Z

+
+*Global Synchronization* creates a *happens-before* relationship between the completion of every *Fundamental Operation* on any *Execution Space Instance* that *happens-before* the *Global Synchronization* and the thread that performs the *Global Synchronization*.
+
+.. Should the above actually be *synchronizes-with*?


Is there really much of a difference when we talk about fence?
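
For reference, the distinction in C++ memory-model terms: *synchronizes-with* relates one specific pair of operations (for example a release store and the acquire load that reads its value), while *happens-before* is the broader ordering derived from that pair combined with sequenced-before. A small standard-library sketch, with `fence_like_handoff` being a made-up name and the spin loop only a stand-in for whatever a real fence does:

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int fence_like_handoff() {
  int result = 0;                 // plain, non-atomic data
  std::atomic<bool> done{false};  // completion flag

  std::thread worker([&] {
    result = 42;                                  // sequenced-before the store
    done.store(true, std::memory_order_release);  // release store
  });

  // The acquire load that finally reads `true` synchronizes-with the release
  // store above. That single pairing is the synchronizes-with edge.
  while (!done.load(std::memory_order_acquire)) {
    std::this_thread::yield();
  }

  // The write to `result` happens-before this read: it is sequenced-before
  // the store, the store synchronizes-with the load, and the load is
  // sequenced-before this line.
  int seen = result;
  worker.join();
  assert(seen == 42);  // guaranteed by the happens-before chain above
  return seen;
}
```

Read this way, a fence would need a synchronizes-with edge somewhere in its implementation, but what callers rely on is the resulting happens-before ordering over all the work that completed.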

masterleinad · 2023-05-24T14:12:53Z

+
+* Managed Construction
+  Managed construction of a Kokkos View performs a *Memory Allocation*, potentially followed by a *Parallel Dispatch* to initialize the memory (depending on whether ``WithoutInitializing`` was passed), potentially followed by a *Synchronization* (if no execution space instance was passed, so that allocation and initialization *happen-before* any subsequent operation that may reference the ``View``'s memory).
+  .. Do we want that to be *Global Synchronization* or *Local Synchronization*?


We effectively do a device-wide (or at least execution space-wide) synchronization at the moment, see https://github.com/kokkos/kokkos/blob/5d81422daea73f5a2a69771cc0dfafc19f785003/core/src/Cuda/Kokkos_CudaSpace.cpp#L160-L205. The intent is to make sure that memory can't be accessed before allocation is complete and thus it should be (IMHO) enough to fence the active execution space instance on the current thread.
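
The allocate, optionally initialize, optionally fence sequence discussed here can be sketched with the standard library alone. This is an illustration under stated assumptions, not how Kokkos implements `View` construction: `ToyView` and `make_view` are hypothetical, `std::async` stands in for a parallel dispatch, and `fence_before_return` stands in for the "no execution space instance was passed" case.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <memory>

// Hypothetical stand-in for a managed view; not the Kokkos View type.
struct ToyView {
  std::unique_ptr<double[]> data;
  std::size_t size = 0;
  std::future<void> init_done;  // valid only if initialization was dispatched
};

ToyView make_view(std::size_t n, bool initialize, bool fence_before_return) {
  assert(n > 0);
  ToyView v;
  v.data.reset(new double[n]);  // step 1: memory allocation (uninitialized)
  v.size = n;
  if (initialize) {
    double* p = v.data.get();
    // Step 2: asynchronous initialization, standing in for a parallel dispatch.
    v.init_done = std::async(std::launch::async, [p, n] {
      for (std::size_t i = 0; i < n; ++i) p[i] = 0.0;
    });
    // Step 3: optional synchronization, so that allocation and initialization
    // happen-before anything the caller does with the memory.
    if (fence_before_return) v.init_done.wait();
  }
  return v;
}
```

If `fence_before_return` is false, the caller must wait on `init_done` (the analogue of fencing the instance it passed in) before touching `data`. Note that only the one pending initialization is waited on, which corresponds to the narrower, per-instance fence suggested above rather than a device-wide one.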

masterleinad · 2023-05-25T14:05:04Z

+* *Initialization*
+
+.. Not just Kokkos::init, but also whatever device-specific or thread-specific stuff we have Legion doing now
+
+* *Finalization*
+
+.. Ditto Initialization


Backends can still only be initialized or finalized once. I'm not quite sure if it's worth mentioning initialization/finalization then. At the very least, we need to clarify what we mean here (execution space instance initialization/finalization may be sensible).

masterleinad · 2023-05-25T14:05:36Z

+* *Data Access*
+  ``View::operator()``, to memory that is accessible from the host.
+


Not quite sure if we want to promise anything about data access outside of kernels.

PhilMiller · 2023-06-07

I think we have to, or else we can't suitably address either the use of unmanaged views or UVM.

masterleinad · 2023-05-25T14:07:29Z

+* Metadata Query
+* Element Access
+  Element Access performs a Data Access operation.


Not quite sure if we need these.

masterleinad · 2023-05-25T14:14:25Z

+Backend-Specific Details
+------------------------
+
+.. Local or Global synchronizations below?


It might be enough to group backends into synchronous and asynchronous backends, clarifying that kernels submitted by multiple threads are serialized (if we decide to make that promise).
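
As a sketch of what such a promise could mean for an asynchronous backend, the toy below lets several host threads submit kernels to one shared object and chains each kernel behind the previously submitted one. It is standard-library C++ only and does not describe any actual Kokkos backend: `AsyncToyBackend`, `dispatch`, `fence`, and `run_serialized` are hypothetical names.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical asynchronous backend: dispatch returns immediately, but each
// kernel waits for the previously submitted one, so kernels never overlap.
class AsyncToyBackend {
 public:
  void dispatch(std::function<void()> kernel) {
    assert(kernel && "dispatch needs a callable kernel");
    std::lock_guard<std::mutex> lock(mutex_);  // submission order = lock order
    std::shared_future<void> previous = last_;
    last_ = std::async(std::launch::async, [previous, kernel] {
              if (previous.valid()) previous.wait();  // serialize behind prior kernel
              kernel();
            }).share();
  }

  // Wait for the most recently submitted kernel, and therefore for all of them.
  void fence() {
    std::shared_future<void> last;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      last = last_;
    }
    if (last.valid()) last.wait();
  }

 private:
  std::mutex mutex_;
  std::shared_future<void> last_;
};

int run_serialized(int submitters, int kernels_each) {
  int counter = 0;  // plain int: race-free only because kernels are serialized
  AsyncToyBackend backend;
  std::vector<std::thread> threads;
  for (int t = 0; t < submitters; ++t) {
    threads.emplace_back([&backend, &counter, kernels_each] {
      for (int i = 0; i < kernels_each; ++i) {
        backend.dispatch([&counter] { ++counter; });
      }
    });
  }
  for (auto& th : threads) th.join();
  backend.fence();  // makes the final counter value visible to this thread
  return counter;
}
```

A synchronous backend would correspond to `dispatch` simply calling `kernel()` before returning, in which case the submitting threads themselves would need the ordering. Without the serialization promise, the plain `++counter` in this sketch would be a data race.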

masterleinad · 2023-05-25T14:17:17Z

+* ``CUDA`` and ``HIP``
+
+* ``HPX``
+


We should talk more about parallel dispatch and the behavior of independent threads (without a happens-before relationship between them) accessing the same data.
Possibly also clarifying where we promise that dispatch implies fences (linking to API for parallel_for, parallel_reduce, parallel_scan).

ajpowelsnl · 2024-04-01T21:10:45Z

TODO:

- Champions: @masterleinad, @ajpowelsnl
- Request reviews for:
  - #4385
  - #6051
  - #6151
- Discuss Kokkos "thread safety" concepts with team
- Decide where to document "thread safety" behavior, using this PR as a starting point

Very rough draft

1ff2d27

PhilMiller requested review from crtrott and dalg24 May 24, 2023 00:40

PhilMiller assigned msimberg and masterleinad and unassigned msimberg and masterleinad May 24, 2023

PhilMiller requested review from keitaTN, masterleinad, msimberg, nliber and nmm0 May 24, 2023 00:40

masterleinad reviewed May 25, 2023

cz4rs self-requested a review June 23, 2023 07:42

masterleinad mentioned this pull request Aug 7, 2023

Allow "parallel dispatch" to Serial execution space even within a parallel kernel kokkos/kokkos#223

Open

ajpowelsnl assigned ajpowelsnl and masterleinad Apr 1, 2024

PhilMiller wants to merge 1 commit into kokkos:main from PhilMiller:threading



