diff --git a/P0009/wg21/data/index.yaml b/P0009/wg21/data/index.yaml index f321068a..642f06aa 100644 --- a/P0009/wg21/data/index.yaml +++ b/P0009/wg21/data/index.yaml @@ -83181,6 +83181,14 @@ references: issued: year: 2019 URL: https://wg21.link/p0009r9 + - id: P0009R18 + citation-label: P0009R18 + title: "mdspan" + author: + - family: Christian Trott, D.S. Hollman, Damien Lebrun-Grandie, Mark Hoemmen, Daniel Sunderland, H. Carter Edwards, Bryce Adelstein Lelbach, Mauro Bianco, Ben Sander, Athanasios Iliopoulos, John Michopoulos, Nevin Liber + issued: + year: 2022 + URL: https://wg21.link/p0009r18 - id: P0010R0 citation-label: P0010R0 title: "Adding a subsection for concurrent random number generation in C++17" @@ -104753,6 +104761,14 @@ references: issued: year: 2019 URL: https://wg21.link/p1673r1 + - id: P1673R13 + citation-label: P1673R13 + title: "A free function linear algebra interface based on the BLAS" + author: + - family: Mark Hoemmen, Daisy Hollman, Christian Trott, Daniel Sunderland, Nevin Liber, Alicia Klinvex, Li-Ta Lo, Damien Lebrun-Grandie, Graham Lopez, Peter Caday, Sarah Knepper, Piotr Luszczek, Timothy Costa + issued: + year: 2023 + URL: https://wg21.link/p1673r1 - id: P1674R0 citation-label: P1674R0 title: "Evolving a Standard C++ Linear Algebra Library from the BLAS" @@ -104889,6 +104905,14 @@ references: issued: year: 2019 URL: https://wg21.link/p1684r0 + - id: P1684R5 + citation-label: P1684R5 + title: "mdarray: An Owning Multidimensional Array Analog of mdspan" + author: + - family: Christian Trott, Daisy Hollman, Mark Hoemmen, Daniel Sunderland, Damien Lebrun-Grandie + issued: + year: 2023 + URL: https://wg21.link/p1684r5 - id: P1685R0 citation-label: P1685R0 title: "Make get/set_default_resource replaceable" @@ -107369,6 +107393,14 @@ references: issued: year: 2019 URL: https://wg21.link/p1999r0 + - id: P2630R4 + citation-label: P2630R4 + title: "Submdspan" + author: + - family: Christian Trott, Damien Lebrun-Grandie, Mark Hoemmen, Nevin 
Liber + issued: + year: 2023 + URL: https://wg21.link/p2630r4 - id: P3141 citation-label: P3141 title: "std::terminates()" diff --git a/mdspan_copy/mdspan_copy.html b/mdspan_copy/mdspan_copy.html new file mode 100644 index 00000000..c58b5ea1 --- /dev/null +++ b/mdspan_copy/mdspan_copy.html @@ -0,0 +1,668 @@ + + + + + + + + Copy and fill for mdspan + + + + + + + + +
+
+

Copy and fill for +mdspan

+ + + + + + + + + + + + + + + + + + +
Document #:
Date: 2024-03-22
Project: Programming Language C++
+
Reply-to: + Nicolas Morales
<>
+ Christian Trott
<>
+ Mark Hoemmen
<>
+ Damien Lebrun-Grandie
<>
+
+ +
+
+ +

1 Motivation

+

C++23 introduced mdspan ([P0009R18]), a nonowning multidimensional
+array abstraction that has a customizable layout. Layout customization
+was originally motivated in [P0009R18] by considerations of
+interoperability and performance, particularly on different
+architectures. Moreover, [P2630R4] introduced
+submdspan, a slicing function that can yield arbitrarily
+strided layouts. However, without standard library support, copying
+efficiently between mdspans with a mix of complex layouts is challenging
+for users.

+

Many applications that benefit from mdspan, including
+high-performance computing (HPC), image processing, and computer
+graphics, would also benefit from the basic memory operations provided
+by standard algorithms such as copy and fill. Indeed, the authors found
+that a copy algorithm would have been quite useful in their
+implementation of the copying mdarray ([P1684R5]) constructor. A more
+constrained form of copy is also included in the standard
+linear algebra library ([P1673R13]).

+

However, existing standard library facilities are not sufficient
+here. Currently, mdspan does not have iterators or ranges
+that represent the span of the mdspan, and it is not
+entirely clear what such iterators or ranges would entail. In addition,
+std::linalg::copy ([P1673R13]) is limited to
+mdspans of rank 2 or lower.

+

Moreover, the manner in which an mdspan is copied (or
+filled) is highly performance sensitive, particularly with regard to
+caching behavior when traversing mdspan memory. A naïve user
+implementation is easy to get wrong and tedious to write for
+higher-rank mdspans. Ideally, an implementation would be
+free to use information about the layout of the mdspan
+known at compile time to perform optimizations; e.g., a copy between
+contiguous mdspans of trivial types could be implemented with a
+memcpy.

+

Finally, providing these generic algorithms would also enable these
+operations for types that are representable by mdspan. This
+naturally includes mdarray, which is
+convertible to mdspan, as well as user-defined types whose view
+of memory corresponds to an mdspan (e.g., an image class or
+something similar).

+

1.1 Safety

+

Due to the closed nature of mdspan extents, copy +operations can be checked by the implementation to prevent past-the-end +writes. This is an advantage the proposed copy operation has over the +existing operations in the standard.

+

2 Design

+

The main design direction of this proposal is to provide methods for +copying and filling mdspans that may have differing layouts +and accessors, while allowing implementations to provide efficient +implementations for special cases. For example, if a copy occurs between +two mdspans with the same layout mapping type that is +contiguous and both use default_accessor, the intention is +that this could be implemented by a single memcpy.

+

Furthermore, accessors as a customization point should be enabled, as
+with any other mdspan operation. For example, a custom
+accessor that checks a condition inside of the access
+method should still work and check that condition. It is worth noting
+that how much an implementation is able to optimize may be highly
+sensitive to the use of custom accessors. For example, optimizations could
+be disabled when using a custom accessor, even one that is identical to
+the default accessor.

+

Finally, there is some question as to whether copy and
+fill should return a value when applied to
+mdspan, as the iterator and range-based algorithms do. We
+believe that mdspan copy and fill should return void, as
+there is no past-the-end iterator that they could reasonably return.

+

2.1 Header

+

Currently, we are proposing adding copy and +fill algorithms on mdspan to header +<mdspan>. We considered other options, namely:

+ +

We settled on <mdspan> because as proposed this is +a relatively light-weight addition that reflects operations that are +commonly desired with mdspans. However, the authors are +open to changing this.

+

2.2 Existing copy in +std::linalg

+

[P1673R13] introduced several linear
+algebra operations including std::linalg::copy. This
+operation only applies to mdspans with rank ≤ 2.
+This paper proposes a version of copy whose constraints
+admit a superset of the arguments accepted by std::linalg::copy.

+

As things stand, simply adding copy could potentially
+make the following code ambiguous, because ADL would also find
+std::copy:

+
using std::linalg::copy;
+copy(mds1, mds2);
+

One possibility would be to remove std::linalg::copy, as +it is a subset of the proposed std::copy, though as of now +this paper does not propose to do this.

+

2.3 What the proposal does not +include

+ +

3 Wording

+
template<class SrcElementType, class SrcExtents, class SrcLayoutPolicy, class SrcAccessorPolicy,
+         class DstElementType, class DstExtents, class DstLayoutPolicy, class DstAccessorPolicy>
+void copy(mdspan<SrcElementType, SrcExtents, SrcLayoutPolicy, SrcAccessorPolicy> src, 
+          mdspan<DstElementType, DstExtents, DstLayoutPolicy, DstAccessorPolicy> dst);
+
+template<class ExecutionPolicy, class SrcElementType, class SrcExtents, class SrcLayoutPolicy, class SrcAccessorPolicy,
+         class DstElementType, class DstExtents, class DstLayoutPolicy, class DstAccessorPolicy>
+void copy(ExecutionPolicy&& policy, mdspan<SrcElementType, SrcExtents, SrcLayoutPolicy, SrcAccessorPolicy> src,
+          mdspan<DstElementType, DstExtents, DstLayoutPolicy, DstAccessorPolicy> dst);
+

1 +Constraints:

+ +

2 +Preconditions:

+ +

3 +Effects: for all unique multidimensional indices +i... in src.extents(), assigns +src[i...] to dst[i...]

+
template<class ElementType, class Extents, class LayoutPolicy, class AccessorPolicy, class T>
+void fill(mdspan<ElementType, Extents, LayoutPolicy, AccessorPolicy> dst, const T& value);
+
+template<class ExecutionPolicy, class ElementType, class Extents, class LayoutPolicy, class AccessorPolicy, class T>
+void fill(ExecutionPolicy&& policy, mdspan<ElementType, Extents, LayoutPolicy, AccessorPolicy> dst, const T& value);
+

4
+Constraints:
+std::is_assignable_v<typename mdspan<ElementType, Extents, LayoutPolicy, AccessorPolicy>::reference, const T&>

+

5 +Preconditions: dst.is_unique()

+

6 +Effects: for all unique multidimensional indices +i... in dst.extents(), assigns +value to dst[i...]

+

4 References

+
+
+
[P0009R18]
Christian Trott, D.S. Hollman, Damien +Lebrun-Grandie, Mark Hoemmen, Daniel Sunderland, H. Carter Edwards, +Bryce Adelstein Lelbach, Mauro Bianco, Ben Sander, Athanasios +Iliopoulos, John Michopoulos, Nevin Liber. 2022. mdspan.
https://wg21.link/p0009r18
+
+
+
[P1673R13]
Mark Hoemmen, Daisy Hollman, Christian Trott, +Daniel Sunderland, Nevin Liber, Alicia Klinvex, Li-Ta Lo, Damien +Lebrun-Grandie, Graham Lopez, Peter Caday, Sarah Knepper, Piotr +Luszczek, Timothy Costa. 2023. A free function linear algebra interface +based on the BLAS.
https://wg21.link/p1673r13
+
+
+
[P1684R5]
Christian Trott, Daisy Hollman, Mark Hoemmen, +Daniel Sunderland, Damien Lebrun-Grandie. 2023. mdarray: An Owning +Multidimensional Array Analog of mdspan.
https://wg21.link/p1684r5
+
+
+
[P2630R4]
Christian Trott, Damien Lebrun-Grandie, Mark +Hoemmen, Nevin Liber. 2023. Submdspan.
https://wg21.link/p2630r4
+
+
+
+
+ + diff --git a/mdspan_copy/mdspan_copy.md b/mdspan_copy/mdspan_copy.md index a9740744..b89d289c 100644 --- a/mdspan_copy/mdspan_copy.md +++ b/mdspan_copy/mdspan_copy.md @@ -1,15 +1,24 @@ --- title: "Copy and fill for `mdspan`" date: today +author: + - name: Nicolas Morales + email: + - name: Christian Trott + email: + - name: Mark Hoemmen + email: + - name: Damien Lebrun-Grandie + email: --- # Motivation -C++23 introduced `mdspan` ([@P0009R18]), a nonowning multidmensional array abstraction that has a customizable layout. Layout customization was originally motivated in [@P0009R18] with considerations for interoperability and performance, particularly on different architectures. Moreover, [@P2630R4] introduced `submdspan`, a slicing function that can yield arbitrarily strided layouts. Without standard library support, copying efficiently between mdspans with mixes of complex layouts is challenging for users. +C++23 introduced `mdspan` ([@P0009R18]), a nonowning multidmensional array abstraction that has a customizable layout. Layout customization was originally motivated in [@P0009R18] with considerations for interoperability and performance, particularly on different architectures. Moreover, [@P2630R4] introduced `submdspan`, a slicing function that can yield arbitrarily strided layouts. However, without standard library support, copying efficiently between mdspans with mixes of complex layouts is challenging for users. -Many applications, including high-performance computing (HPC), image processing, computer graphics, etc that benefit from `mdspan` also would benefit from basic memory operations provided in standard algorithms such as copy and fill. Indeed, the authors found that a copy algorithm would have been quite useful in their implementation of the copying `mdarray` ([@P1684]) constructor. 
+Many applications, including high-performance computing (HPC), image processing, computer graphics, etc that benefit from `mdspan` also would benefit from basic memory operations provided in standard algorithms such as copy and fill. Indeed, the authors found that a copy algorithm would have been quite useful in their implementation of the copying `mdarray` ([@P1684R5]) constructor. A more constrained form of `copy` is also included in the standard linear algebra library ([@P1673R13]). -However, existing standard library facilities are not sufficient here. Currently, `mdspan` does not have iterators or ranges that represent the span of the `mdspan`. Additionally, it's not entirely clear what this would entail. +However, existing standard library facilities are not sufficient here. Currently, `mdspan` does not have iterators or ranges that represent the span of the `mdspan`. Additionally, it's not entirely clear what this would entail. `std::linalg::copy` ([@P1673R13]) is limited to `mdspans` of rank 2 or lower. Moreover, the manner in which an `mdspan` is copied (or filled) is highly performance sensitive, particularly in regards to caching behavior when traversing mdspan memory. A naïve user implementation is easy to get wrong in addition to being tedious for higher rank `mdspan`s. Ideally, an implementation would be free to use information about the layout of the `mdspan` known at compile time to perform optimizations; e.g. a continuous span `mdspan` copy for trivial types could be implementeed with a `memcpy`. @@ -27,11 +36,33 @@ Furthermore, accessors as a customization point should be enabled, as with any o Finally, there is some question as to whether `copy` and `fill` should return a value when applied to `mdspan`, as the iterator and ranged-based algorithms do. We believe that `mdspan` copy and fill should return void, as there is no past-the-end iterator that they could reasonably return. 
+## Header + +Currently, we are proposing adding `copy` and `fill` algorithms on `mdspan` to header ``. We considered other options, namely: + +* ``: This would mean that users of iterator-based algorithms would need to pull in ``. On the other hand, this is where iterator-based `copy` and `fill` live so may be preferable in that sense. +* `` (or similarly any other new header): This seems like overkill for two functions. However, in the future, we may want to add new algorithms for `mdspan` that are not strictly covered by existing algorithms in ``, so this option may be more future proof. + +We settled on `` because as proposed this is a relatively light-weight addition that reflects operations that are commonly desired with `mdspan`s. However, the authors are open to changing this. + +## Existing `copy` in `std::linalg` + +[@P1673R13] introduced several linear algebra operations including `std::linalg::copy`. This operation only applies to `mdspan`s with $rank \le 2$. This paper is proposing a version of `copy` that is constrained to a superset of `std::linalg::copy`. + +Right now the strict addition of `copy` would potentially cause the following code to be ambiguous, due to ADL-finding `std::copy`: + +```c++ +using std::linalg::copy; +copy(mds1, mds2); +``` + +One possibility would be to remove `std::linalg::copy`, as it is a subset of the proposed `std::copy`, though as of now this paper does not propose to do this. + ## What the proposal does not include * `std::move`: Perhaps this should be included for completeness's sake. However, it doesn't seem applicable to the typical usage of `mdspan`. * `(copy|fill)_n`: As a multidimensional view `mdspan` does not in general follow a specific ordering. Memory ordering may not be obvious to calling code, so it's not even clear how these would work. Any applications intending to copy a subset of `mdspan` should use call `copy` on the result of `submdspan`. -* `copy_backward`: As above, there is no specific ordering. 
A similar effect could be achieved via transformations with a custom layout, similar to `layout_transpose` in [@P1673]. +* `copy_backward`: As above, there is no specific ordering. A similar effect could be achieved via transformations with a custom layout, similar to `layout_transpose` in [@P1673R13]. * Other algorithms, include `std::for_each`. `for_each` in particular is a desirable but brings in many unanswered questions that should be addressed in a different paper. # Wording @@ -39,11 +70,13 @@ Finally, there is some question as to whether `copy` and `fill` should return a ```c++ template -void copy(mdspan src, mdspan dst); +void copy(mdspan src, + mdspan dst); template -void copy(ExecutionPolicy&& policy, mdspan src, mdspan dst); +void copy(ExecutionPolicy&& policy, mdspan src, + mdspan dst); ``` [1]{.pnum} *Constraints:*