From 9922dd2cc1b3d5c0dd0937aff3732f53c2843fdc Mon Sep 17 00:00:00 2001
From: Jeff Hammond
Date: Tue, 10 May 2022 19:20:39 +0300
Subject: [PATCH 1/3] fix infintely

"infintely" is spelled wrong, but it is also wrong in meaning: there are
not infinitely many more possibilities for tensors, merely
combinatorially many more.
---
 D1673/P1673.bs          | 2 +-
 D1673/blas_interface.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/D1673/P1673.bs b/D1673/P1673.bs
index 8c98c9ee..c0aa2cea 100644
--- a/D1673/P1673.bs
+++ b/D1673/P1673.bs
@@ -1266,7 +1266,7 @@ algebra libraries like the BLAS, so a linear algebra library is a good
 first step. Second, `mdspan` has natural use as a
 low-level representation of dense tensors, so we are already partway
 there. Third, even simple tensor operations that naturally generalize
-the BLAS have infintely many more cases than linear algebra. It's not
+the BLAS have significantly more cases than linear algebra. It's not
 clear to us which to optimize. Fourth, even though linear algebra is
 a special case of tensor algebra, users of linear algebra have
 different interface expectations than users of tensor algebra. Thus,
diff --git a/D1673/blas_interface.md b/D1673/blas_interface.md
index c1675c64..87ac7e40 100644
--- a/D1673/blas_interface.md
+++ b/D1673/blas_interface.md
@@ -839,7 +839,7 @@ algebra libraries like the BLAS, so a linear algebra library is a good
 first step. Second, `mdspan` has natural use as a
 low-level representation of dense tensors, so we are already partway
 there. Third, even simple tensor operations that naturally generalize
-the BLAS have infintely many more cases than linear algebra. It's not
+the BLAS have significantly more cases than linear algebra. It's not
 clear to us which to optimize. Fourth, even though linear algebra is
 a special case of tensor algebra, users of linear algebra have
 different interface expectations than users of tensor algebra. Thus,

From 81a3db1c6180ac0d0b3969c862c58f3f2709e789 Mon Sep 17 00:00:00 2001
From: Jeff Hammond
Date: Tue, 10 May 2022 19:51:57 +0300
Subject: [PATCH 2/3] add comments and references on BLAS and tensors

1. Building tensor operations on top of the BLAS is common but not optimal.
2. Added citations related to tensors and the BLAS.
3. GotoBLAS is obsolete; added OpenBLAS instead.
4. Added BLIS and a citation for it.
---
 D1673/blas_interface.md | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

diff --git a/D1673/blas_interface.md b/D1673/blas_interface.md
index 87ac7e40..43104485 100644
--- a/D1673/blas_interface.md
+++ b/D1673/blas_interface.md
@@ -416,10 +416,11 @@ problem whose solutions could be parameterized for a variety of
 computer architectures. See, for example, [Goto and van de Geijn
 2008](https://doi.org/10.1145/1356052.1356053). There are optimized
 third-party BLAS implementations for common architectures, like
-[ATLAS](http://math-atlas.sourceforge.net/) and
-[GotoBLAS](https://www.tacc.utexas.edu/research-development/tacc-software/gotoblas2).
+[ATLAS](http://math-atlas.sourceforge.net/),
+[OpenBLAS](https://github.com/xianyi/OpenBLAS),
+and [BLIS](https://github.com/flame/blis).
 A (slow but correct) [reference implementation of the
-BLAS](http://www.netlib.org/blas/#_reference_blas_version_3_8_0)
+BLAS](http://www.netlib.org/blas/)
 exists and it has a liberal software license for easy reuse.

 We have experience in the exercise of wrapping a C or Fortran BLAS
@@ -834,9 +835,13 @@ to some multi-indices in the Cartesian product of extents.
 ### Tensors

 We exclude tensors from this proposal, for the following reasons.
-First, tensor libraries naturally build on optimized dense linear
+First, tensor libraries often build on optimized dense linear
 algebra libraries like the BLAS, so a linear algebra library is a good
-first step. Second, `mdspan` has natural use as a
+first step ([Di Napoli et al.](https://arxiv.org/abs/1307.2100)),
+although a native implementation is likely to perform better
+([Matthews](https://arxiv.org/abs/1607.00291),
+ [Springer and Bientinesi](https://arxiv.org/abs/1607.00145)).
+Second, `mdspan` has natural use as a
 low-level representation of dense tensors, so we are already partway
 there. Third, even simple tensor operations that naturally generalize
 the BLAS have significantly more cases than linear algebra. It's not
@@ -1597,7 +1602,7 @@ pioneering efforts and history lessons.
   SLATE Working Notes, Innovative Computing Laboratory, University of
   Tennessee Knoxville, Feb. 2018.

-* K. Goto and R. A. van de Geijn, "Anatomy of high-performance matrix
+* K. Goto and R. A. van de Geijn, ["Anatomy of high-performance matrix
   multiplication,"](https://doi.org/10.1145/1356052.1356053), *ACM
   Transactions on Mathematical Software* (TOMS), Vol. 34, No. 3, May
   2008.
@@ -1614,6 +1619,10 @@ pioneering efforts and history lessons.
 * D. Vandevoorde and N. A. Josuttis, "C++ Templates: The Complete
   Guide," Addison-Wesley Professional, 2003.

+* F. G. Van Zee and R. A. van de Geijn,
+  ["BLIS: A Framework for Rapidly Instantiating BLAS Functionality,"](https://doi.org/10.1145/2764454),
+  *ACM Transactions on Mathematical Software* (TOMS), Vol. 41, No. 3, June 2015.
+
 ## Wording

 > Text in blockquotes is not proposed wording, but rather instructions for generating proposed wording.

From cad8f84f8a4c2fd4d3e42b31dbe769b86e792dc8 Mon Sep 17 00:00:00 2001
From: Jeff Hammond
Date: Tue, 10 May 2022 20:16:44 +0300
Subject: [PATCH 3/3] fix comment about vendor BLAS

BLAS is an API; Netlib is an implementation. The vendors are
implementing the BLAS API, and they may or may not use code from Netlib
(in some cases we know, in others we do not). Also note that AMD forked
BLIS, which is now cited elsewhere; AMD did not create BLIS.
---
 D1673/blas_interface.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/D1673/blas_interface.md b/D1673/blas_interface.md
index 43104485..c5fe996e 100644
--- a/D1673/blas_interface.md
+++ b/D1673/blas_interface.md
@@ -329,7 +329,7 @@ in support of adding linear algebra to the C++ Standard Library.
   and Cerebras' [Wafer Scale
   Engine](https://www.cerebras.net/product/#chip).
   Several large computer system vendors offer optimized linear algebra libraries
-  based on or closely resembling the BLAS; these include AMD's BLIS,
+  that implement the BLAS API; these include AMD's fork of BLIS,
   ARM's Performance Libraries, Cray's LibSci, Intel's Math Kernel
   Library (MKL), IBM's Engineering and Scientific Subroutine Library
   (ESSL), and NVIDIA's cuBLAS.
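
The reasoning behind the second patch's change to the tensor discussion can be made concrete with a small example. The sketch below is illustrative only and is not part of the patches or of the proposal's wording; it assumes a C++23 `<mdspan>` plus an implementation of this proposal's `matrix_product` (depending on the implementation, the header and namespace may still be experimental, e.g. `std::experimental::linalg`). It shows the "build on the BLAS" approach that the [Di Napoli et al.] citation refers to: when the free modes of a `layout_right` tensor are contiguous, they can be fused into a single matrix dimension, so a simple contraction such as `C[m,n,p] = sum_k A[m,n,k] * B[k,p]` reduces to one matrix-matrix product.

```c++
// Illustrative sketch only -- not part of the proposal or of these patches.
// Computes the contraction C[m,n,p] = sum_k A[m,n,k] * B[k,p] by fusing the
// (m,n) modes and calling one GEMM-like matrix product.
#include <mdspan>   // C++23
#include <linalg>   // this proposal; may be <experimental/linalg> in practice
#include <cstddef>
#include <vector>

int main()
{
  constexpr std::size_t M = 3, N = 4, K = 5, P = 6;

  std::vector<double> a(M * N * K, 1.0);  // A is an M x N x K tensor
  std::vector<double> b(K * P, 1.0);      // B is a  K x P   matrix
  std::vector<double> c(M * N * P, 0.0);  // C is an M x N x P tensor

  // Rank-3 view of the result (layout_right, i.e. row major).
  std::mdspan C3{c.data(), M, N, P};

  // With layout_right the m and n modes of A and C are stored contiguously,
  // so the same storage can also be viewed as (M*N) x K and (M*N) x P
  // matrices without copying or transposing anything.
  std::mdspan A_mat{a.data(), M * N, K};
  std::mdspan B_mat{b.data(), K, P};
  std::mdspan C_mat{c.data(), M * N, P};

  // One matrix-matrix product performs the whole contraction.
  std::linalg::matrix_product(A_mat, B_mat, C_mat);

  // Every entry of C is now K (= 5.0), since all inputs were 1.0.
  return (C3[0, 0, 0] == double(K)) ? 0 : 1;
}
```

When the contracted and free modes do not line up with the layout like this, BLAS-based tensor libraries must first permute the operands into such a form and pay for the extra copies; avoiding that data movement is the point of the native approaches cited in the patch (Matthews; Springer and Bientinesi).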