From d142d30637bc0058d12532b9bc144b71e638bce3 Mon Sep 17 00:00:00 2001
From: raistefintel <113093480+raistefintel@users.noreply.github.com>
Date: Fri, 15 Nov 2024 16:06:35 +0100
Subject: [PATCH 1/5] doc: matmul: updated supported data types and minor edits

---
 doc/primitives/matmul.md | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/doc/primitives/matmul.md b/doc/primitives/matmul.md
index 174534ec88a..c61a81df5ce 100644
--- a/doc/primitives/matmul.md
+++ b/doc/primitives/matmul.md
@@ -67,7 +67,7 @@ argument index as specified by the following table.
    user must pass fully specified memory objects so that the primitive is able
    to perform the computations. Note that the less information about shapes or
    format is available at the creation stage, the less performant execution
-   will be. In particular, if the shape is not known at creation stage, one
+   will be. In particular, if the shape is not known at the creation stage, you
    cannot use the special format tag #dnnl::memory::format_tag::any to enable
    an implementation to choose the most appropriate memory format for the
    corresponding input or output shapes. On the other hand, run-time specified
@@ -80,13 +80,13 @@ argument index as specified by the following table.
    invalid.
 
 3. The broadcasting shape consistency check is not done for the dimensions with
-   #DNNL_RUNTIME_DIM_VAL. It is user responsibility to make sure the dimensions
+   #DNNL_RUNTIME_DIM_VAL. Make sure the dimensions
    for the tensors are valid.
 
 4. Multiple batch dimensions and broadcasting of batch dimensions of `src` and
    `weights` are supported for both CPU and GPU engines.
 
-Please check tutorials below to see #DNNL_RUNTIME_DIM_VAL support in use.
+Check the tutorials below to see #DNNL_RUNTIME_DIM_VAL support in use.
 
 ### Data Types
 
@@ -96,12 +96,13 @@ types for source, destination, weights, and bias tensors:
 
 | Source           | Weights              | Destination                      | Bias                        |
 |:-----------------|:---------------------|:---------------------------------|:----------------------------|
-| f32              | f32                  | f32                              | f32                         |
-| f16              | f16, u8, s8, u4, s4  | f16, u8, s8                      | f16, f32                    |
-| bf16             | bf16, u8, s8, u4, s4 | f32, bf16                        | bf16, f32                   |
+| f64              | f64                  | f64                              | f64, f32, f16, bf16, s8, u8 |
+| f32              | f32                  | f32                              | f32, bf16, f16, u8, s8      |
+| f16              | f16, u8, s8, u4, s4  | f16, u8, s8                      | f32                         |
+| f16              | f16, u8, s8          | f32                              | f32, f16                    |
+| bf16             | bf16, u8, s8, u4, s4 | f32, bf16                        | f32, bf16                   |
 | f32, bf16, f16   | u8, s8               | f32, bf16, f16                   | f32, bf16, f16              |
 | f8_e5m2, f8_e4m3 | f8_e5m2, f8_e4m3     | f32, f16, bf16, f8_e5m2, f8_e4m3 | f32, bf16, f16              |
-| f8_e5m2, f8_e4m3 | f8_e5m2, f8_e4m3     | f32, f16, bf16, f8_e5m2, f8_e4m3 | f32, bf16, f16              |
 | u8, s8           | s8                   | u8, s8, s32, f32, f16, bf16      | u8, s8, s32, f32, f16, bf16 |
 
@@ -188,6 +189,7 @@ memory buffer that shares its shape with the destination buffer).
      * Runtime dimensions.
      * Three and higher dimensional matrices.
    - The layout of dropout mask has to be exactly the same as that of dst.
+   - f64 is only supported on Intel(R) Data Center GPU Max Series.
 
 3. **CPU**
    - Configuration with int8 source data type, s8 weight data type and f16

From 35cea6cb02652379c092fe75e82fba27762591cd Mon Sep 17 00:00:00 2001
From: raistefintel <113093480+raistefintel@users.noreply.github.com>
Date: Fri, 15 Nov 2024 16:07:47 +0100
Subject: [PATCH 2/5] doc: data types: added matmul f64 support

---
 doc/programming_model/data_types.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/programming_model/data_types.md b/doc/programming_model/data_types.md
index e0407cc5c86..bf44b87404c 100644
--- a/doc/programming_model/data_types.md
+++ b/doc/programming_model/data_types.md
@@ -43,7 +43,7 @@ oneDNN supports training and inference with the following data types:
    model implementation.
 
 @note
-  f64 is only supported for convolution, reorder, layer normalization and
+  f64 is only supported for matmul, convolution, reorder, layer normalization and
   pooling primitives, on the GPU engine.
 
 @note

From a5d23833a1bff9f314ab52bc6455116a1e19647c Mon Sep 17 00:00:00 2001
From: raistefintel <113093480+raistefintel@users.noreply.github.com>
Date: Fri, 15 Nov 2024 16:23:07 +0100
Subject: [PATCH 3/5] doc: matmul: minor edit

---
 doc/primitives/matmul.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/primitives/matmul.md b/doc/primitives/matmul.md
index c61a81df5ce..dca6718cd89 100644
--- a/doc/primitives/matmul.md
+++ b/doc/primitives/matmul.md
@@ -189,7 +189,7 @@ memory buffer that shares its shape with the destination buffer).
      * Runtime dimensions.
      * Three and higher dimensional matrices.
    - The layout of dropout mask has to be exactly the same as that of dst.
-   - f64 is only supported on Intel(R) Data Center GPU Max Series.
+
 
 3. **CPU**
    - Configuration with int8 source data type, s8 weight data type and f16

From 6a1c93e4a2132b4751c688eb9b7af31a3c6ce5ca Mon Sep 17 00:00:00 2001
From: raistefintel <113093480+raistefintel@users.noreply.github.com>
Date: Wed, 20 Nov 2024 08:42:28 +0100
Subject: [PATCH 4/5] doc: matmul: minor edit

Co-authored-by: Ranu Kundu
---
 doc/programming_model/data_types.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/programming_model/data_types.md b/doc/programming_model/data_types.md
index bf44b87404c..d7a2c872018 100644
--- a/doc/programming_model/data_types.md
+++ b/doc/programming_model/data_types.md
@@ -44,7 +44,7 @@ oneDNN supports training and inference with the following data types:
 
 @note
   f64 is only supported for matmul, convolution, reorder, layer normalization and
-  pooling primitives, on the GPU engine.
+  pooling primitives on the GPU engine.
 
 @note
   Boolean is only supported by the oneDNN graph API when the graph compiler

From 69de37422a3229185911bc94470dffc901521655 Mon Sep 17 00:00:00 2001
From: raistefintel <113093480+raistefintel@users.noreply.github.com>
Date: Wed, 20 Nov 2024 08:43:06 +0100
Subject: [PATCH 5/5] doc: matmul: minor

Co-authored-by: Ranu Kundu
---
 doc/programming_model/data_types.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/programming_model/data_types.md b/doc/programming_model/data_types.md
index d7a2c872018..487f579f123 100644
--- a/doc/programming_model/data_types.md
+++ b/doc/programming_model/data_types.md
@@ -43,7 +43,7 @@ oneDNN supports training and inference with the following data types:
    model implementation.
 
 @note
-  f64 is only supported for matmul, convolution, reorder, layer normalization and
+  f64 is supported only for matmul, convolution, reorder, layer normalization, and
   pooling primitives on the GPU engine.
 
 @note
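The runtime-dimension caveat touched by patch 1 (the broadcasting shape consistency check is skipped for dimensions set to #DNNL_RUNTIME_DIM_VAL, so the caller must keep shapes valid on their own) can be sketched in plain Python. This is an illustrative sketch only, not part of the oneDNN API: the helper name `batch_dims_consistent` and the use of `None` to stand in for a runtime dimension are invented for the example.

```python
# Illustrative sketch (not oneDNN API): check that the batch dimensions
# of src and weights are broadcast-compatible, the way the library would
# at primitive creation. None stands in for DNNL_RUNTIME_DIM_VAL, i.e.
# a dimension unknown until execution time.
def batch_dims_consistent(src_shape, wei_shape):
    # Only the batch dimensions take part in broadcasting; the trailing
    # two dimensions are the M x K and K x N matrix dimensions.
    src_batch, wei_batch = src_shape[:-2], wei_shape[:-2]
    # Align batch dimensions from the right; a missing leading dimension
    # broadcasts implicitly, so zip over the shorter prefix is enough.
    for s, w in zip(reversed(src_batch), reversed(wei_batch)):
        if s is None or w is None:
            # Runtime dimension: the check is skipped, and the caller
            # is responsible for passing valid shapes at execution.
            continue
        if s != w and s != 1 and w != 1:
            return False
    return True
```

A caller would run a check like this before executing the primitive; with runtime dimensions the library performs no equivalent validation, so inconsistent shapes are the caller's responsibility.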