This project will no longer be maintained by Intel.
Intel has ceased development and contributions including, but not limited to, maintenance, bug fixes, new releases, or updates, to this project.
Intel no longer accepts patches to this project.
If you have an ongoing need to use this project, are interested in independently developing it, or would like to maintain patches for the open source software community, please create your own fork of this project.
Important: The alternative project, CUTLASS, will include all XeTLA features. Refer to Cutlass-Fork.
Contact: [email protected]
Intel® XeTLA v0.3.7 - December 2023
Intel® Xe Templates for Linear Algebra (Intel® XeTLA) is a collection of SYCL/ESIMD templates that enable high-performance General Matrix Multiply (GEMM), Convolution (CONV), and related computations on the Intel Xe GPU architecture. Intel® XeTLA offers reusable C++ templates at the kernel, group, and subgroup levels, allowing developers to optimize and specialize kernels based on data types, tiling policies, algorithms, fusion policies, and more.
One of the key features of Intel® XeTLA is that it abstracts and hides the details of Xe hardware implementations, particularly those related to matrix computations, such as the systolic array and other low-level instructions. SYCL/DPC++ developers can therefore focus on the performance benefits of Intel® XeTLA without having to deal with hardware-specific instructions.
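To make the idea of specializing a kernel through a tiling policy more concrete, here is a minimal host-side sketch in plain C++. It does not use any Intel® XeTLA API: `tile_policy` and `tiled_gemm` are hypothetical names introduced only for illustration, and the loops run on the CPU rather than on Xe hardware. It only shows how tile sizes passed as template parameters can shape a blocked GEMM.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tiling policy (not an Intel XeTLA type): tile sizes are
// compile-time constants, so each specialization gets its own loop bounds.
template <std::size_t TileM, std::size_t TileN, std::size_t TileK>
struct tile_policy {
    static_assert(TileM > 0 && TileN > 0 && TileK > 0,
                  "tile sizes must be nonzero");
    static constexpr std::size_t m = TileM;
    static constexpr std::size_t n = TileN;
    static constexpr std::size_t k = TileK;
};

// Computes C (m x n) = A (m x k) * B (k x n), all row-major, one tile at a time.
template <typename T, typename Policy>
std::vector<T> tiled_gemm(const std::vector<T>& a, const std::vector<T>& b,
                          std::size_t m, std::size_t n, std::size_t k) {
    assert(a.size() == m * k && b.size() == k * n);
    std::vector<T> c(m * n, T{});
    for (std::size_t i0 = 0; i0 < m; i0 += Policy::m) {
        const std::size_t i1 = std::min(i0 + Policy::m, m);
        for (std::size_t j0 = 0; j0 < n; j0 += Policy::n) {
            const std::size_t j1 = std::min(j0 + Policy::n, n);
            for (std::size_t p0 = 0; p0 < k; p0 += Policy::k) {
                const std::size_t p1 = std::min(p0 + Policy::k, k);
                // Accumulate the partial product of one A tile and one B tile.
                for (std::size_t i = i0; i < i1; ++i)
                    for (std::size_t p = p0; p < p1; ++p)
                        for (std::size_t j = j0; j < j1; ++j)
                            c[i * n + j] += a[i * k + p] * b[p * n + j];
            }
        }
    }
    return c;
}
```

Calling `tiled_gemm<float, tile_policy<2, 2, 2>>(a, b, m, n, k)` and `tiled_gemm<float, tile_policy<4, 4, 4>>(a, b, m, n, k)` produces the same result through different loop structures, which is the kind of specialization a tiling policy is meant to express.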
Category | Requirement | Installation |
---|---|---|
OS | Ubuntu 22.04 | Install Ubuntu |
GPU Card | Intel® Data Center GPU Max Series | N/A |
GPU Driver | Stable 736.25 or later | Install Intel GPU driver |
Toolchain | Intel® oneAPI Base Toolkit 2024.0.1 or later | Install Intel® oneAPI Base Toolkit |
- GEMM
  - Data Type
    - Vector-engine-based: fp32
    - Matrix-engine-based: tf32, fp16, bf16, int8
  - Memory Layout
    - Matrix A: row-major, col-major
    - Matrix B: row-major, col-major
    - Matrix C: row-major
- Epilogue
  - Bias Add
  - GELU Forward
  - GELU Backward
  - RELU
  - Residual Add
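The feature list above can be read as "a GEMM over row-major or col-major inputs, followed by an element-wise epilogue". The following plain C++ reference sketch illustrates that reading on the host. It is not Intel® XeTLA code: `layout`, `at`, `gemm_with_epilogue`, and `bias_add_relu` are hypothetical names, and the tanh-based GELU formula is an assumption, since this README does not state which GELU variant the library implements.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical layout tag for the input matrices.
enum class layout { row_major, col_major };

// Reads element (r, c) of a rows x cols matrix stored with the given layout.
inline float at(const std::vector<float>& mat, layout l, std::size_t rows,
                std::size_t cols, std::size_t r, std::size_t c) {
    return l == layout::row_major ? mat[r * cols + c] : mat[c * rows + r];
}

inline float relu(float x) { return x > 0.0f ? x : 0.0f; }

// GELU forward using the common tanh approximation (an assumption here).
inline float gelu_fwd(float x) {
    const float k = 0.7978845608f;  // sqrt(2 / pi)
    return 0.5f * x * (1.0f + std::tanh(k * (x + 0.044715f * x * x * x)));
}

// Example epilogue: add a per-column bias, then apply RELU.
struct bias_add_relu {
    const std::vector<float>& bias;
    float operator()(float acc, std::size_t /*row*/, std::size_t col) const {
        return relu(acc + bias[col]);
    }
};

// C (m x n, row-major) = epilogue(A (m x k) * B (k x n)).
// The epilogue is applied once per output element, after accumulation.
template <typename Epilogue>
std::vector<float> gemm_with_epilogue(const std::vector<float>& a, layout la,
                                      const std::vector<float>& b, layout lb,
                                      std::size_t m, std::size_t n,
                                      std::size_t k, Epilogue epilogue) {
    assert(a.size() == m * k && b.size() == k * n);
    std::vector<float> c(m * n);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p)
                acc += at(a, la, m, k, i, p) * at(b, lb, k, n, p, j);
            c[i * n + j] = epilogue(acc, i, j);
        }
    }
    return c;
}
```

An epilogue is any callable taking `(acc, row, col)`, so a Residual Add could be sketched the same way by capturing a residual matrix and returning `acc + residual[row * n + col]`.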
- Quick Start introduces how to build and run tests/examples.
- Functionality describes kernel-level API feature list.
- API Reference provides a comprehensive reference of the library APIs.
- Programming Guidelines explains programming model, functionalities, implementation details, and annotated examples.
- Construct a High Performance GEMM describes how to construct a high performance GEMM.
- Terminology describes terms used in the project.
- Changelog provides a detailed listing of releases and updates.
include/          # Definitions of Intel® XeTLA APIs
  common/         # - Low-level APIs that wrap the equivalent ESIMD APIs
  experimental/   # - Experimental features
  group/          # - Group level APIs
  kernel/         # - Kernel level APIs
  subgroup/       # - Subgroup level APIs
  xetla.hpp       # - Single unified external header file
tests/            # Tests to verify correctness of Intel® XeTLA APIs
  integration/    # - Integration tests
  unit/           # - Unit tests
  utils/          # - Utilities for unit and integration tests
examples/         # Examples of Intel® XeTLA basic/fused kernels
tools/            # Tools for code formatting, build environment...
media/            # Documents
Refer to Contributing Guidelines.
Refer to Limitations.
See Intel's Security Center for information on how to report a potential security issue or vulnerability.
See also: Security Policy
Copyright (c) 2022-2023 Intel Corporation

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.