This repository has been archived by the owner on Dec 18, 2024. It is now read-only.
XeTLA v0.3.6
- Added GEMM support for arbitrary shapes, including odd shapes.
- Provided default configurations for the GEMM API. The defaults give good performance, so only advanced users need to tune the optimization options.
- Supported converting register layout between tiled and linear.
- Provided flexible APIs for large shapes that support other policies (e.g. splitk, improved cache hit ratio for mat_A and mat_B).
- Refined mem_desc_t and payload_t to expose the alignment parameter.
- Enabled epilogue to support D = alpha * A * B + beta * C.
- Replaced xetla_exec_item with sycl::nd_item.
- Refined some examples to invoke kernel-level APIs, and added a fence and barrier to the MLP example.
- Fixed some known issues, enhanced tests, and updated documentation.
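
For reference, the epilogue form D = alpha * A * B + beta * C mentioned above can be sketched as a plain host-side loop. This is only an illustration of the arithmetic, not XeTLA code: `gemm_reference` is a hypothetical helper, and row-major `float` storage is an assumption. It places no constraint on the matrix dimensions, so it also works for odd shapes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference, NOT a XeTLA API.
// Computes D = alpha * (A * B) + beta * C with row-major storage (an assumption):
//   A is m x k, B is k x n, C and D are m x n.
std::vector<float> gemm_reference(const std::vector<float>& a,
                                  const std::vector<float>& b,
                                  const std::vector<float>& c,
                                  std::size_t m, std::size_t n, std::size_t k,
                                  float alpha, float beta) {
    // Check that the buffers match the requested shapes.
    assert(a.size() == m * k);
    assert(b.size() == k * n);
    assert(c.size() == m * n);

    std::vector<float> d(m * n);
    for (std::size_t i = 0; i < m; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            // Accumulate one element of A * B.
            float acc = 0.0f;
            for (std::size_t p = 0; p < k; ++p) {
                acc += a[i * k + p] * b[p * n + j];
            }
            // Epilogue: scale the product and add the scaled C element.
            d[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
    }
    return d;
}
```

A reference like this is mainly useful for checking a device kernel's output on small inputs, for example a 1 x 3 by 3 x 1 product with non-trivial alpha and beta.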