Release v0.6.0
This ILGPU release brings significant performance and code-quality improvements.
- Added support for new GeForce RTX cards.
- Added initial support for arrays in kernels.
- Added additional 3D indexing functionality to ArrayView types.
- Added automatic binding of accelerators in advanced multi-GPU scenarios.
- Tested debugging and profiling capabilities on NVIDIA GPUs.
- Released test framework to verify generated kernel code.
- Improved performance of predicates in PTXBackend.
- Removed strict array-length restriction from allocation nodes.
- Enhanced generation of get/set field operations.
- Optimized generation of conditional branches.
- Fixed invalid generation of predicate barriers in PTXBackend.
- Fixed invalid register allocation of string types in PTXBackend.
- Removed explicit tracking of predecessors in phi nodes.
- Fixed invalid debug assertion in SequencePoint.
- Fixed invalid alignment of shared-memory allocations in PTXBackend.
- Fixed invalid shared memory configuration of Cuda kernels.
Special thanks to @MoFtZ and @mikhail-khalizev for contributing to this release.