Release v0.6.0

@m4rs-mt released this 03 Jan 02:45
· 2035 commits to master since this release

This ILGPU release brings significant performance and code-quality improvements.

  • Added support for new GeForce RTX cards.
  • Added initial support for arrays in kernels.
  • Added additional 3D indexing functionality to ArrayView types.
  • Added automatic binding of accelerators in advanced multi-GPU scenarios.
  • Tested debugging and profiling capabilities on NVIDIA GPUs.
  • Released test framework to verify generated kernel code.
  • Improved performance of predicates in PTXBackend.
  • Removed strict array-length restriction from allocation nodes.
  • Enhanced generation of get/set field operations.
  • Optimized generation of conditional branches.
  • Fixed invalid generation of predicate barriers in PTXBackend.
  • Fixed invalid register allocation of string types in PTXBackend.
  • Removed explicit tracking of predecessors in phi nodes.
  • Fixed invalid debug assertion in SequencePoint.
  • Fixed invalid alignment of shared-memory allocations in PTXBackend.
  • Fixed invalid shared memory configuration of Cuda kernels.

Special thanks to @MoFtZ and @mikhail-khalizev for contributing to this release.