Release v0.8.1-beta1

Pre-release
@m4rs-mt released this 03 Jan 02:59

This beta release significantly improves the performance of the generated kernel programs.

  • Improved compile-time performance by up to 4X (#110).
  • Reduced memory footprint by up to 3X (#109, #118).
  • Added new optimization level O2 to enable expensive and aggressive optimizations (#70, #110, #111, #121).
  • The NuGet package now contains release builds of the compiler to improve runtime performance (#130).
  • Added new IR verifier that can be enabled via ContextFlags.EnableVerifier (#121).
  • Added generation of vectorized instructions to PTX backend (#111).
  • Fixed critical code-generation issue on Unix platforms (#116).
  • Added dynamic shared memory support for all platforms (#97, #98).
  • Added new KernelInfo objects to kernel loaders in order to query detailed kernel statistics (e.g. amount of local memory in bytes) (#104).
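
As a rough illustration of the verifier flag mentioned in the list above, the following minimal sketch creates a context with `ContextFlags.EnableVerifier` set. It assumes that `Context` has a constructor accepting a `ContextFlags` value, as in earlier 0.x releases; the class name `VerifierExample` is hypothetical, and the exact API in this beta may differ.

```csharp
using System;
using ILGPU;

// Hypothetical example class; not part of ILGPU.
class VerifierExample
{
    static void Main()
    {
        // Assumption: Context offers a constructor that takes ContextFlags.
        // EnableVerifier is the flag named in these release notes (#121).
        using (var context = new Context(ContextFlags.EnableVerifier))
        {
            // Kernels compiled through this context should have their IR
            // checked by the verifier during compilation.
            Console.WriteLine("Context created with the IR verifier enabled.");
        }
    }
}
```

The verifier adds work to compilation, so it is probably best suited to debugging code-generation problems rather than production builds.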