Release v0.8.1-beta1
Pre-release
The new beta version significantly improves the performance of the generated kernel programs.
- Improved compile-time performance by up to 4X (#110).
- Reduced memory footprint by up to 3X (#109, #118).
- Added new optimization level O2 to enable expensive and aggressive optimizations (#70, #110, #111, #121).
- The NuGet package now ships release builds of the compiler to improve runtime performance (#130).
- Added new IR verifier that can be enabled via ContextFlags.EnableVerifier (#121).
- Added generation of vectorized instructions to PTX backend (#111).
- Fixed critical code-generation issue on Unix platforms (#116).
- Added dynamic shared memory support for all platforms (#97, #98).
- Added new KernelInfo objects to kernel loaders in order to query detailed kernel statistics (e.g. amount of local memory in bytes) (#104).