Release v0.7.0
- Added support for .NET Standard 2.1.
- Added support for OpenCL-compatible GPUs (beta).
- Added parallel code generation in backends to improve code-generation speed.
- Added minimum CUDA driver version detection.
- Enabled adaptive shared-memory allocation in `CPUAccelerator`.
- Added new `Utility.Select` method that can be used to create highly efficient select instructions instead of `if` branches.
- Added support for accessing Grid and Group indices via properties.
- Added support for generic Warp intrinsics that will be automatically generated by the compiler.
- Redesigned intrinsic math functions and moved `XMath` functions to the `ILGPU.Algorithms` library. Use the new `IntrinsicMath` class for math functions that are supported on all platforms.
- Reworked intrinsic functions to allow custom implementations of intrinsics for different backends.
- Ported project to VS2019 including all static-program analysis checks.
- Applied general code cleanup to comply with the new analysis checks.
- Redesigned `AcceleratorId` functionality.
- Updated `CudaMemoryBuffer` to support `MemSetToZero` using alternate streams.
- Fixed retrieving version number of ILGPU assembly.
- Fixed non-deterministic generation of Phi mappings.
- Fixed invalid loading of small basic types onto the evaluation stack.
- Added utility property to `Accelerator` to resolve a launch extent with the maximum number of groups.
- Fixed invalid shared-memory allocation within non-kernel functions in `PTXBackend`.
Special thanks to @MoFtZ for contributing to this release.