Release v0.7.0 · m4rs-mt/ILGPU

Added support for .NET Standard 2.1.
Added support for OpenCL-compatible GPUs (beta).
Added parallel code generation in backends to improve code-generation speed.
Added minimum CUDA driver version detection.
Enabled adaptive shared-memory allocation in CPUAccelerator.
Added a new Utility.Select method that can be used to create highly efficient select instructions in place of if branches.
Added support to access Grid and Group indices via properties.
Added support for generic Warp intrinsics that will be automatically generated by the compiler.
Redesigned intrinsic math functions and moved XMath functions to the ILGPU.Algorithms library. Use the new IntrinsicMath class for math functions that are supported on all platforms.
Reworked intrinsic functions to allow custom implementations of intrinsics for different backends.
Ported project to VS2019 including all static-program analysis checks.
Applied general code cleanup to comply with the new analysis checks.
Redesigned AcceleratorId functionality.
Updated CudaMemoryBuffer to support MemSetToZero using alternate streams.
Fixed retrieval of the ILGPU assembly version number.
Fixed non-deterministic generation of Phi mappings.
Fixed invalid loading of small basic types onto the evaluation stack.
Added utility property to Accelerator to resolve a launch extent with the maximum number of groups.
Fixed invalid shared-memory allocation within non-kernel functions in PTXBackend.

Special thanks to @MoFtZ for contributing to this release.
