Release v1.1.0
This new release includes bug fixes, a huge set of new features (e.g. LibDevice integration, CudaFFT and NVML bindings) and a significantly improved O2 optimization pipeline (get the ILGPU NuGet package and ILGPU Algorithms NuGet package).
Changes
- Bumped System.Reflection.Metadata from 6.0.0 to 6.0.1 (#767).
- Added NVML bindings (#518).
- Added CuFFT and CuFFTW bindings (#706).
- Added NvJpeg image-decoding bindings (#716, #721).
- Added LibDevice bindings to include highly optimized math functions on NVIDIA GPUs (#707).
- Added FP16 support to CuBlas bindings (#658).
- Added new alignment methods to views to improve performance (#684).
- Added new global code scheduling transformation to O2 pipeline (#704, #734).
- Improved debug view implementations of all array views (#647).
- Improved automatic vectorization (#668).
- Improved performance of dead-code elimination (#702).
- Improved loop-invariant code motion transformation (#703).
- Improved on-the-fly optimization of SetField operations (#671).
- Improved on-the-fly optimization of LoadElementAddress operations (#733).
- Fixed missing binding of accelerator instances during Cuda memcopy operations (#705).
- Fixed exception handling in the case of missing assembly binding redirects (#775).
- Fixed code-placement phase and invalid removal of DebugAssert values (#749).
- Fixed race condition in CPUMultiprocessor during lazy initialization (#747).
- Fixed inheritance to avoid removal of IOValue instances (#745).
- Fixed issue with the same phi value being reused in a loop (#756).
- Fixed issue with unique algorithm when running multiple iterations per group (#758).
- Prevented unintentional initialization of the current Accelerator instance (#714).
Internal changes
- Require .NET6 for building and enable package validation (#729).
- Bumped T4.Build from 0.2.3 to 0.2.4 (#767).
- Bumped FluentAssertions from 6.5.0 to 6.5.1 (#748).
- Bumped Microsoft.NET.Test.SDK from 17.0.0 to 17.1.0 (#752).
- Fixed warnings in NET6 builds (#710).
- Fixed missing struct constraint on TraversalSuccessorsProvider (#727).
- Added ILGPU logos to logo folder (#717).
Special thanks
Special thanks to @debiday, @jgiannuzzi, @MoFtZ and @Ruberik for their contributions to this release in the form of code, feedback, ideas and proposals. Furthermore, we would like to thank the entire ILGPU community (especially @Joey9801, @kilngod, @mikhail-khalizev, @MPSQUARK, @NullandKale, @RER009 and @Yey007) for providing feedback, submitting issues and feature requests.