This release primarily includes improvements for AMD GPUs.
Changelog:
- Communication streams have been moved to CommunicationBase
- Added launch bounds for HIP kernel
- Improved RHMC for AMD GPUs
- Manual loop unrolling in Dslash for AMD GPUs
- Updated profiling applications
- Added more profiling applications for development:
  - axpy
  - triad
  - 7linkprof
  - cgprofiling
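
For orientation, `axpy` and `triad` conventionally name two memory-bound vector operations that are often used to measure achievable bandwidth. The sketch below is a plain C++ CPU reference of those conventional definitions (BLAS-style axpy and STREAM-style triad). It is an assumption about what the profiling applications exercise, not the code shipped in this release; the function names and signatures are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU reference, NOT the GPU kernels from this release.

// BLAS-style axpy: y[i] = a * x[i] + y[i].
// Assumes x has at least as many elements as y.
void axpy(double a, const std::vector<double>& x, std::vector<double>& y) {
    for (std::size_t i = 0; i < y.size(); ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// STREAM-style triad: out[i] = b[i] + s * c[i].
// Assumes b and c have at least as many elements as out.
void triad(double s, const std::vector<double>& b,
           const std::vector<double>& c, std::vector<double>& out) {
    for (std::size_t i = 0; i < out.size(); ++i) {
        out[i] = b[i] + s * c[i];
    }
}
```

Both loops perform very little arithmetic per byte moved, which is why operations of this shape are commonly used as bandwidth probes when tuning for a new GPU architecture.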