Maximise the usage of dynamic shared memory for GaussianSmoothingKernel #519

ptim0626 · 2023-12-05T18:11:02Z

This PR maximises the usage of the dynamic shared memory available on a particular GPU device by explicitly opting in for the GaussianSmoothingKernel. Both the pycuda and CuPy versions are modified, and a flag is added to the respective load_kernel function to support this behaviour. The modification of load_kernel allows future kernels to utilise the full shared memory capacity, should the need arise.
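
For readers unfamiliar with the opt-in, here is a minimal sketch of what it can look like on the CuPy side. It is not the code from this PR: the helper names `needs_opt_in` and `opt_in_max_dynamic_smem` are hypothetical, and the CuPy attribute names used (`MaxSharedMemoryPerBlockOptin`, `shared_size_bytes`, `max_dynamic_shared_size_bytes`) are given as I recall them and should be checked against the installed CuPy version.

```python
# Dynamic shared memory a kernel may request without opting in (48 kB).
DEFAULT_DYNAMIC_SMEM_LIMIT = 48 * 1024


def needs_opt_in(requested_bytes, default_limit=DEFAULT_DYNAMIC_SMEM_LIMIT):
    """Return True if a launch asks for more dynamic shared memory than the default cap."""
    return requested_bytes > default_limit


def opt_in_max_dynamic_smem(kernel, device_id=0):
    """Raise the dynamic shared memory cap of a cupy.RawKernel (hypothetical helper).

    Returns the new cap in bytes. Statically allocated shared memory counts
    against the same per-block budget, so it is subtracted first.
    """
    import cupy as cp  # imported lazily so needs_opt_in works without a GPU

    attrs = cp.cuda.Device(device_id).attributes
    max_optin = attrs["MaxSharedMemoryPerBlockOptin"]  # assumed attribute key
    cap = max_optin - kernel.shared_size_bytes
    kernel.max_dynamic_shared_size_bytes = cap
    return cap
```

A loader with such a flag would call `opt_in_max_dynamic_smem` once after compiling the kernel; launches that pass `shared_mem` above 48 kB are then accepted up to the returned cap.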

The benefit is that GaussianSmoothingKernel can support a larger smooth_gradient, since the smoothing radius is limited by the size of the 'halo' in the CUDA kernel, which is held in dynamic shared memory.

However, this does not completely solve #497, because smooth_gradient is still limited; it only allows a larger value if your GPU has a shared memory capacity greater than 48 kB. Smoothing in reciprocal space (#504) will remove this limit.
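
To make the limit concrete, the sketch below estimates the largest halo radius that fits in a given shared memory budget. The tile layout is an assumption for illustration only, not taken from the actual kernel: a block of `block_y` x `block_x` threads smoothing along one axis, a shared tile of `(block_y + 2 * radius) * block_x` elements, and 8-byte complex64 values. The 96 kB figure is likewise just an example of a device with a larger opt-in capacity.

```python
def max_halo_radius(smem_bytes, block_x=16, block_y=16, itemsize=8):
    """Largest radius r such that a (block_y + 2*r) x block_x tile fits in smem_bytes.

    The tile layout and default block size are illustrative assumptions.
    """
    rows = smem_bytes // (block_x * itemsize)  # number of tile rows that fit
    if rows < block_y:
        raise ValueError("tile does not fit even without a halo")
    return (rows - block_y) // 2


print(max_halo_radius(48 * 1024))  # default limit without opt-in -> 184
print(max_halo_radius(96 * 1024))  # example opt-in capacity -> 376
```

Under these assumptions, doubling the shared memory budget roughly doubles the permitted radius, but the radius remains bounded, which is why the opt-in only relaxes the limit rather than removing it.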

daurer · 2024-02-29T10:32:46Z

After internal discussion, we decided not to modify the shared memory setting and instead to use FFT-based smoothing across all engines (#504).

ptim0626 added 2 commits December 5, 2023 16:24

Use explicit opt-in of max shared memory for CuPy version of GaussianSmoothKernel

9cb6a0e

Use explicit opt-in of max shared memory for pycuda version of GaussianSmoothKernel

5288401

daurer requested a review from bjoernenders December 15, 2023 16:50

daurer closed this Feb 29, 2024
