
Maximise the usage of dynamic shared memory for GaussianSmoothingKernel #519

Closed
wants to merge 2 commits

Conversation

Contributor

@ptim0626 ptim0626 commented Dec 5, 2023

This PR maximises the usage of the available dynamic shared memory on a given GPU device by explicitly opting in for the GaussianSmoothingKernel. Both the PyCUDA and CuPy versions are modified, and a flag is added to the respective load_kernel function to support this behaviour. The modification of load_kernel allows future kernels to utilise the full shared memory capacity, should the need arise.

The benefit is that GaussianSmoothingKernel can support a larger smooth_gradient, as the smoothing radius is limited by the 'halo' in the CUDA kernel, which uses dynamic shared memory.

However, this does not completely solve #497, as smooth_gradient is still limited, but it does allow a larger value if your GPU has a shared memory capacity greater than 48 kB. Smoothing in reciprocal space (#504) will remove this limit entirely.
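To see roughly why shared memory caps the smoothing radius, here is a back-of-envelope sketch. The block dimension and element size are illustrative assumptions, not taken from the actual GaussianSmoothingKernel source; the point is only that a halo of radius r needs a (block_dim + 2r)-wide tile to fit in shared memory, so a larger opt-in capacity permits a larger radius:

```python
# Hypothetical illustration of the shared-memory limit on the halo radius.
# block_dim and itemsize are assumed values, not the real kernel's layout.
def max_radius(shared_mem_bytes, block_dim=32, itemsize=8):
    """Largest halo radius r such that a square tile of side
    (block_dim + 2*r), with itemsize bytes per element, fits in
    shared_mem_bytes of dynamic shared memory."""
    r = 0
    while (block_dim + 2 * (r + 1)) ** 2 * itemsize <= shared_mem_bytes:
        r += 1
    return r

print(max_radius(48 * 1024))  # default 48 kB limit
print(max_radius(96 * 1024))  # a device opting in to 96 kB per block
```

Under these assumed numbers, doubling the opt-in shared memory capacity from 48 kB to 96 kB lets the supported halo radius grow from 23 to 39 elements.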

@daurer daurer requested a review from bjoernenders December 15, 2023 16:50
@daurer
Contributor

daurer commented Feb 29, 2024

After internal discussion, we decided not to modify the setting for using shared memory, but rather to use FFT-based smoothing across all engines (#504).
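For context, FFT-based smoothing avoids the shared-memory radius limit because the Gaussian becomes a simple multiplication in reciprocal space. This is a minimal NumPy-only sketch of the idea, not the actual #504 implementation:

```python
# Minimal sketch of Gaussian smoothing in reciprocal space (the approach
# adopted in #504). The spectrum is multiplied by the Gaussian transfer
# function exp(-2*pi^2*sigma^2*|k|^2), so no spatial halo is needed and
# sigma is not limited by shared memory.
import numpy as np

def gaussian_smooth_fft(a, sigma):
    """Smooth a real 2D array with a Gaussian of standard deviation
    sigma (in pixels) via the FFT."""
    ky = np.fft.fftfreq(a.shape[0])[:, None]  # cycles per sample
    kx = np.fft.fftfreq(a.shape[1])[None, :]
    g = np.exp(-2.0 * np.pi**2 * sigma**2 * (kx**2 + ky**2))
    return np.fft.ifft2(np.fft.fft2(a) * g).real
```

Because the transfer function equals 1 at zero frequency, the total sum of the array is preserved, matching the normalisation of a convolution with a unit-sum Gaussian (up to periodic boundary effects).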

@daurer daurer closed this Feb 29, 2024