feat(gpu): speed up packing KS for levels==1 #1875

Merged
pdroalves merged 1 commit into main from feat/as_generalize_gemm_pks_all_params on Dec 23, 2024

Conversation

andrei-stoian-zama
Contributor

@andrei-stoian-zama andrei-stoian-zama commented Dec 13, 2024

Optimize packing ks for level count == 1

  • Adds an optimized GEMM kernel
  • Adds an alternative packing keyswitch implementation that uses the GEMM kernel
  • Implements a fast path for the packing keyswitch when levels == 1, built on the GEMM kernel
  • Wires the fast path into the integer compression GPU code and adds a test with custom parameters that exercise it

Achieves ~4x speedup for packing keyswitch
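
To illustrate why a level count of 1 maps onto a GEMM, here is a minimal Python sketch, not the code from this PR. With a single decomposition level, each mask coefficient yields exactly one digit, so the keyswitch accumulation over a batch of ciphertexts can be written as one matrix product of the digit matrix with the key matrix. The function names, the flat key layout, and the simplified unsigned rounding are all assumptions made for the example; arithmetic is taken modulo 2^64.

```python
# Illustrative sketch only: names, layout and the rounding helper are
# assumptions for this example, not the tfhe-rs CUDA backend API.
Q_BITS = 64
MASK = (1 << Q_BITS) - 1


def decompose_single_level(value, base_log):
    """Keep the top `base_log` bits of a 64-bit value, rounded to nearest."""
    shift = Q_BITS - base_log
    rounded = (value + (1 << (shift - 1))) & MASK  # rounding wraps mod 2^64
    return rounded >> shift  # a single (simplified, unsigned) digit


def keyswitch_loop(masks, ksk, base_log):
    """Reference path: subtract digit * key row, one ciphertext at a time."""
    out = []
    for mask in masks:
        acc = [0] * len(ksk[0])
        for i, a in enumerate(mask):
            d = decompose_single_level(a, base_log)
            for j, k in enumerate(ksk[i]):
                acc[j] = (acc[j] - d * k) & MASK
        out.append(acc)
    return out


def keyswitch_gemm(masks, ksk, base_log):
    """Same result expressed as one matrix product: -(digits @ ksk) mod 2^64."""
    digits = [[decompose_single_level(a, base_log) for a in m] for m in masks]
    cols = list(zip(*ksk))  # columns of the key matrix
    return [
        [(-sum(d * k for d, k in zip(row, col))) & MASK for col in cols]
        for row in digits
    ]
```

Both functions return the same batch of accumulators. The second form is the one that a tuned GEMM kernel can execute, since the digit matrix is computed once and the rest is a dense integer matrix multiplication.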

@cla-bot cla-bot bot added the cla-signed label Dec 13, 2024
@andrei-stoian-zama andrei-stoian-zama changed the title Feat/as generalize gemm pks all params feat(gpu): speed up packing KS for levels==1 Dec 13, 2024
Contributor

@agnesLeroy agnesLeroy left a comment

Hi @andrei-stoian-zama! It looks good to me overall, but I think @pdroalves should take a look, since he was the one who implemented the packing keyswitch on GPU.

Contributor

@pdroalves pdroalves left a comment

Hey @andrei-stoian-zama, thanks for this! You followed our conventions very closely, and the code is clean and easy to read. Very impressive.

I added a few comments and suggestions to improve it, but nothing serious. Take a look and let me know what you think. As long as Hyperstack lets us see a green CI, we can merge it.

About benchmarks, have you measured the impact for compression?

Contributor

@pdroalves pdroalves left a comment

Looks good to me. It would be good to run the tests, though. I will try to rerun them before merging.

Thank you very much @andrei-stoian-zama !

@andrei-stoian-zama andrei-stoian-zama force-pushed the feat/as_generalize_gemm_pks_all_params branch from a2ec028 to 5c122bb Compare December 20, 2024 14:12
@zama-bot zama-bot removed the approved label Dec 20, 2024
@andrei-stoian-zama andrei-stoian-zama force-pushed the feat/as_generalize_gemm_pks_all_params branch from 5c122bb to 99a4fe3 Compare December 20, 2024 14:21
@pdroalves pdroalves added the 4090_test (Launch test on our CI 4090 desktop) and 4090_bench (Launch integer bench on our CI 4090 desktop) labels Dec 23, 2024
@github-actions github-actions bot removed the 4090_bench (Launch integer bench on our CI 4090 desktop) label Dec 23, 2024
@pdroalves pdroalves merged commit 2c8f0ce into main Dec 23, 2024
165 of 173 checks passed
@pdroalves pdroalves deleted the feat/as_generalize_gemm_pks_all_params branch December 23, 2024 13:32
@github-actions github-actions bot removed the 4090_test (Launch test on our CI 4090 desktop) label Dec 23, 2024
4 participants