Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Offload matrix based interpolation #239

Draft
wants to merge 11 commits into
base: develop
Choose a base branch
from

Conversation

l90lpa
Copy link
Contributor

@l90lpa l90lpa commented Nov 16, 2024

I've opened this as a draft PR to get early feedback and to recognise that we might want to iterate on the design.

This PR:

  • adds a SparseMatrix class into Atlas that wraps eckit's SparseMatrix class to combine it with GPU memory management
  • adds a hicsparse backend to Atlas's sparse linear algebra interface
  • adds "multiply-add" support into Atlas's sparse linear algebra interface
  • updates interpolation to support use of the hicparse backend of Atlas's sparse linear algebra interface

@wdeconinck
Copy link
Member

Thanks @l90lpa, so far I had a first look at the atlas::linalg::SparseMatrix class only.

I am making a note that in principle it could also just inherit from eckit::linalg::SparseMatrix, and add the device capability on top. On the other hand this way it's possible to implement it differently, and construct an eckit::linalg::SparseMatrix only when needed.

@wdeconinck
Copy link
Member

It should be mentioned explicitly that this PR depends on first merging #237 and should then be rebased on develop.

@wdeconinck
Copy link
Member

Perhaps this PR is adding too many ingredients at once.
Maybe we could just focus on the gpu-offloading first and add the 'multiply-add' later?

In that respect of multiply-add (for a new PR), should matrix_multiply_add not immediately also implement the following formula including $$\alpha$$ and $$\beta$$?

$$y = \alpha A x + \beta y$$

I guess here you have $$\alpha=1$$ and $$\beta=1$$, so that:

$$y = A x + y$$

That could still be a simpler overloaded API for sure.

@l90lpa
Copy link
Contributor Author

l90lpa commented Nov 19, 2024

Perhaps this PR is adding too many ingredients at once. Maybe we could just focus on the gpu-offloading first and add the 'multiply-add' later?

In that respect of multiply-add (for a new PR), should matrix_multiply_add not immediately also implement the following formula including α and β ?
y = α A x + β y

I guess here you have α = 1 and β = 1 , so that:
y = A x + y

That could still be a simpler overloaded API for sure.

Regarding PR content: I'm happy to break-up the PR however you would like. I only included the complete work so that you would be able to see an overview in advance. That said, I just want to mention that the reason multiply-add is part of this work is to facilitate the GPU offload of interpolation's execute_adjoint.

Regarding multiply-add: whether matrix_multiply_add should immediately implement the more broad interface is up to you (I'm happy either way). I chose not to, because I wasn't aware of a need for the additional behaviour, and so I didn't want to introduce an interface that wouldn't get used.

@wdeconinck
Copy link
Member

I appreciate very much this "draft" PR to show the complete work! Thanks!  🙏
I had a deeper look now and I definitely like your approach taken so far.

Ideally with the overall design in mind now this can be split several different PRs in this order:

  1. Implementing multiply-add, possibly with the extended capability using $$\alpha$$ and $$\beta$$...
    ...and update the adjoint interpolation methods to use this. Because this routine did not exist before, @MarekWlasak implemented the adjoint as first a matrix-multiply followed by a += . Perhaps this approach should still be used for the eckit-backend codepaths, and could possibly live inside the matrix_multiply_add implementation for the eckit-backend (within atlas) itself.
  2. Implement GPU-offloading capability for the new sparse matrix
  3. Create hicsparse backend
  4. Adapt interpolation methods to enable the GPU offloading.

@l90lpa
Copy link
Contributor Author

l90lpa commented Nov 19, 2024

That sounds like a great plan, and thanks for taking a look so far! I'll work on submitting those PR's.

@l90lpa
Copy link
Contributor Author

l90lpa commented Nov 21, 2024

I've submitted PR's for the first 2 items (#240 and #241). And, I'll submitted the remaining 2 PRs as their dependencies pass review and get merged in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants