Offload matrix based interpolation #239
base: develop
Thanks @l90lpa. So far I have had a first look at the [...] I am making a note that in principle it could also just inherit from eckit::linalg::SparseMatrix and add the device capability on top. On the other hand, this way it's possible to implement it differently and construct an eckit::linalg::SparseMatrix only when needed.
It should be mentioned explicitly that this PR depends on first merging #237 and should then be rebased on develop.
Perhaps this PR is adding too many ingredients at once. Regarding multiply-add (for a new PR): should matrix_multiply_add not immediately also implement the following formula including [...] I guess here you have [...] That could still be a simpler overloaded API for sure.
Regarding PR content: I'm happy to break up the PR however you would like. I only included the complete work so that you would be able to see an overview in advance. That said, I just want to mention that the reason multiply-add is part of this work is to facilitate the GPU offload of interpolation's execute_adjoint. Regarding multiply-add: whether matrix_multiply_add should immediately implement the broader interface is up to you (I'm happy either way). I chose not to, because I wasn't aware of a need for the additional behaviour, and so I didn't want to introduce an interface that wouldn't get used.
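For readers following the multiply-add discussion, here is a minimal CPU-only sketch of what an accumulating sparse product might look like, and why an adjoint could want it. This is not the code in this PR: the `CsrMatrix` struct and the functions `multiply_add` and `transpose_multiply_add` are hypothetical names invented for illustration, and it assumes the adjoint accumulates `A^T * y` into an existing source vector rather than overwriting it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical minimal CSR (compressed sparse row) container, for illustration only.
struct CsrMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::size_t> row_ptr;  // size rows + 1: start of each row in col_idx/values
    std::vector<std::size_t> col_idx;  // size nnz: column index of each stored entry
    std::vector<double> values;        // size nnz: value of each stored entry
};

// Forward multiply-add: y += A * x (y is accumulated into, not overwritten).
void multiply_add(const CsrMatrix& A, const std::vector<double>& x, std::vector<double>& y) {
    for (std::size_t r = 0; r < A.rows; ++r) {
        double sum = 0.0;
        for (std::size_t k = A.row_ptr[r]; k < A.row_ptr[r + 1]; ++k) {
            sum += A.values[k] * x[A.col_idx[k]];
        }
        y[r] += sum;
    }
}

// Transposed multiply-add: x += A^T * y, computed without forming A^T explicitly.
// Each stored entry A(r, c) scatters its contribution into x[c].
void transpose_multiply_add(const CsrMatrix& A, const std::vector<double>& y, std::vector<double>& x) {
    for (std::size_t r = 0; r < A.rows; ++r) {
        for (std::size_t k = A.row_ptr[r]; k < A.row_ptr[r + 1]; ++k) {
            x[A.col_idx[k]] += A.values[k] * y[r];
        }
    }
}
```

Note that the transposed loop scatters into `x`, so a parallel or device version of this sketch would need atomics or an explicitly transposed matrix; the serial form above sidesteps that.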
I appreciate very much this "draft" PR to show the complete work! Thanks! 🙏 Ideally, with the overall design in mind, this can now be split into several different PRs in this order:
That sounds like a great plan, and thanks for taking a look so far! I'll work on submitting those PRs.
761d459 to 9a33983
I've opened this as a draft PR to get early feedback and to recognise that we might want to iterate on the design.
This PR: