Dask support #17

Closed
asmeurer opened this issue Feb 22, 2023 · 10 comments

Comments

@asmeurer
Member

Should we add Dask as a supported library that we wrap? See dask/dask#8750 (comment). CC @tomwhite @jakirkham

@rgommers
Member

That probably makes sense to do. And in principle I think that this package can add support for any widely used array library.

@tomwhite

I think this is a good idea. The code in dask/dask#8750 could be used as a guide - happy to help.

@asmeurer
Member Author

I probably won't prioritize work on this myself for now unless some downstream consumer library expresses a desire for Dask support here. But if you want to submit a pull request adding it I will be happy to review and merge it.

The primary purpose of this library is to wrap existing libraries so that they more closely match the array API. This is preferable to using a separate namespace in the library itself because end-users can continue to use the existing array objects, and only the corresponding consumer library (like scipy or scikit-learn) would need to use this compat layer to make use of the uniform array API on it.
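
A toy illustration of that design may help; it is a sketch under stated assumptions, not the actual implementation. The `legacy` and `compat` namespaces and the `total_magnitude` consumer function are hypothetical, plain Python lists stand in for arrays, and the `array_namespace` helper here only borrows the name of the real lookup function in array-api-compat.

```python
from types import SimpleNamespace

# Hypothetical "existing library": works on plain Python lists and uses its
# own function names (a stand-in, not a real package).
legacy = SimpleNamespace(
    concatenate=lambda arrays: [v for a in arrays for v in a],
    absolute=lambda a: [abs(v) for v in a],
)

# Hypothetical compat namespace: the array objects are unchanged, only the
# function names follow the array API spec (`concat`, `abs`).
compat = SimpleNamespace(
    concat=legacy.concatenate,
    abs=legacy.absolute,
)


def array_namespace(x):
    """Toy lookup: return the compat namespace for a supported array type."""
    if isinstance(x, list):
        return compat
    raise TypeError(f"unsupported array type: {type(x).__name__}")


def total_magnitude(a, b):
    """Consumer-library code written only against the spec-style names."""
    xp = array_namespace(a)
    return sum(xp.abs(xp.concat([a, b])))


# The end user keeps passing the library's own array objects (lists here).
result = total_magnitude([1, -2], [-3])
```

Only the consumer function touches the compat layer; the caller never sees a new array type.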

@ogrisel

ogrisel commented Jul 3, 2023

For the record, as discussed in other PRs, I started experimenting with what supporting dask in scikit-learn via the Array API spec would entail.

array-api-compat support for dask would indeed simplify those experiments and make it possible for early adopters to try it before numpy's and dask's main APIs are fully aligned with the Array API spec by default.

See the discussion in: scikit-learn/scikit-learn#26724

@asmeurer
Member Author

asmeurer commented Jul 5, 2023

BTW I'll be running a sprint on the array API next week at the SciPy conference sprints. If any of you will be present this could be a thing to sprint on.

@ogrisel

ogrisel commented Sep 11, 2023

@asmeurer let me know if you give it a shot. It should be quite easy to adapt the scikit-learn tests to also add dask.array to the list of namespaces to test against:

https://github.com/scikit-learn/scikit-learn/blob/83a8015702213b39510a0f4898bc6879bcdf3ac2/sklearn/utils/_array_api.py#L13-L50

Then it's a matter of running:

pytest -k array_api sklearn/
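
When extending a list of namespaces like this, one option is to keep only the libraries that are importable in the current environment, so the test run does not fail where dask is missing. A minimal sketch, assuming a hypothetical helper `available_namespaces` and an illustrative candidate list; this is not scikit-learn's actual code:

```python
import importlib.util


def available_namespaces(candidates):
    """Return the candidate module names that can be found for import."""
    found = []
    for name in candidates:
        try:
            # For a dotted name such as "dask.array", find_spec imports the
            # parent package first and raises if that parent is missing.
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        if spec is not None:
            found.append(name)
    return found


# Illustrative candidate list (an assumption, not the real scikit-learn list).
namespaces_to_test = available_namespaces(["numpy", "dask.array"])
```

The resulting list could then feed a parametrized test, with missing libraries left out rather than raising at collection time.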

@asmeurer
Member Author

I'm not sure when I'll be able to work on this issue, but if someone else wants to work on implementing it I'll be happy to review/help with questions.

@jakirkham
Member

cc @jrbourbeau (for awareness)

@rgommers
Member

This is moving now, xref gh-76.

@asmeurer
Member Author

asmeurer commented Mar 8, 2024

Dask support has been merged. Quite a lot of functionality is skipped in the tests, which maybe could be improved with further wrappers, though most of it will either need to be fixed upstream, or can never be supported in Dask. People should open new issues if they run into any missing or broken behavior with Dask.

@asmeurer asmeurer closed this as completed Mar 8, 2024