friendlier error messages for missing chunk managers #9676

Open: 25 commits to be merged into base branch main

keewis
Collaborator

@keewis keewis commented Oct 24, 2024

The current error message when trying to use a chunked-array related method without actually having a chunk manager available is:

unrecognized chunk manager dask - must be one of: []

That's pretty confusing, so this catches the case where no chunk manager is available and raises an error with guidance on how to fix that.
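For illustration, the behaviour described above could look roughly like the sketch below. This is not the code in the PR: `pick_chunkmanager` and the message wording are hypothetical, the registry is a plain dict standing in for whatever was discovered at import time, and the choice of `ImportError` for the empty case is an assumption.

```python
def pick_chunkmanager(name, available):
    """Return the chunk manager registered under ``name``.

    ``available`` is a hypothetical dict mapping names to manager objects.
    """
    if not available:
        # Nothing is installed at all: say so and suggest a fix,
        # instead of printing an empty list of valid choices.
        raise ImportError(
            "no chunk managers available. Try installing `dask` or "
            "another package that provides a chunk manager."
        )
    if name not in available:
        raise ValueError(
            f"unrecognized chunk manager {name} - must be one of: {list(available)}"
        )
    return available[name]
```

With an empty registry, the caller now sees the installation hint rather than `must be one of: []`.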

  • Tests added
  • User visible changes (including notable bug fixes) are documented in whats-new.rst

@max-sixty
Collaborator

Much better!

@mathause
Collaborator

Thanks - I had a PR on this but don't mind closing mine in favor of this one. There is also #7963 which seems related.

@keewis
Collaborator Author

keewis commented Oct 25, 2024

wow, I don't know how I missed two open PRs that aim to do something similar in different ways. Which one do we take?

If we merge this one your PR might still be valuable since it also changes the error message if there are chunk managers but not the one that was requested.

@dcherian
Contributor

dcherian commented Nov 7, 2024

Shall we merge?

@TomNicholas added the topic-error reporting and topic-chunked-arrays (Managing different chunked backends, e.g. dask) labels Nov 8, 2024
@TomNicholas
Member

TomNicholas commented Nov 8, 2024

> wow, I don't know how I missed two open PRs that aim to do something similar in different ways. Which one do we take?

Sorry for dropping the ball on reviewing / merging these guys 😞

Let's merge this one.

> If we merge this one your PR might still be valuable since it also changes the error message if there are chunk managers but not the one that was requested.

This change would also be useful but is much less likely to come up.

Co-authored-by: Tom Nicholas <[email protected]>
@dcherian mentioned this pull request Nov 18, 2024
Member

@TomNicholas TomNicholas left a comment

While this could be further improved in cases like cubed being installed but dask not (when there would not be 0 chunkmanagers installed), this PR on its own addresses the confusing error that 99% of users are encountering, so it should be merged asap.

@TomNicholas
Member

If you change the other ValueError to ImportError on line 109 of parallelcompat.py the failing test should pass.

@dcherian
Contributor

I'm planning to release on Friday US time. Would be good to wrap this up

@TomNicholas
Member

Trying to summarize what we discussed in the meeting on Wednesday:

> at least, without maintaining a list of known chunk managers and suggesting sufficiently "close" names

This would be okay, because right now there are only 3 chunkmanagers we know of, one of which ships with xarray. Outside of these 3, a slightly less helpful error message is fine.

This allows our error messages to explicitly point to the correct package to install, e.g. dask/cubed-xarray/arkouda-xarray.

We questioned the wisdom of even creating the whole entrypoint system in the first place, but also said that removing it is a separable issue for later, and the priority should be to improve the error messages first.

I can't really remember what we decided about ValueError vs ImportError though...

@keewis
Collaborator Author

keewis commented Nov 22, 2024

Thanks for the summary.

> I can't really remember what we decided about ValueError vs ImportError though...

I don't think we decided anything, but it would make sense to me to raise an ImportError for all known-but-missing chunk managers, and to raise a ValueError for everything else, where we don't really know whether the cause is a missing package, one that fails to import, or a typo.
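A rough sketch of that split, under stated assumptions: `KNOWN_CHUNKMANAGERS` and `resolve_chunkmanager` are hypothetical names, the keys `cubed` and `arkouda` are guesses at the registered manager names, and only the package names come from the discussion above.

```python
# Hypothetical mapping from chunk manager name to the package providing it.
KNOWN_CHUNKMANAGERS = {
    "dask": "dask",
    "cubed": "cubed-xarray",
    "arkouda": "arkouda-xarray",
}


def resolve_chunkmanager(name, available):
    """Look up ``name`` in ``available`` (a dict of installed managers)."""
    if name in available:
        return available[name]
    if name in KNOWN_CHUNKMANAGERS:
        # Known but missing: we can point at the package to install.
        package = KNOWN_CHUNKMANAGERS[name]
        raise ImportError(
            f"chunk manager {name!r} is not available."
            f" Please make sure {package!r} is installed and importable."
        )
    # Unknown name: could be a typo or a third-party package we don't know.
    raise ValueError(
        f"unrecognized chunk manager {name!r} - must be one of: {list(available)}"
    )
```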

@keewis
Collaborator Author

keewis commented Nov 22, 2024

we probably also want to prefer the more specific error over the generic n_chunkmanagers == 0 error (which would also allow passing a chunkmanager object if no chunk manager is available via the entrypoints)
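One possible ordering of the checks, as a sketch only: `ChunkManager` is a stand-in class, and `select_chunkmanager` and its `known` parameter are hypothetical.

```python
class ChunkManager:
    """Hypothetical stand-in for a chunk manager base class."""


def select_chunkmanager(manager, available, known=("dask",)):
    # An explicit object bypasses the registry, so it works even when
    # no chunk manager was discovered via the entrypoints.
    if isinstance(manager, ChunkManager):
        return manager
    if manager in available:
        return available[manager]
    # Specific error first: a known name tells the user what is missing.
    if manager in known:
        raise ImportError(f"chunk manager {manager!r} is not installed")
    # Generic error only when nothing more specific applies.
    if not available:
        raise ImportError("no chunk managers available")
    raise ValueError(
        f"unrecognized chunk manager {manager!r} - must be one of: {list(available)}"
    )
```

Checking the empty registry last is what lets both the object pass-through and the more specific message take priority.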

@keewis
Collaborator Author

keewis commented Nov 22, 2024

@dcherian, @TomNicholas, this should be ready for a final review (I had hoped it could still make it into the release, but it looks like I was late by a couple of minutes)

@dcherian
Contributor

we can always release more!

@keewis
Collaborator Author

keewis commented Nov 22, 2024

the failing zarr tests were because of the automatic chunking in open_zarr (chunks="auto")

@keewis changed the title from "more friendly error message in case no chunk manager is available" to "friendlier error messages for missing chunk managers" Nov 22, 2024
@TomNicholas
Member

@keewis I made some very minor changes - feel free to merge this.

raise ValueError(
- f"unrecognized chunk manager {manager} - must be one of: {list(chunkmanagers)}"
+ f"unrecognized chunk manager {manager} - must be one of: {list(available_chunkmanagers)}"
Collaborator Author

re-reading this, it looks like this is the only error message that just describes the issue without offering advice. What do you think about extending it?

Suggested change
- f"unrecognized chunk manager {manager} - must be one of: {list(available_chunkmanagers)}"
+ f"unrecognized chunk manager {manager} - must be one of: {list(available_chunkmanagers)}"
+ " (you can extend this list by installing more packages that provide a chunk manager)."

Member

@TomNicholas TomNicholas Nov 29, 2024

Sounds good. You could also change it to say "must be one of the installed chunkmanagers: {list(available_chunkmanagers)}", which would be a bit clearer about what the list represents.
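Combining both suggestions, the message might end up looking something like this. The helper `unrecognized_message` is hypothetical and the final wording in the PR may differ.

```python
def unrecognized_message(manager, available_chunkmanagers):
    """Build the error text for an unknown chunk manager name."""
    # Single braces, so the f-string interpolates the actual list.
    return (
        f"unrecognized chunk manager {manager} - must be one of the installed"
        f" chunkmanagers: {list(available_chunkmanagers)}"
        " (you can extend this list by installing more packages"
        " that provide a chunk manager)."
    )


print(unrecognized_message("cubed", {"dask": None}))
```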

Labels
topic-chunked-arrays Managing different chunked backends, e.g. dask topic-error reporting