new zarr release / numcodecs #868
that's funny... I was literally about to hit "submit" on a similar issue. numcodecs 0.6.4 is still not working (this was actually just checked in the dependabot fork today 😊)... so not sure what to do about that
actually... looks like it was just the Windows builds that broke with numcodecs... so maybe something has changed in the meantime?
ok - also, if I remember zarr-developers/numcodecs#210 correctly, the issue is just with pip install on our CI, so people in the wild should still be installing ok. Maybe then this can wait till tomorrow
pip install works fine on my machine. imho the fix is to stop depending on zarr. We can use …
Ah. I just saw why we started using zarr directly: we can tell whether there is a dataset instead of an array. Guh. =)
hmm - the relevant part of the codebase is here: line 118 in e2b90e9
I agree that we should figure out a way to drop our dependency on both zarr and numcodecs and just leverage dask, possibly with try/excepts, to do everything, or say that the additional functionality belongs in a plugin. We can still have zarr as one of our test dependencies (if that works with our CI), or just use dask functions for all our tests too. I'll think more about this tomorrow too
so I'm still really not sure how to deal with this one. On the one hand I like the idea of not explicitly depending on zarr and numcodecs and just leveraging the file IO options in dask, but then there is no good way to deal with the pyramid case, where we want a list of dask arrays out. We could try raising an issue on dask to see if they'd ever be interested in adding the functionality we have here to their from_zarr method, but it might be strange for them to return a list of dask arrays, or a dask array where each subarray along the first axis doesn't have the same shape.

I will also note that pyramid loading worked fine from my laptop, but when I did it from s3 it couldn't find the components properly. I need to look into that more, but it makes me think that we might not have a perfect solution even now.
just played with it a little bit, and it looks like the main thing that fails when using …

```python
import os

import dask.array as da


def read_zarr_dataset(filename):
    if os.path.exists(os.path.join(filename, '.zarray')):
        # load a single zarr array
        image = da.from_zarr(filename)
        shape = image.shape
    elif os.path.exists(os.path.join(filename, '.zgroup')):
        # otherwise load all zarr arrays inside the group,
        # useful for pyramid data
        image = []
        for subd in os.listdir(filename):
            full_subd = os.path.join(filename, subd)
            if os.path.exists(os.path.join(full_subd, '.zarray')):
                image.append(da.from_zarr(full_subd))
        shape = image[0].shape
    else:
        raise ValueError("Not a zarr dataset or group")
    return image, shape
```

(improvements that could be made to that function to provide dask with the proper …
one last thought and then I have to sign off for the night...
This looks great! I will play with it now and try to break it, but I think it is the way to go
This is really great - I don't think we need to use the dask component syntax. I had to add a …

I'll add here that this solution now won't work for things like s3 links, but that didn't really work properly before either, so I'm not sure we're in a worse place, and we can make a separate push around that remote stuff later
🐛 Bug
Looks like zarr's new release requires numcodecs > 0.6.4, but we pinned to exclude it (see discussion in #666). I think we need to resolve this ASAP and then make the 0.2.10 release (which also includes the #866 bug fix). Thoughts @tlambert03 @jni? Has the 0.6.4 numcodecs install problem been resolved? You can see our failing tests in #867.
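(For reference, pip requirement specifiers of this shape are the mechanism involved; the lines below are illustrative syntax only, not the exact pins from #666 or from zarr's setup:)

```
numcodecs!=0.6.4    # exclude one broken release
numcodecs>0.6.4     # a dependency demanding something newer
```

When a project and one of its dependencies declare constraints whose intersection contains no released version, pip has nothing installable and CI fails at install time.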