Describe the bug
There appears to be a bug somewhere in `zarr` (and I would imagine `h5py`, based on the fact that the loading speed is similar) with regards to loading ragged arrays.

To Reproduce
Steps to reproduce the behavior:
Expected behavior
Ideally, each chunk would be compressed once and uncompressed once on every save/load cycle. What actually happens differs between `.zspy` and `.hspy`.

For `.hspy`, each index in the ragged array is compressed individually and then uncompressed individually. This isn't efficient, but it isn't the worst-case scenario.
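A minimal sketch of the kind of per-element (variable-length) storage HDF5 uses for ragged data, via `h5py` directly (the file and dataset names are my own, not from the `.hspy` internals):

```python
# Sketch of ragged storage in HDF5: each entry is a variable-length
# (vlen) element that is written and read back individually.
# File/dataset names are illustrative.
import h5py
import numpy as np

dt = h5py.vlen_dtype(np.dtype("float64"))
with h5py.File("ragged_demo.h5", "w") as f:
    dset = f.create_dataset("ragged", shape=(3,), dtype=dt)
    dset[0] = np.arange(1.0)  # length-1 entry
    dset[1] = np.arange(2.0)  # length-2 entry
    dset[2] = np.arange(5.0)  # length-5 entry

with h5py.File("ragged_demo.h5", "r") as f:
    lengths = [len(x) for x in f["ragged"][:]]
print(lengths)  # [1, 2, 5]
```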
For `.zspy`, multiple indexes in the ragged array are compressed together into a single chunk, but it seems that only one index is uncompressed at a time. The result is that as the number of indexes per chunk (n) increases, the total time to uncompress is multiplied by n, since the whole chunk is decompressed once for every index.

Python environment:
Additional context
See #164 for more context