jzarr read/write (close #36) #37
String dsname = args[2];
ZarrArray verification = ZarrGroup.open(fpath).openArray(dsname);
int[] shape = verification.getShape();
if (!Arrays.equals(new int[]{}, shape)) {
Maybe I am reading this wrong, but why does it look like this line is comparing a new empty array to shape? I was expecting Array.equals(SHAPE, shape)? (I am not a Java dev)
Am I correct that when called with arguments, this program only checks the shape, but not the actual image contents?
What do you think about writing out a NumPy .npy or .npz file from Java? A quick search indicates there is at least one library available for this (GitHub). That is what the xtensor-zarr executable does (using an .npz writer that was already available in xtensor-io). The array comparison to the reference can then be done in Python. Or maybe it is just easier to do the verification in Java itself instead?
Maybe I am reading this wrong, but why does it look like this line is comparing a new empty array to shape? I was expecting Array.equals(SHAPE, shape)? (I am not a Java dev)
Argh. No, A) you're completely right. I added this before the troubles with <u1 began in order to force a failure so B) why isn't it forcing a failure?!
What do you think about writing out a NumPy .npy or .npz file from Java?
I'm personally skeptical. I think we're going to want to increase the amount of verification done in each language and it seems like that will hit limits via NumPy. But I assume I'm going to lose this one to you and @constantinpape 🙂
see: 1ad9af0
I'm personally skeptical. I think we're going to want to increase the amount of verification done in each language and it seems like that will hit limits via NumPy. But I assume I'm going to lose this one to you and @constantinpape 🙂
I am pretty agnostic on this one. I think it's fine if you validate in java. But we should somehow return the correct error codes to the python test to populate the test summary. (I haven't checked the rest of this PR yet, so maybe you are doing that already ;)).
why isn't it forcing a failure?!
I opened joshmoore#1 with a simplified way of calling the subprocess that I am sure is actually running it (it takes a few seconds per test case). That one will fail if the exit code is not zero.
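A minimal sketch of why calling the verifier through subprocess.check_output surfaces a failure: check_output raises CalledProcessError whenever the child exits with a non-zero status. The child commands below are stand-in Python one-liners, not the actual jzarr test script, and the helper name run_verifier is hypothetical.

```python
import subprocess
import sys


def run_verifier(cmd):
    """Run a verification command and return (ok, exit_code, output)."""
    try:
        out = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
    except subprocess.CalledProcessError as err:
        # A non-zero exit status ends up here instead of passing silently.
        return False, err.returncode, err.output
    return True, 0, out


# Stand-in children (assumption: the real command would launch the Java verifier).
passing = [sys.executable, "-c", "print('shape ok')"]
failing = [sys.executable, "-c", "import sys; print('shape mismatch'); sys.exit(1)"]

print(run_verifier(passing))
print(run_verifier(failing))
```

A test that lets CalledProcessError propagate instead of catching it would fail on any non-zero exit, which is the behaviour described above.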
I also do not have a problem with just doing the validation in Java
use subprocess.check_output to call jzarr test script
Co-authored-by: Gregory R. Lee <[email protected]>
Looking at one of these failures (reading cc: @SabineEmbacher
Yeah, the values seen there are what would be obtained when trying to convert For example
import numpy as np
np.asarray([154, 147, 151, 109], dtype=np.int8)
gives
JZarr is bound to the primitive data types of Java. There is no primitive unsigned byte data type in Java. Here is an example of interpreting byte data as unsigned bytes: org.esa.snap.core.datamodel.ProductData.UByte. The same overflow effect will be observed with the u2 and u4 data types.
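The signedness effect described here can be illustrated with a small sketch. It assumes the four values quoted earlier in the thread are stored as <u1, and it uses only the Python standard library (struct) rather than NumPy or JZarr. Reading the bytes as signed 8-bit integers mimics what a Java byte holds; masking with 0xFF recovers the unsigned interpretation.

```python
import struct

raw = bytes([154, 147, 151, 109])  # the values as written with dtype <u1

# Interpret the same four bytes as signed 8-bit integers, as a Java byte would.
as_signed = list(struct.unpack("4b", raw))

# Masking with 0xFF maps each signed value back to its unsigned reading.
as_unsigned = [b & 0xFF for b in as_signed]

print(as_signed)
print(as_unsigned)
```

The stored bytes are identical in both readings; only the interpretation differs, which is why a comparison against the unsigned reference values fails unless one side is converted.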
Thanks for the explanation, @SabineEmbacher. I clearly misunderstood https://jzarr.readthedocs.io/en/latest/datatype.html?highlight=unsigned#data-types
Failures now look to all be related to nesting: 9 failures
@grlee77: don't know if you want to hold off on this PR until then, or introduce a way to exclude/skip/tolerate that part of the matrix.
I am fine with waiting, but if you would rather merge sooner we could introduce an environment variable to enable skipping the known failures.
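One way the environment-variable idea could look, as a rough sketch only. The variable name ZARR_IMPL_SKIP_KNOWN_FAILURES, the helper names, and the case identifier are all hypothetical; nothing in this thread specifies them.

```python
import os

# Hypothetical variable name; the thread does not specify one.
SKIP_ENV_VAR = "ZARR_IMPL_SKIP_KNOWN_FAILURES"


def skip_known_failures(environ=None):
    """Return True if the opt-in variable is set to a truthy value."""
    environ = os.environ if environ is None else environ
    return environ.get(SKIP_ENV_VAR, "").strip().lower() in {"1", "true", "yes"}


def should_skip(case_id, known_failures, environ=None):
    """Skip a case only if skipping is enabled and the case is a known failure."""
    return skip_known_failures(environ) and case_id in known_failures


# Hypothetical identifier for one of the nesting-related cases.
known = {"jzarr-nested-read"}
print(should_skip("jzarr-nested-read", known, {SKIP_ENV_VAR: "1"}))
print(should_skip("jzarr-nested-read", known, {}))
```

In a pytest-based suite, a test could call pytest.skip(...) when should_skip returns True, so that known failures stay visible as skips rather than disappearing from the summary.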
Do you have a suggestion how I could write this in the documentation to avoid misinterpretation?
Hey @SabineEmbacher. I guess the question is if you see a way forward here. Or is this simply an incompatibility that will remain between zarr-python and jzarr?
Whew. 💚 @grlee77, turning this over to you. I imagine we can continue the discussion around signedness (#37 (comment)) elsewhere.
Thanks @joshmoore, this looks good to me now.
The only real issue (repeated from #25) is that I don't see a way to return an array from the read_from_jzarr method which means that non-Python implementations may need to handle the verification code themselves internally.
cc: @constantinpape