JP-3281: Change resampling defaults to NaN padding instead of INDEF #8488

Merged: 6 commits into spacetelescope:master on May 28, 2024

Conversation

@drlaw1558 (Collaborator) commented May 15, 2024

Resolves JP-3281 by changing padding around science data in resampled arrays to NaN instead of INDEF.

Closes #7664
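
A minimal sketch of the behavior change at the step level (the input filename is a hypothetical placeholder; fillval is the existing resample step parameter named in the linked issue, and it accepts a numeric string as in the underlying drizzle code):

    from jwst.resample.resample_step import ResampleStep

    # With this change, output pixels that receive no input contribution are
    # padded with NaN by default, rather than with INDEF (which produced
    # zero-filled padding). The old zero padding can still be requested:
    result_nan = ResampleStep.call("example_cal.fits")                # NaN padding (new default)
    result_zero = ResampleStep.call("example_cal.fits", fillval="0")  # zero padding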

Checklist for PR authors (skip items if you don't have permissions or they are not applicable)

  • added entry in CHANGES.rst within the relevant release section
  • updated or added relevant tests
  • updated relevant documentation
  • added relevant milestone
  • added relevant label(s)
  • ran regression tests, posted a link to the Jenkins job below.
    How to run regression tests on a PR
  • All comments are resolved
  • Make sure the JIRA ticket is resolved properly

codecov bot commented May 15, 2024

Codecov Report

Attention: Patch coverage is 0%, with 3 lines in your changes missing coverage. Please review.

Project coverage is 57.97%. Comparing base (4179c09) to head (f0ad63b).
The report is 1 commit behind head on master.

Current head f0ad63b differs from the pull request's most recent head 98b7f47.

Please upload reports for commit 98b7f47 to get more accurate results.

Files                           Patch %   Lines
jwst/resample/gwcs_drizzle.py   0.00%     1 Missing ⚠️
jwst/resample/resample.py       0.00%     1 Missing ⚠️
jwst/resample/resample_step.py  0.00%     1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #8488   +/-   ##
=======================================
  Coverage   57.97%   57.97%           
=======================================
  Files         387      387           
  Lines       38830    38830           
=======================================
  Hits        22513    22513           
  Misses      16317    16317           


@drlaw1558 (Collaborator, Author) commented:

Probably failing a bunch of reg tests as outputs are now NaN-padded instead of 0-padded.

@hbushouse added this to the Build 11.0 milestone on May 20, 2024
@hbushouse (Collaborator) commented:

Unit tests contained in the jwst/resample/tests/test_resample_step.py module are causing all the CI failures, due to the change from zero to NaN in output arrays. These tests will need updating.

@drlaw1558 (Collaborator, Author) commented:

> Unit tests contained in the jwst/resample/tests/test_resample_step.py module are causing all the CI failures, due to the change from zero to NaN in output arrays. These tests will need updating.

This seems a little tricky. The failing test test_custom_refwcs_resample_imaging creates an array of random numbers, passes it through resample in two slightly different ways, and checks that the results are allclose. However, for whatever reason the values coming out of resample before this PR are all zeros everywhere in both resample calls, and all zeroes match all zeroes. After this change the results of both calls are two arrays of all NaNs, and an all-NaN array is apparently not considered a match for another all-NaN array, since NaN never compares equal to NaN.

The easiest thing to do is delete the assert np.allclose(data1, data2) as it doesn't seem to be testing anything meaningful anyway. However, what was it meant to test? Calling resample on the random number array ordinarily gives meaningful results, but the test as set up is trimming to an entirely empty part of the array.
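
For reference, a minimal illustration of the comparison behavior at issue: np.allclose treats NaN as unequal to NaN unless equal_nan=True is passed.

    import numpy as np

    a = np.full((4, 4), np.nan)
    b = np.full((4, 4), np.nan)

    print(np.allclose(a, b))                  # False: NaN != NaN by default
    print(np.allclose(a, b, equal_nan=True))  # True: NaNs compared as equal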

@hbushouse (Collaborator) commented:

> > Unit tests contained in the jwst/resample/tests/test_resample_step.py module are causing all the CI failures, due to the change from zero to NaN in output arrays. These tests will need updating.
>
> This seems a little tricky. The failing test test_custom_refwcs_resample_imaging creates an array of random numbers, passes it through resample in two slightly different ways, and checks that the results are allclose. However, for whatever reason the values coming out of resample before this PR are all zeros everywhere in both resample calls, and all zeroes match all zeroes. After this change the results of both calls are two arrays of all NaNs, and an all-NaN array is apparently not considered a match for another all-NaN array, since NaN never compares equal to NaN.
>
> The easiest thing to do is delete the assert np.allclose(data1, data2) as it doesn't seem to be testing anything meaningful anyway. However, what was it meant to test? Calling resample on the random number array ordinarily gives meaningful results, but the test as set up is trimming to an entirely empty part of the array.

You're right - that test is just bizarre. The nircam_rate image is defined to have a size of 204x204, and then the test is resampling to an output size of more than 1000x1000? And setting the WCS reference point at around 600? Guessing that the 204x204 size might've been a mistake and that they really meant 2048x2048, I tried running it with that image size, and the result (using the current master branch) is still zero-filled in all the SCI, WHT, and CON arrays, which means no input pixels contribute to any of the output pixels anywhere in the entire image. That whole test is just screwy. I suggest adding an xfail to it, in order to allow it to fail for now, and we'll try to get someone to fix/rewrite it in the future.
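
A sketch of that suggestion (the reason string is illustrative; the fixture name follows the nircam_rate image mentioned above):

    import pytest

    # Mark the test as an expected failure instead of deleting its assertion,
    # so it stays visible until someone rewrites it.
    @pytest.mark.xfail(reason="output grid has no overlap with the input data; "
                              "test needs to be rewritten")
    def test_custom_refwcs_resample_imaging(nircam_rate):
        ...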

@drlaw1558 (Collaborator, Author) commented:

> I suggest adding an xfail to it, in order to allow it to fail for now, and we'll try to get someone to fix/rewrite it in the future.

Done.

@hbushouse (Collaborator) commented:

The latest regtest results seem pretty reasonable. Lots of i2d and s2d products that now have zero values changed to NaN, some of which then propagate into NaN values of x1d products (if they're spectra). In general it's 5-20% of the pixels that see such a change. There are a few cases, however, where as many as 80-90% of the pixels in an i2d or s2d have been changed from zero to NaN. This suggests we might have some fairly ratty test data in some tests, if that many pixels of the resample products have no contribution from their inputs. But mechanically it all looks reasonable.
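
One quick way to quantify that kind of change between a regtest truth file and a new product (the file names here are hypothetical placeholders):

    import numpy as np
    from astropy.io import fits

    old_sci = fits.getdata("truth_i2d.fits", "SCI")
    new_sci = fits.getdata("result_i2d.fits", "SCI")

    # Pixels that were zero-padded before and are NaN-padded now.
    changed = (old_sci == 0) & np.isnan(new_sci)
    print(f"{100 * changed.mean():.1f}% of pixels changed from 0 to NaN")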

@drlaw1558 (Collaborator, Author) commented:

> The latest regtest results seem pretty reasonable. Lots of i2d and s2d products that now have zero values changed to NaN, some of which then propagate into NaN values of x1d products (if they're spectra). In general it's 5-20% of the pixels that see such a change. There are a few cases, however, where as many as 80-90% of the pixels in an i2d or s2d have been changed from zero to NaN. This suggests we might have some fairly ratty test data in some tests, if that many pixels of the resample products have no contribution from their inputs. But mechanically it all looks reasonable.

@hbushouse Can you point me at a couple of examples with 80-90% NaN-valued i2d and s2d data?

@hbushouse (Collaborator) commented:

> > The latest regtest results seem pretty reasonable. Lots of i2d and s2d products that now have zero values changed to NaN, some of which then propagate into NaN values of x1d products (if they're spectra). In general it's 5-20% of the pixels that see such a change. There are a few cases, however, where as many as 80-90% of the pixels in an i2d or s2d have been changed from zero to NaN. This suggests we might have some fairly ratty test data in some tests, if that many pixels of the resample products have no contribution from their inputs. But mechanically it all looks reasonable.
>
> @hbushouse Can you point me at a couple of examples with 80-90% NaN-valued i2d and s2d data?

The two that seem to stand out are the i2d files created by the test_miri_image_stages and test_nircam_image_stages regtests. But it may be a red herring. In addition to different values in the SCI extensions, they also show lots of differences in the various WHT, CON, ERR, etc. extensions, and not all of them are simple zero-to-NaN conversions. Many of the other extensions just show differences in numerical values, and I'm not sure why. So the very large fraction of differing pixels in the SCI extensions is probably a combination of multiple effects (not just the zero->NaN change happening here).

@drlaw1558 (Collaborator, Author) commented:

@hbushouse Interesting; I took a look at the MIRI imaging case (PID 1024 Obs 1) and all of the fluxes in JFrog are about 10% fainter than in the same image in MAST. If I reprocess this data from uncal locally, though (with the relevant branch for this ticket), then things largely match the MAST result, apart from 0s changing to NaNs in the expected places. So I agree that this PR seems OK for that data, though I'm puzzled why the regtest values are all off by 10%. Does the regtest start from intermediate files that might be out of date, or does it run from uncal?
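
For reference, the local reprocessing described above can be run end to end from the uncal files roughly like this (the file and association names are hypothetical placeholders):

    from jwst.pipeline import Detector1Pipeline, Image2Pipeline, Image3Pipeline

    # Regenerate the rate files from raw data so no stale intermediates are used.
    Detector1Pipeline.call("example_uncal.fits", save_results=True)
    Image2Pipeline.call("example_rate.fits", save_results=True)
    # Image3 consumes an association of the cal files and produces the i2d mosaic.
    Image3Pipeline.call("example_image3_asn.json", save_results=True)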

@hbushouse (Collaborator) commented:

> @hbushouse Interesting; I took a look at the MIRI imaging case (PID 1024 Obs 1) and all of the fluxes in JFrog are about 10% fainter than in the same image in MAST. If I reprocess this data from uncal locally, though (with the relevant branch for this ticket), then things largely match the MAST result, apart from 0s changing to NaNs in the expected places. So I agree that this PR seems OK for that data, though I'm puzzled why the regtest values are all off by 10%. Does the regtest start from intermediate files that might be out of date, or does it run from uncal?

I'm sure that's the issue. The test setup is using cached/archived rate files as input to the image2 pipeline, and then uses the cal outputs of that to feed the image3 pipeline, which is where the differences show up. So the input rate files are probably "stale" relative to updated calibrations or detector1 code.

The fact that you've independently verified that the zero->NaN conversion looks OK gives me the confidence to approve this as is.

@hbushouse merged commit f866557 into spacetelescope:master on May 28, 2024
24 checks passed
Merging this pull request closed: Is the fillval = 'INDEF' parameter in resample_spec working as intended? (#7664)