Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Is the fillval = 'INDEF' parameter in resample_spec working as intended? #7664

Closed
stscijgbot-jp opened this issue Jun 28, 2023 · 21 comments · Fixed by #8488
Closed

Is the fillval = 'INDEF' parameter in resample_spec working as intended? #7664

stscijgbot-jp opened this issue Jun 28, 2023 · 21 comments · Fixed by #8488

Comments

@stscijgbot-jp
Copy link
Collaborator

Issue JP-3281 was created on JIRA by Kayli Glidic:

Hi, 

I wanted to bring attention to a potential issue I encountered regarding one of the default parameters used in the resample_spec step of Stage 2. Upon analyzing NIRSpec Fixed Slit data, specifically S1600A1, which is observing background sky, I noticed a problem with the resulting data in the cal.fits file (from stage 2) vs. the resulting data in the s2d.fits file (from stage 2). 

 

Specifically, I observed that some pixels at the edges of the slit have values of NaN in the cal.fits file. However, when I examined the resulting s2d.fits file (also from stage 2), I found that those previously NaN pixels disappeared and seem to have been replaced with zeros. This replacement of NaN with zeros can be misleading once the pipeline gets to the extract_1d step, especially for extracting a background spectrum/extended source. Ideally, for extended sources, the extraction should encompass the entire slit, which means it encloses those zero-valued pixels at the edges in the s2d.fits data. Consequently, these zero-valued pixels are then factored into the final flux calculation, which makes me question the reliability of the resulting spectrum.

I only began to question my final spectrum after looking at the NPIXELS column in the 1D table. Looking at the pipeline documentation it states: "NPIXELS is the number of pixels that were added together for the source extraction region. Note that this is not necessarily a constant, and the value is not necessarily an integer (the data type is float)."

When I looked at my NPIXELS data, the value was always constant, it didn't vary by column which means it does not exclude those pixels with no flux. Columns with zero-valued pixels are then returning lower than expected surface brightness values. 

I did some digging and found that the parameter 'fillval' in resample_spec is set to 'INDEF' by default. Based on my understanding and best assumption, this fill value should assign NaN to pixels with no flux, which does not appear to be happening. As a quick test, I ran Stage 2 using both the default 'fillval' of 'INDEF' and by explicitly setting 'fillval' to NaN using the command spec2.resample_spec.fillval = 'nan' in my Jupyter notebook. I have attached the results to this ticket, including the resampled 2D spectra and the resulting 1D spectra for both cases. When I set 'fillval' to 'nan' explicitly, the pixels in the 2D spectra appear white with no assigned value, in contrast to the default 'fillval' where the pixels have a value of zero. The difference in 1D spectra between the two cases is also noticeable upon comparison which is why I wanted to bring this potential issue up. Also, just to note the same extraction region was used to get the 1D spectra in both cases. 

I am uncertain if this parameter is used in other steps, but at least in the context of the resample_spec step, this parameter causes a difference in my final results. I came across a GitHub issue from 2018 that addressed the use of the 'INDEF' fillval, but it still appears open and I didn't come across a specific JIRA ticket about this when I searched ([https://github.com//issues/2219]). 

My concern is whether 'INDEF' is functioning as intended. Might this be a discrepancy between the expected behavior and the actual outcome when using this default?

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Howard Bushouse on JIRA:

There's a semi-vague statement in drizzle_arrays function called within the resample step that says:


        fillval: str, optional

            The value a pixel is set to in the output if the input image does

            not overlap it. The default value of INDEF does not set a value.

but of course it's not entirely clear exactly what "does not set a value" implies. Perhaps a zero-filled output array is created and then if an output pixel gets no contribution it'll stay at zero when fillval='INDEF'. Will need to have a drizzle guru take a look.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Mihai Cara on JIRA:

"fill" applies to pixels that do not have any resampled flux in them, i.e., to output SCI pixels whose corresponding weight is 0.

fillval='INDEF' means that the output image pixels without data will not be "filled" at all: whatever was already in the output array will stay the same. I looked at the code and it looks like the output array is pre-populated with 0s. So, when fillval='INDEF', pixels without signal will have 0 values.

If you would like output images to have NaN at pixels with zero weight (no flux), set fillval='NaN'.

I do not know why the default was chosen to be 'INDEF' but I really do not think this is very important if any code that uses resampled images takes into account the weight image (or context image) to check whether an output pixel contains any of the input flux. If it doesn't, IMO, spectral extraction code should not use those pixels.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Mihai Cara on JIRA:

In other words, 'drizzle' package seems to do what is expected, but the resample step in the pipeline pre-fills the output array with 0 and then it simply does not modify these values (fillval='INDEF'). Whether or not this is the intended or expected behavior - I don't know. Again, I am not sure this is a problem in the resample step but rather, I think, it is a problem in the spectral extraction code that does not take into account individual pixel's weight when performing computations (instead of checking that a pixel is NaN).

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Howard Bushouse on JIRA:

So one possible solution here would be to initialize the output resampled image to all NaN values, such that when an output pixel ends up receiving no resampled signal and "fillval=INDEF" (which tells the step to take no action with such pixels), it would retain the NaN value in the output resampled product. It should also have weight and context values of 0. Of course that means that the current zero-valued pixels showing up in resampled images would now come out with NaNs, which could affect downstream software/analysis tools. So perhaps this warrants some discussion from interested stakeholders in INS.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Mihai Cara on JIRA:

As an alternative, data that require spectral extraction should set "fillval=NaN" in the configuration file.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Howard Bushouse on JIRA:

Personally I think that having output pixels with no contribution set to zero when FILLVAL=INDEF is a bit counterintuitive. A reasonable user would expect that to happen when FILLVAL=0. Maybe we simply shouldn't allow a FILLVAL of INDEF? Or at the very least switch the default setting in resample/resample_spec to be FILLVAL=NaN. When set to INDEF, it's not obvious to me what value should be assigned to the output pixel. There's no way to have the pixel be "empty" of a value, but I guess NaN is the next closest thing to "empty".

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Howard Bushouse on JIRA:

Coming back to this again, in order to try to make progress and get this resolved. I therefore propose that:

  1. We modify the resample/resample_spec code to no longer allow FILLVAL to have a value of INDEF. An output pixel should actively be set to some requested value when it has no contribution from any input pixels, rather than doing nothing. "Doing nothing" is too vague and unpredictable.

  2. Set the default value of FILLVAL to "NaN", so that output pixels with no contribution are set to NaN.

  3. Also allow zero as a user-settable value for FILLVAL, which when selected would duplicate the current behavior that results in output pixels set to zero when they have no contribution.

Anton Koekemoer Greg Sloan  how do you want to proceed with getting a decision on this? Bring it up at a CAL WG meeting?

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Anton Koekemoer on JIRA:

tagging David Law  for further followup on the above question and adding him to the watchlist

@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

Thanks Anton Koekemoer for flagging this to me.  I just confirmed the above issue as well, and am inclined to agree with Howard Bushouse on the solution.  Adding to the JP agenda.

@stscijgbot-jp
Copy link
Collaborator Author

@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

Discussed at May 1 JP meeting ([https://outerspace.stsci.edu/pages/viewpage.action?pageId=245072675).]  Plan is to move ahead with Howard Bushouse's suggestion above, pending further input from INS teams if any modes needs param ref files to set FILLVAL=0.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Richard A. Shaw on JIRA:

Has anyone checked that the software for creating preview (jpg) images for the archive will ignore NaN values?

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Howard Bushouse on JIRA:

Richard A. Shaw Having NaNs in cal products is quite common, at least over some areas of images, so given that no one has complained yet about generating previews for them, I'm guessing the answer is that they do not cause problems.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

Richard A. Shaw The previews look ok for MIRI MRS data which is already nan-bordered, so I wouldn't expect an issue for regular imaging.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

Concurrence reached at the May 15 JP meeting (https://outerspace.stsci.edu/pages/viewpage.action?pageId=246974575) to changed to NaN padding; see #8488

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Howard Bushouse on JIRA:

Fixed by #8488

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Howard Bushouse on JIRA:

The fix that's been implemented has not changed or removed the behavior that results from fillval=INDEF, but instead has simply changed the default value of fillval to "NaN", so that pixels with no contribution get set to NaN, instead of zero.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

Tested and working, closing.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Rachel Plesha on JIRA:

Howard Bushouse Mihai Cara I wanted to check – does this fix apply to both resample and resample_spec, or just resample_spec as the current metadata implies? My notes all say resample, but I want to verify my understanding.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

Both; just fixed the metadata.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Howard Bushouse on JIRA:

Yes, it applies to both resample and resample_spec, because ResampleSpecStep uses ResampleStep as its base class and inherits all params from there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant