Add common resample code to stcal #279

mcara · 2024-08-08T13:37:28Z

This PR adds the common resample code used by both the JWST and Roman pipelines to stcal. It is also the first time the resample code used in the pipelines adopts the new drizzle API from spacetelescope/drizzle#134.

This work is related to https://jira.stsci.edu/browse/AL-835

At this moment this is a very rough draft for illustration purposes. It should run with default arguments (input_models and the output file name can be specified; nothing else is guaranteed to work). There are no unit or regression tests, and the documentation may not match the code.

Checklist

added entry in CHANGES.rst (either in Bug Fixes or Changes to API)
updated relevant tests
updated relevant documentation
updated relevant milestone(s)
added relevant label(s)

codecov · 2024-08-08T13:42:20Z

Codecov Report

Attention: Patch coverage is 92.53996% with 42 lines in your changes missing coverage. Please review.

Project coverage is 33.42%. Comparing base (60bd3b8) to head (55294b0).
Report is 4 commits behind head on main.

Files with missing lines	Patch %	Lines
src/stcal/resample/resample.py	92.48%	42 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##             main     #279       +/-   ##
===========================================
- Coverage   86.21%   33.42%   -52.80%     
===========================================
  Files          47       39        -8     
  Lines        8812     8818        +6     
===========================================
- Hits         7597     2947     -4650     
- Misses       1215     5871     +4656

kmacdonald-stsci · 2024-08-08T16:16:34Z

src/stcal/resample/resample.py

+
+    @abc.abstractmethod
+    def run(self):
+        ...


Instead of using ellipses, would it be better to raise an exception with a message saying "method not implemented"?

by marking it as an abstractmethod this is done automatically if you try to call it directly
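For reference, a minimal sketch of what `abc.abstractmethod` enforces. `ResampleSketch`, `Incomplete`, `Complete`, and `try_instantiate` are hypothetical stand-ins, not the classes in this PR. Python raises `TypeError` when a class that still has an un-overridden abstract method is instantiated, so the check happens at construction time rather than when `run()` is called.

```python
import abc


class ResampleSketch(abc.ABC):
    """Hypothetical stand-in for an abstract resample base class."""

    @abc.abstractmethod
    def run(self):
        ...


class Incomplete(ResampleSketch):
    """Does not override run(), so it cannot be instantiated."""


class Complete(ResampleSketch):
    """Overrides run(), so instantiation succeeds."""

    def run(self):
        return "resampled"


def try_instantiate(cls):
    """Return an instance, or the TypeError message if cls is still abstract."""
    try:
        return cls()
    except TypeError as exc:
        return f"TypeError: {exc}"


print(try_instantiate(Incomplete))  # message names the missing method
print(Complete().run())
```

An explicit `raise NotImplementedError` in the body would only matter if a subclass called `super().run()`; with `...` that call simply returns `None`.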

kmacdonald-stsci · 2024-08-08T16:27:32Z

src/stcal/resample/utils.py

Is this file necessary? It is imported by only one other file. It is not a set of common functions shared among many files. It's two short functions.

braingram · 2024-08-23T14:18:20Z

Would you rebase this and add a drizzle dependency with the branch for spacetelescope/drizzle#134 so we can see the CI run?

braingram · 2024-08-23T15:01:37Z

src/stcal/resample/utils.py

@@ -0,0 +1,36 @@
+import numpy as np
+from stdatamodels.dqflags import interpret_bit_flags


Suggested change

from stdatamodels.dqflags import interpret_bit_flags

from stdatamodels.dqflags import interpret_bit_flags

Is there a non-stdatamodels way to handle this? We can't make stdatamodels a dependency of this code.

Yes, there is. I'll fix it. I forgot to switch to astropy, although stdatamodels.dqflags.interpret_bit_flags(), for reasons unknown to me, does things slightly differently. I think that function should be removed altogether from stdatamodels.dqflags.

oh, I did it in resample.py but not in utils.

braingram · 2024-08-23T15:10:09Z

src/stcal/resample/resample.py

+
+        self.final_post_processing()
+
+        self._output_model.write(self._output_filename, overwrite=True)


This won't work for roman since write does not exist for roman_datamodels.DataModel and overwrite is not a valid keyword argument for roman_datamodels.DataModel.save.

I think it's best if we leave all file-IO up to the pipeline and not include it here in stcal.

that was an omission. I'll fix this.
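One way to keep file I/O out of the shared code, sketched under assumptions: the shared class only builds and returns the output, and each pipeline saves it with its own datamodel API. Every name below (`SharedResample`, `JwstLikeModel`, `RomanLikeModel`, and the two step functions) is a hypothetical stand-in; only the difference between `write(..., overwrite=True)` and `save(...)` is taken from the comment above.

```python
class SharedResample:
    """Hypothetical shared resampler: builds the output, never touches disk."""

    def __init__(self, output_factory):
        self._output_factory = output_factory  # supplied by the pipeline

    def run(self):
        output = self._output_factory()
        output.data = [1.0, 2.0, 3.0]  # placeholder for the resampled image
        return output


class JwstLikeModel:
    """Stand-in exposing write(path, overwrite=...). Records the call only."""

    def __init__(self):
        self.data = None
        self.saved = None

    def write(self, path, overwrite=False):
        self.saved = ("write", path, overwrite)


class RomanLikeModel:
    """Stand-in exposing save(path) with no overwrite keyword."""

    def __init__(self):
        self.data = None
        self.saved = None

    def save(self, path):
        self.saved = ("save", path)


def jwst_like_step(path):
    """Pipeline-side code decides how and where to write."""
    model = SharedResample(JwstLikeModel).run()
    model.write(path, overwrite=True)
    return model


def roman_like_step(path):
    model = SharedResample(RomanLikeModel).run()
    model.save(path)
    return model


print(jwst_like_step("out_i2d.fits").saved)
print(roman_like_step("out_i2d.asdf").saved)
```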

emolter

my main question is still the extent to which we assume the structure and attributes of datamodels, and now also of ModelLibrary

emolter · 2024-08-23T15:15:12Z

src/stcal/resample/resample.py

+    """Raised when the output is too large for in-memory instantiation"""
+
+
+def output_wcs_from_input_wcs(input_wcs_list, pixel_scale_ratio=1.0,


I might put this somewhere other than resample.py, e.g. alignment/utils or alignment/resample_utils. I would vote (eventually) for having a separate submodule for WCS-related utility functions, but that's probably beyond the PR scope

Agree. It was temporary here. This will be fixed once we have a function that computes output WCS from s_region of input data models and other parameters.

emolter · 2024-08-23T15:25:03Z

src/stcal/resample/resample.py

+
+        # loop over only science exposures in the ModelLibrary
+        # sci_indices = self._input_models.ind_asn_type("science")
+        with self._input_models:


We seem to be in the same infinite loop as before here, about whether or not to make certain datamodels dependencies of stcal, either implicitly or explicitly. It seems to me that with this and other code lines, ResampleBase can only be used if the input models are assumed to be ModelLibrary instances. Are we okay with this? If so, should we just make this explicit in some way by specifying the input model type inside the ResampleBase abstract base class, and make stpipe a dependency?

> We seem to be in the same infinite loop as before here, about whether or not to make certain datamodels dependencies of stcal, either implicitly or explicitly. It seems to me that with this and other code lines, ResampleBase can only be used if the input models are assumed to be ModelLibrary instances.

True. It is difficult to maximize the common code ported to stcal with the purpose of reducing maintenance burden in the pipelines given the rigid usage imposed by the ModelLibrary (by contrast, ModelContainer was behaving like a standard list). I can try to see how to bury ModelLibrary into the "IO" class, but it's not going to be pretty.

> Are we okay with this? If so, should we just make this explicit in some way by specifying the input model type inside the ResampleBase abstract base class, and make stpipe a dependency?

I don't know what the plans are for these packages, but I do not think they are intended to provide general-purpose algorithms like those in astropy (or even drizzle), so I do not see an issue with using structures shared by all pipelines.

emolter · 2024-08-23T15:26:21Z

src/stcal/resample/resample.py

+
+                try:
+                    if self.get_model_attr_value(model, "exptype").upper() != "SCIENCE":
+                        self._input_models.shelve(model, modify=False)


usually when there's a try...except clause inside a ModelLibrary context where the try could fail while models are borrowed from the library, it's necessary to put the shelve into a finally block so it executes regardless of the path the try...except takes

actually, if an exception is triggered then there is some other processing after the try-except block and then the model is closed. So, I think, the code is fine: this was intentional.
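The borrow/shelve concern can be sketched with a toy library. `ToyLibrary` is a hypothetical stand-in that only tracks which items are checked out; it is not stpipe's `ModelLibrary`, whose actual behavior may differ. The sketch shows the `finally` variant suggested in the review, where the model is shelved whether or not the body raises.

```python
class ToyLibrary:
    """Hypothetical stand-in: tracks borrowed items and complains on exit."""

    def __init__(self, models):
        self._models = list(models)
        self._borrowed = set()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Only complain when no other exception is already propagating.
        if exc_type is None and self._borrowed:
            raise RuntimeError(f"models not shelved: {sorted(self._borrowed)}")
        return False

    def __len__(self):
        return len(self._models)

    def borrow(self, index):
        self._borrowed.add(index)
        return self._models[index]

    def shelve(self, index):
        self._borrowed.discard(index)


def count_science(library):
    """Count science exposures, shelving every model even if a lookup fails."""
    n_science = 0
    with library:
        for index in range(len(library)):
            model = library.borrow(index)
            try:
                if model["exptype"].upper() == "SCIENCE":
                    n_science += 1
            except KeyError:
                pass  # metadata missing: skip this model
            finally:
                library.shelve(index)  # runs on every path
    return n_science


models = [{"exptype": "science"}, {}, {"exptype": "background"}]
print(count_science(ToyLibrary(models)))
```

Code that instead shelves after the `try`/`except` block, as the reply describes, is equivalent as long as every path out of the block reaches that shelve call.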

emolter · 2024-08-23T15:35:41Z

src/stcal/resample/resample.py

+    """
+    resample_suffix = 'i2d'
+    resample_file_ext = '.fits'
+    n_arrays_per_output = 2  # #flt-point arrays in the output (data, weight, var, err, etc.)


why is this hard-coded? Will it be different for co-add and single resamples? Don't we have multiple types of variance that need to be in memory simultaneously in the current version of resample, i.e., doesn't this underestimate the memory usage when it's used in check_memory_requirements?
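To make the memory question concrete, here is a back-of-the-envelope sketch. `estimate_output_bytes` and `fits_in_memory` are hypothetical helpers, `max_fraction` is an assumed safety margin, and the array counts in the example are illustrative, not the values the pipelines actually need.

```python
def estimate_output_bytes(shape, n_arrays, bytes_per_value=4):
    """Bytes needed to hold n_arrays float32 arrays of the given 2-D shape."""
    ny, nx = shape
    return ny * nx * n_arrays * bytes_per_value


def fits_in_memory(shape, n_arrays, available_bytes, max_fraction=0.9):
    """True if the estimate stays within a fraction of the available memory."""
    return estimate_output_bytes(shape, n_arrays) <= max_fraction * available_bytes


shape = (20000, 20000)  # a large mosaic, chosen for illustration

# data + weight only, as a hard-coded value of 2 would assume
two_arrays = estimate_output_bytes(shape, n_arrays=2)

# data + weight + three variance arrays (an assumed count)
five_arrays = estimate_output_bytes(shape, n_arrays=5)

print(two_arrays, five_arrays)
print(fits_in_memory(shape, 2, available_bytes=4 * 10**9))
print(fits_in_memory(shape, 5, available_bytes=4 * 10**9))
```

With these numbers the two-array estimate passes the check while the five-array estimate does not, which is the kind of underestimate the comment is asking about.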

emolter · 2024-08-23T15:37:20Z

src/stcal/resample/resample.py

+    def build_driz_weight(self, model, weight_type=None, good_bits=None):
+        """Create a weight map for use by drizzle
+        """
+        data = self.get_model_array(model, "data")


again here certain attributes of the model are assumed to exist, and IMO this should be made explicit either by having a model base class or by adding stcal dependencies (although we've been trying hard to avoid the latter)

> again here certain attributes of the model are assumed to exist, and IMO this should be made explicit either by having a model base class or ...

Seems like this is not going to happen.

> ... by adding stcal dependencies (although we've been trying hard to avoid the latter)

I don't think this is necessary as this PR illustrates how to avoid imports.

Yes, it is true that we assume some attributes exist. I think we are safe for now, but if you would like to harden the code, I could add try-except blocks around these data retrieval statements.
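A sketch of the accessor approach discussed in this thread: the shared base class reaches into models only through overridable methods, so it needs no datamodel imports, and a missing attribute produces an explicit error. The method names `get_model_array` and `get_model_attr_value` appear in the diff; their bodies, the class names `ResampleAccessors` and `DictBackedResample`, the `science_data` helper, and the error messages are assumptions for illustration.

```python
import abc


class ResampleAccessors(abc.ABC):
    """Hypothetical base: all model access goes through these two hooks."""

    @abc.abstractmethod
    def get_model_array(self, model, name):
        """Return the named array from a pipeline-specific model."""

    @abc.abstractmethod
    def get_model_attr_value(self, model, name):
        """Return the named metadata value from a pipeline-specific model."""

    def science_data(self, model):
        """Shared logic written only in terms of the hooks."""
        if self.get_model_attr_value(model, "exptype").upper() != "SCIENCE":
            return None
        return self.get_model_array(model, "data")


class DictBackedResample(ResampleAccessors):
    """Toy implementation where a 'model' is a plain dict."""

    def get_model_array(self, model, name):
        try:
            return model["arrays"][name]
        except KeyError:
            raise ValueError(f"model has no array named {name!r}") from None

    def get_model_attr_value(self, model, name):
        try:
            return model["meta"][name]
        except KeyError:
            raise ValueError(f"model has no attribute named {name!r}") from None


resampler = DictBackedResample()
science = {"meta": {"exptype": "science"}, "arrays": {"data": [[1.0, 2.0]]}}
background = {"meta": {"exptype": "background"}, "arrays": {"data": [[0.0]]}}
print(resampler.science_data(science))
print(resampler.science_data(background))
```

A pipeline would subclass the base with accessors for its own datamodel type, which keeps the assumed attributes listed in one place.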

emolter · 2024-08-23T15:40:02Z

src/stcal/resample/resample.py

+
+    @abc.abstractmethod
+    def run(self):
+        ...


by marking it as an abstractmethod this is done automatically if you try to call it directly

emolter · 2024-08-23T15:45:32Z

src/stcal/resample/resample.py

+
+    Notes
+    -----
+    This routine performs the following operations::


this looks identical to ResampleCoAdd. Some doc updates should be made to clarify how they are different

perrygreenfield

Minor comments so far.

perrygreenfield · 2024-10-07T13:57:47Z

src/stcal/resample/utils.py

+        return tmeasure, True
+
+
+# FIXME: temporarily copied here to avoid this import:


Perhaps jwst should import it from stcal in a future PR?

yes. I think that would be the right place.

perrygreenfield · 2024-10-07T18:50:41Z

src/stcal/resample/resample.py

+
+        # get the available memory
+        available_memory = (
+            psutil.virtual_memory().available + psutil.swap_memory().total


Should swap be included in this? If it is what I think it is, relying on it will slow things down.

I think this is ported over from current jwst main branch, but I don't think it does anything useful at present. See spacetelescope/jwst#8775

I wouldn't mind if it were removed as part of this PR, but the JIRA ticket is still assigned to David Law which means it hasn't been officially approved for work by the pipeline team yet

mcara requested a review from a team as a code owner August 8, 2024 13:37

mcara marked this pull request as draft August 8, 2024 13:37

mcara self-assigned this Aug 8, 2024

mcara mentioned this pull request Aug 8, 2024

Move common resample code to stcal spacetelescope/jwst#8695

Draft

8 tasks

kmacdonald-stsci approved these changes Aug 8, 2024

braingram reviewed Aug 23, 2024

emolter reviewed Aug 23, 2024

perrygreenfield reviewed Oct 8, 2024

mcara force-pushed the resample-common-code branch 2 times, most recently from bcead21 to 43493b1 Compare October 11, 2024 07:08

mcara added 9 commits October 25, 2024 09:59

Move common resample code to stcal

f082418

fix method names

45a49cc

Remove dependencies on stdatamodels, etc.

a3a5867

do not access model attributes directly

e1d4b8c

fix use of model attributes

b8b4132

Address reviewer comments

4b99b4a

Refactor previous code to work with arrays only

4864e12

refactor

7a7bd80

flake8

2da57ba

mcara force-pushed the resample-common-code branch from 43493b1 to 2da57ba Compare October 25, 2024 13:59

fix incorrect definition of pix ratio

55294b0

mcara mentioned this pull request Nov 27, 2024

Add common resample code to stcal. #320

Open

7 tasks
