-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add common resample code to stcal #279
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #279 +/- ##
===========================================
- Coverage 86.21% 33.42% -52.80%
===========================================
Files 47 39 -8
Lines 8812 8818 +6
===========================================
- Hits 7597 2947 -4650
- Misses 1215 5871 +4656 ☔ View full report in Codecov by Sentry. |
src/stcal/resample/resample.py
Outdated
|
||
@abc.abstractmethod | ||
def run(self): | ||
... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using ellipses, would it be better to raise and exception with a message saying "method not implemented"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by marking it as an abstractmethod
this is done automatically if you try to call it directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this file necessary? It is imported by only one other file. It is not a set of common functions shared among many files. It's two short functions.
Would you rebase this and add a drizzle dependency with the branch for spacetelescope/drizzle#134 so we can see the CI run? |
src/stcal/resample/utils.py
Outdated
@@ -0,0 +1,36 @@ | |||
import numpy as np | |||
from stdatamodels.dqflags import interpret_bit_flags |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
from stdatamodels.dqflags import interpret_bit_flags | |
from stdatamodels.dqflags import interpret_bit_flags |
Is there a non-stdatamodels way to handle this. We can't make stdatamodels a dependency of this code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, there is. I'll fix it (I forgot to switch to astropy
although stdatamodels.dqflags.interpret_bit_flags()
for unknown to me reasons does things slightly different. I think that function should be removed altogether from stdatamodels.dqflags
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, I did it in resample.py
but not in utils.
src/stcal/resample/resample.py
Outdated
|
||
self.final_post_processing() | ||
|
||
self._output_model.write(self._output_filename, overwrite=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't work for roman since write
does not exist for roman_datamodels.DataModel
and overwrite
is not a valid keyword argument for roman_datamodels.DataModel.save
.
I think it's best if we leave all file-IO up to the pipeline and not include it here in stcal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that was an omission. I'll fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my main question is still the extent to which we assume the structure and attributes of datamodels, and now also of ModelLibrary
src/stcal/resample/resample.py
Outdated
"""Raised when the output is too large for in-memory instantiation""" | ||
|
||
|
||
def output_wcs_from_input_wcs(input_wcs_list, pixel_scale_ratio=1.0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might put this somewhere other than resample.py
, e.g. alignment/utils or alignment/resample_utils. I would vote (eventually) for having a separate submodule for WCS-related utility functions, but that's probably beyond the PR scope
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. It was temporary here. This will be fixed once we have a function that computes output WCS from s_region
of input data models and other parameters.
src/stcal/resample/resample.py
Outdated
|
||
# loop over only science exposures in the ModelLibrary | ||
# sci_indices = self._input_models.ind_asn_type("science") | ||
with self._input_models: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We seem to be in the same infinite loop as before here, about whether or not to make certain datamodels dependencies of stcal, either implicitly or explicitly. It seems to me that with this and other code lines, ResampleBase can only be used if the input models are assumed to be ModelLibrary instances. Are we okay with this? If so, should we just make this explicit in some way by specifying the input model type inside the ResampleBase
abstract base class, and make stpipe a dependency?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We seem to be in the same infinite loop as before here, about whether or not to make certain datamodels dependencies of stcal, either implicitly or explicitly. It seems to me that with this and other code lines, ResampleBase can only be used if the input models are assumed to be ModelLibrary instances.
True. It is difficult to maximize the common code ported to stcal
with the purpose of reducing maintenance burden in the pipelines given the rigid usage imposed by the ModelLibrary
(by contrast, ModelContainer
was behaving like a standard list). I can try to see how to bury ModelLibrary
into the "IO" class, but it's not going to be pretty.
Are we okay with this? If so, should we just make this explicit in some way by specifying the input model type inside the
ResampleBase
abstract base class, and make stpipe a dependency?
I don't know what are the plans for these packages but I do not think they are intended to be general purpose algorithms like those in astropy
(or even drizzle
) so I do not see an issue with using structures shared by all pipelines.
src/stcal/resample/resample.py
Outdated
|
||
try: | ||
if self.get_model_attr_value(model, "exptype").upper() != "SCIENCE": | ||
self._input_models.shelve(model, modify=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
usually when there's a try...except
clause inside a ModelLibrary context where the try
could fail while models are borrowed from the library, it's necessary to put the shelve
into a finally
block so it executes regardless of the path the try...except
takes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, if an exception is triggered then there is some other processing after the try-except
block and then the model is closed. So, I think, the code is fine: this was intentional.
src/stcal/resample/resample.py
Outdated
""" | ||
resample_suffix = 'i2d' | ||
resample_file_ext = '.fits' | ||
n_arrays_per_output = 2 # #flt-point arrays in the output (data, weight, var, err, etc.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this hard-coded? Will it be different for co-add and single resamples? Don't we have multiple types of variance that need to be in memory simultaneously in the current version of resample, i.e., doesn't this underestimate the memory usage when it's used in check_memory_requirements
?
src/stcal/resample/resample.py
Outdated
def build_driz_weight(self, model, weight_type=None, good_bits=None): | ||
"""Create a weight map for use by drizzle | ||
""" | ||
data = self.get_model_array(model, "data") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again here certain attributes of the model are assumed to exist, and IMO this should be made explicit either by having a model base class or by adding stcal dependencies (although we've been trying hard to avoid the latter)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again here certain attributes of the model are assumed to exist, and IMO this should be made explicit either by having a model base class or ...
Seems like this is not going to happen.
... by adding stcal dependencies (although we've been trying hard to avoid the latter)
I don't think this is necessary as this PR illustrates how to avoid imports.
Yes, it is true thst we assume some attributes to exist. I think for now we are safe but if you would like to harden the code, I could add try-except
blocks around these data retrieval statements.
src/stcal/resample/resample.py
Outdated
|
||
@abc.abstractmethod | ||
def run(self): | ||
... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by marking it as an abstractmethod
this is done automatically if you try to call it directly
src/stcal/resample/resample.py
Outdated
|
||
Notes | ||
----- | ||
This routine performs the following operations:: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks identical to ResampleCoAdd. Some doc updates should be made to clarify how they are different
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments so far.
return tmeasure, True | ||
|
||
|
||
# FIXME: temporarily copied here to avoid this import: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps jwst should import it from stcal in future PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. I think that would be the right place.
|
||
# get the available memory | ||
available_memory = ( | ||
psutil.virtual_memory().available + psutil.swap_memory().total |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should swap be included in this? If it is what I think it is, relying on it will slow things down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is ported over from current jwst main branch, but I don't think it does anything useful at present. See spacetelescope/jwst#8775
I wouldn't mind if it were removed as part of this PR, but the JIRA ticket is still assigned to David Law which means it hasn't been officially approved for work by the pipeline team yet
bcead21
to
43493b1
Compare
43493b1
to
2da57ba
Compare
This PR add the common resample code used by both JWST and Roman pipelines to stcal. Also, for the first time, this PR adopts the new
drizzle
API from spacetelescope/drizzle#134 for the resample code used in the pipelines.This work is related to https://jira.stsci.edu/browse/AL-835
At this moment this is a very rough draft for illustration purpose. It should run with default arguments (except input_models and output file name can be specified; everything else is not guaranteed to work). There are no unit/regression tests and documentation may not match the code.
Checklist
CHANGES.rst
(either inBug Fixes
orChanges to API
)