-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API re-design: simplify interface, remove I/O, make instrument agnostic #134
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #134 +/- ##
===========================================
+ Coverage 75.29% 97.27% +21.98%
===========================================
Files 7 3 -4
Lines 344 220 -124
===========================================
- Hits 259 214 -45
+ Misses 85 6 -79 ☔ View full report in Codecov by Sentry. |
Example usage: # wcs1 - WCS of the input image usually with distortions (to be resampled)
# wcs2 - WCS of the output image without distortions
import numpy as np
# simulate some data and a pixel map:
data = np.ones((240, 570))
y, x = np.indices((240, 570), dtype=np.float64)
pixmap = np.dstack([x, y])
# or call:
# from drizzle.utils import calc_pixmap
# pixmap = calc_pixmap(wcs1, wcs2)
# actual drizzle work
from drizzle.resample import Drizzle
d = Drizzle(out_shape=(240, 570))
d.add_image(data, exptime=1234, pixmap=pixmap)
# access outputs:
d.out_sci
d.out_ctx
d.out_wht |
b050a6e
to
7e40598
Compare
Could there be an option to fold the pixmap computation inside the the call to |
This is one possible approach to deal with large images. There also could be a different function (i.e., The beauty of using |
drizzle/resample.py
Outdated
out_sci : 2d array of float32, optional | ||
A 2D numpy array containing the output image produced by | ||
drizzling. On the first call it should be set to zero. | ||
Subsequent calls it will hold the intermediate results |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would resampling on a predefined image work, i.e. there is a resampled image from a previous run and we want to drizzle another image on it. Does this reseting to zero exclude the use case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is old documentation that applies to the "traditional" workflow in which we start with an image that is all 0
s or NaN
s and keep adding inputs as needed. Initial out_sci
does not have to be all 0
s: it can be the previous "end result" upon which we can resample more inputs. In other words, we can restart the process from were we stopped.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks very good, but I have some comments.
drizzle/utils.py
Outdated
# from https://trs.jpl.nasa.gov/handle/2014/40409 | ||
if refpix is None: | ||
if wcs.bounding_box is None: | ||
refpix = np.ones(wcs.pixel_n_dim) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand this line. In the absence of refpix
or a bounding box, wouldn't the center of the pixel array shape be the usual default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any pixel could work. But you are right: more conventional is to choose the center. I will update this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
more conventional is to choose the center
That's of course if one knows what the bounding box is.
return pixmap | ||
|
||
|
||
def _estimate_pixel_scale(wcs, refpix): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't there be a comment that says that the pixel scale assumes approximately equal scales for each axis? There could be odd detectors that this isn't true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, as I mentioned, this PR is at "conceptual" stage at this moment. Just to get an idea that proposed API changes look OK. More comments will be added. Documentation was only partially updated. For some parameters it is wrong.
from . import cdrizzle | ||
|
||
|
||
SUPPORTED_DRIZZLE_KERNELS = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the deletion of most of these kernels being held for a later PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, since no one approved on that JIRA issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tophat
was deprecated. I'd say remove it.
drizzle/resample.py
Outdated
"'out_shape' cannot be None when all output arrays are " | ||
"None." | ||
) | ||
out_shape = shapes[0] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused either about what sets allow or the code. I didn't think sets could be indexed. While I'm mentioning that, are the various "out" arrays permitted to have different shapes? If so, see comment below. Ah, apparently not. Maybe a comment is warranted when the set is created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a mistake. In all other places I used pop
. Will be fixed in the next commit.
drizzle/resample.py
Outdated
def blot_image(data, pixmap, pix_ratio, exptime, output_pixel_shape, | ||
interp='poly5', sinscl=1.0): | ||
""" | ||
Resample input image using interpolation to an output grid. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find this confusing in that the term input seems ambiguous. Is that the input of the original image, or the input of this function. I would start saying that this is intended to do the inverse resampling that drizzle does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are comments from the existing code. I did not modify them at this moment. I only moved the blot_image
out of the Drizzle
class as it does not belong to drizzle logically and does not need to be in a class.
I do not know who came up with the term blot
but it is definitely not me.
this is intended to do the inverse resampling that drizzle does.
I disagree. I think there is nothing "inverse" in here. Resampling is just resampling no matter the direction. Drizzle mixes values from several input pixels into one output pixel. The term "inverse" would suggest un-mingling those values to get the input (to drizzle) image, like Lucy-Richardson deconvolution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I think that an inverse mapping is tightly associated with the term "blot" (I'm not sure why it is called that either). Perhaps a round tripping is the right term, where it is understood that it is not a pure inverse. So I think my comment stands to make the issue clearer since those using blot are almost always using the inverse mapping, aren't they? If you got rid of blot
and forced people to use drizzling directly, then this wouldn't be an issue. But there is a history to this term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resampling algorithms (interpolating, like "blot", or drizzle) are all "one way" in the sense that you cannot apply it twice with "inverse mapping" (i.e., inverted pixmap) and expect things to "round-trip". Round-tripping will work with "blot" but only if image features can be perfectly (locally) modeled by linear/poly3/poly5 functions. With these kinds of images, in principle, we could use blot to resample from one frame to another and back. Reverse transformation in this case would be performed by exactly the same "blot" algorithm (and code). The only difference is in pixmap. Pixmap is always from the original (image's) coordinate system to the output (destination) coordinate system.
In any case,
I would start saying that this is intended to do the inverse resampling that drizzle does.
IMO is fundamentally incorrect: one cannot do "inverse resampling" of drizzled images using interpolation (possibly except for point kernel in the absence of distortions, etc.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"inverse resampling" suggests inverse operation to drizzle which is not what blot is doing. "blot" can be used both for forward (meaning resampling a distorted image onto an undistorted frame) transform and backward transform (from the distortion-corrected frame to the distorted frame). We usually use it for the "backward" transformation but it is not the same as "inverse": the final image is not the initial image used as input to drizzle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are talking at cross-purposes. What I mean is that blot is usually taken as mapping in the opposite direction to the original sampling. It is not an inverse of the original resampling, for sure. The point I was making was that there is ambiguity as to what input means, i.e., the original input image used for the original resampling, or the product in the destination pixel coordinate system to be mapped back to the original pixel coordinate system. In other words, with blot one could be confused as to which array is being referred to. It would be nice to clarify that in this case the input is the image generated from the original resampling (or whatever words describe that best). Maybe the current wording is obvious to everyone else, but it wasn't for me. Perhaps we should take a vote :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree the docs can be improved here.
data : 2D array | ||
Input numpy array of the source image in units of 'cps'. | ||
|
||
pixmap : 3D array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the same array as used for drizzle or one that maps the other direction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is "inverse" map: it maps from the wcs_from
to wcs_to
.
pixmap : 3D array | ||
The world coordinate system to resample on. | ||
|
||
output_pixel_shape : tuple of int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar ambiguity here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar to what?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Perry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside from the comment on one of docstrings, this looks good. I don't want to hold that up due to that comment, which can always be addressed (or not) later.
drizzle/resample.py
Outdated
CTX_PLANE_BITS = 32 | ||
|
||
|
||
class Drizzle(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class Drizzle(): | |
class Drizzle: |
drizzle/resample.py
Outdated
if len(out_ctx.shape) == 3: | ||
shapes.add(out_ctx.shape[1:]) | ||
else: | ||
shapes.add(out_ctx.shape) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of the above if .... else
use out_ctx.shape[-2:]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦♂️
drizzle/resample.py
Outdated
else: | ||
expscale = exptime | ||
|
||
self.exptime += exptime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this used anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. It is an output for the user.
drizzle/utils.py
Outdated
# it afterwards because wcs co-ordinates are one based, while pixel co-ordinates | ||
# are zero based, The result is the final values in pixmap give a tranformation | ||
# between the pixel co-ordinates in the first image to pixel co-ordinates in the | ||
# co-ordinate system of the second. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the above comment be deleted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course it will be deleted. At this stage this PR is about API only. However, since it looks like you and @perrygreenfield approve this API redesign, I will now work on cleaning up comments and docstrings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps it's time to remove this comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot about this. will be fixed
drizzle/utils.py
Outdated
if wcs_from.pixel_shape is None: | ||
raise ValueError('The "from" WCS must have pixel_shape property set.') | ||
y, x = np.indices(wcs_from.pixel_shape, dtype=np.float64) | ||
x, y = wcs_to.invert(*wcs_from(x, y)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it's better to use the common API here instead of using gwcs.WCS methods.
Section "The Drizzle Library" in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a minor question that shouldn't affect approval.
@@ -110,7 +110,7 @@ def centroid_statistics(title, fname, image1, image2, amp, size): | |||
distances = centroid_distances(image1, image2, amp, size) | |||
indexes = (0, int(len(distances) / 2), len(distances) - 1) | |||
fd = open(fname, 'w') | |||
fd.write("*** %s ***\n" % title) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this and following cases is it necessary to add the type to the variable substitution when the variable is already that type? (Or is there a style commandment to always add type to these?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is not necessary but doing so is closest to the original code and probably is not needed, especially for the s
format (f
is different though).
416232c
to
781c317
Compare
drizzle/resample.py
Outdated
|
||
.. warning:: | ||
The "tophat" kernel was deprecated and it will be removed in a | ||
future release. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say just remove tophat in this PR.
drizzle/utils.py
Outdated
# it afterwards because wcs co-ordinates are one based, while pixel co-ordinates | ||
# are zero based, The result is the final values in pixmap give a tranformation | ||
# between the pixel co-ordinates in the first image to pixel co-ordinates in the | ||
# co-ordinate system of the second. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps it's time to remove this comment?
drizzle/utils.py
Outdated
_estimate_pixel_scale(wcs_from, refpix_from)) | ||
return pixmap, pscale_ratio | ||
|
||
return pixmap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function returns a different number of outputs based on the parameters that are passed. How is downstream code to deal with this? What is the benefit of adding this option here?
from . import cdrizzle | ||
|
||
|
||
SUPPORTED_DRIZZLE_KERNELS = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tophat
was deprecated. I'd say remove it.
drizzle/resample.py
Outdated
A class for managing resampling and co-adding of multiple images onto a | ||
common output grid. The main method of this class is :py:meth:`add_image`. | ||
The main functionality of this class is to resample and co-add multiple | ||
images onto one output image. In the simplest terms, it redistributes flux |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhere here it needs to mention that the resampling is performed using the drizzle
algorithm.
drizzle/resample.py
Outdated
out_img=out_img, | ||
out_wht=out_wht, | ||
out_ctx=out_ctx, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can create a Drizzle object with default values of parameters. In this case the _alloc_output_arrays
function does not work as expected. It does not generate arrays and does not raise an error.
|
||
if out_img is not None: | ||
out_img = np.asarray(out_img, dtype=np.float32) | ||
shapes.add(out_img.shape) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue with not handling the case of out_img is None
is that one can generate a default object which does not have an attribute Drizzle._out_img
.
drizzle/resample.py
Outdated
def _alloc_output_arrays(self, out_shape, max_ctx_id, out_img, out_wht, | ||
out_ctx): | ||
if out_shape is None: | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps take the check outside this function.
drizzle/resample.py
Outdated
def blot_image(data, pixmap, pix_ratio, exptime, output_pixel_shape, | ||
interp='poly5', sinscl=1.0): | ||
""" | ||
Resample input image using interpolation to an output grid. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree the docs can be improved here.
pixmap : 3D array | ||
The world coordinate system to resample on. | ||
|
||
output_pixel_shape : tuple of int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Perry
Regression tests using this branch and These tests were failing because |
Regression tests on |
3c799cf
to
a003f84
Compare
Re-running regression tests after most recent commits:
|
Regression tests agains released version of Roman pipeline: With released Both tests seem to have the same 28 failures, mostly in dark current tests and related to missing data files (probably regression tests have evolved meanwhile). |
Closes #144
Closes #133
Closes #130
Closes #125
WARNING: This first commit is the first draft and it is definitely not ready for merging and unit tests have not been updated. Also, documentation is incomplete.This PR quite radically changes the API of this package with intention to make this package independent of how data are stored (FITS files, asdf, etc.) and the instrument used (JWST, HST, etc.) and also to reduce the number of functions and duplication and the number of conflicting parameters (or parameters that are changing meaning in different functions).
If some items on the TODO list (in the code) are implemented such as support for weight map of the input image to be
None
, then these changes would result in performance increase for EXPTIME, EXPSQ weighting schemes (as opposite to IVM). I would argue that we go even further with this in not having two parameters for input weights:weight_map
andwht_scale
. I think these should combine into a singleweight
which can be either an array or a scalar. IMO, having two parameters is an unnecessary complication.BUG FIX: exposure time was undefined when
in_units
were notcps
. That is,drizzle
was producing random outputs when input image units were, let's say'counts'
.BUG FIX:
tdriz
signature was out of date.It is also worth mentioning this issue that probably should be handled in this PR too: spacetelescope/drizzlepac#1722