Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Source detection step option to fit model PSFs #841

Merged
merged 7 commits into from
Nov 9, 2023

Conversation

bmorris3
Copy link
Collaborator

@bmorris3 bmorris3 commented Aug 21, 2023

Description

This PR improves astrometry in the Source Detection step by (optionally) making use of PSF fitting methods added in #794.

The centroid accuracy of the PSF fitting methods depends on many factors, like the precision of the PSF model used in the fit (which may be traded off for longer runtimes), the flux of the source, the WFI filter, and to a lesser extent, the detector position, etc. The revised tests for source detection in this PR check the following criteria when using PSF fitting on bright sources in the F087 filter:

  • the median centroid residual is smaller than 3 milliarcseconds
  • the maximum centroid residual is smaller than 11 milliarcseconds
  • the recovered centroids and their errors are all <2-sigma consistent with the injected values

(see here in the diff for these assertions).

These upper-limits in the tests can be revised when we:

  1. Precompute PSF models and make them available (e.g. on CRDS). With more accurate (and more expensive to compute) PSF models, we can get more precise centroids, and revise the upper limits in the tests downwards.
  2. Have more specific requirements for a given filter and/or a given source flux

This PR supersedes #936.

Resolves RCAL-609

Closes #830
Closes #936

Checklist

  • added entry in CHANGES.rst under the corresponding subsection
  • updated relevant tests
  • updated relevant documentation
  • updated relevant milestone(s)
  • added relevant label(s)

@codecov
Copy link

codecov bot commented Aug 21, 2023

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Files Coverage Δ
romancal/lib/psf.py 76.34% <100.00%> (-8.77%) ⬇️
romancal/source_detection/source_detection_step.py 82.65% <86.95%> (+0.37%) ⬆️

📢 Thoughts on this report? Let us know!

@bmorris3 bmorris3 force-pushed the source-detection-psf branch 2 times, most recently from 62bdb19 to a1de71a Compare September 11, 2023 19:58
@bmorris3 bmorris3 force-pushed the source-detection-psf branch from a1de71a to 343f0d7 Compare September 21, 2023 19:22
@bmorris3 bmorris3 force-pushed the source-detection-psf branch 4 times, most recently from 47bdc1e to a6bc261 Compare November 2, 2023 13:33
@bmorris3 bmorris3 marked this pull request as ready for review November 2, 2023 13:49
@bmorris3 bmorris3 requested a review from a team as a code owner November 2, 2023 13:49
Copy link
Collaborator

@schlafly schlafly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Have you run any romanisim or regtest simulated images? I'm still worried a bit about the DAOStarFinder options, and it would be good to see that we detect most of the real, bright sources.

sources = daofind(self.data - bkg.background, mask=self.coverage_mask)
else:
sources = daofind(self.data, mask=self.coverage_mask)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Staring a little at the threshold logic, I'm confused. daofind doesn't do it natively, I think, but morally one wants to find all point sources that are more than N sigma above background, for N ~ 5. That looks potentially like what 'calc_threshold' mode is doing, but there it looks like the background is both part of the threshold and subtracted from the data, which seems like doubly subtracting. Meanwhile the mode without calc_threshold doesn't look like it has a notion of RMS and instead only has a notion of background, and the threshold is set exactly at the background?

I'm not sure there are any cases where we don't want to measure and remove a background, so I think I'd recommend just removing the calc_threshold = False branch, and then fixing the calc_threshold = True branch so that doesn't ~doubly subtract the background?

Sorry, I realize that this isn't new code, etc., but we should still fix it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I honestly did not understand what was happening in this block either. I added the else block that you've commented on here to make this PR work, but I agree that the logic needs revision.

catalog["xcentroid"].value,
catalog["ycentroid"].value,
catalog["flux"].value,
]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how hard it is to check, and maybe we defer this to another PR, but we really want to use at least a structured array here eventually, as well as saving all the other columns that may have value.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will ping @PaulHuwe on Slack to ask about this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I confirmed that structured arrays are supported in ASDF. You can test for yourself with:

import numpy as np
import roman_datamodels.datamodels as rdd
from roman_datamodels.maker_utils import mk_level2_image

im = rdd.ImageModel(mk_level2_image())

idx = np.random.randint(0, 100, 10)
x = np.random.uniform(0, 100, 10)
y = np.random.uniform(0, 100, 10)

recarr = np.core.records.fromarrays(
    [idx, x, y], 
    names="idx, x, y", 
    formats=[int, float, float]
)

im.meta['test_recarr'] = recarr

im.to_asdf('test_recarr.asdf')

im_reloaded = rdd.open('test_recarr.asdf')

recarr2 = im_reloaded.meta['test_recarr']

assert np.all(recarr2 == recarr)

I propose that we make a follow-up PR for the move to structured arrays, since that will touch tweakreg a bunch. Thoughts?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good, thanks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As long as no one cares about accessing this information as text in the metadata, then this is great.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By default, this catalog is later deleted by tweakreg, so it shouldn't be useful to anyone unless they stop the pipeline part-way through.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely don't want a potentially huge catalog converted to text anywhere!

input_model.meta.source_detection[
"psf_catalog"
] = psf_photometry_table

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ultimately we want one L2 source catalog for each image with a ~fixed schema. That would argue for merging the PSF & initial finder results together---they're row-by-row matched, right? Could also be part of a separate PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we should have all results (source detection and optionally PSF) reported in one table. Since tweakreg currently expects an unstructured array, I won't implement the update in this PR (as discussed in #841 (comment)).

In the follow-up PR, we can have a structured array with columns that tweakreg uses, which will be the PSF fitting results if they're available and the DAO results if not, and also extra columns that tweakreg ignores.

@bmorris3 bmorris3 force-pushed the source-detection-psf branch from b8a8eb5 to 7b55518 Compare November 8, 2023 18:27
@github-actions github-actions bot added the dependencies Pull requests that update a dependency file label Nov 8, 2023
@bmorris3
Copy link
Collaborator Author

bmorris3 commented Nov 8, 2023

@schlafly: All related regression tests are passing (failures are unrelated).

Copy link
Collaborator

@schlafly schlafly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, looks good to me, thanks!

@bmorris3 bmorris3 merged commit cf17a69 into spacetelescope:main Nov 9, 2023
26 of 27 checks passed
@bmorris3 bmorris3 requested review from mairanteodoro and removed request for mairanteodoro November 10, 2023 18:41
@braingram
Copy link
Collaborator

@bmorris3 Do you recall the reasoning for the webbpsf exact pin?

"webbpsf == 1.2.1",

A new version (1.3.0) is available which appears to be compatible with numpy 2.0 (the version pinned here is not).

@braingram braingram mentioned this pull request Jun 24, 2024
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dependencies Pull requests that update a dependency file testing
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add PSF fitting option to Source Detection step
4 participants