-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Commissioning fixes and refactoring for re-extraction using the frontend UI. #30
Changes from 9 commits
461cabf
c416caa
e6e3935
7ea7521
ceb1305
5487148
0070d28
cd02816
fab2e53
9c69174
cc8bf92
6d39c92
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,106 @@ | ||
import numpy as np | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I rewrote the background subtraction stage entirely (and moved it to the background.py file). I have separated this logic from the extraction logic to make it more modular to replace and test. I tried a wide variety of bsplines and two fits here without success. The scipy bplines either had significant issues with the number of points we are fitting in the whole 2d frame or could not capture the variation near sky line edges (the key reason to use 2d fits from Kelson 2003). I was originally using 2d polynomials, but to get the order high enough to capture the variation in the skyline edges, I was introducing significant ringing in the fits (to the point of oscillating between positive and negative values in the data). I am now doing something closer to what IRAF did, interpolating the background regions on the wavelength bin centers, fitting a 1d polynomial, and interpolating back on the original wavelengths to subtract per pixel. In this way, it is only the background model that is interpolated and not the pixel values themselves. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This would be good to capture somewhere in the documentation. |
||
from astropy.table import Table, vstack | ||
from scipy.interpolate import CloughTocher2DInterpolator | ||
from banzai.stages import Stage | ||
from numpy.polynomial.legendre import Legendre | ||
|
||
|
||
def fit_background(data, background_order=3): | ||
data['data_bin_center'] = 0.0 | ||
data['uncertainty_bin_center'] = 0.0 | ||
for order in [1, 2]: | ||
in_order = data['order'] == order | ||
|
||
data_interpolator = CloughTocher2DInterpolator(np.array([data['wavelength'][in_order], | ||
data['y_profile'][in_order]]).T, | ||
data['data'][in_order].ravel(), fill_value=0) | ||
uncertainty_interpolator = CloughTocher2DInterpolator(np.array([data['wavelength'][in_order], | ||
data['y_profile'][in_order]]).T, | ||
data['uncertainty'][in_order].ravel(), fill_value=0) | ||
|
||
data['data_bin_center'][in_order] = data_interpolator(data['wavelength_bin'][in_order], | ||
data['y_profile'][in_order]) | ||
data['uncertainty_bin_center'][in_order] = uncertainty_interpolator(data['wavelength_bin'][in_order], | ||
data['y_profile'][in_order]) | ||
|
||
# Assume no wavelength dependence for the wavelength_bin = 0 and first and last bin in the order | ||
# which have funny edge effects | ||
background_bin_center = [] | ||
for data_to_fit in data.groups: | ||
if data_to_fit['wavelength_bin'][0] == 0: | ||
continue | ||
# Catch the case where we are an edge and fall outside the qhull interpolation surface | ||
if np.all(data_to_fit['data_bin_center'] == 0): | ||
data_column = 'data' | ||
uncertainty_column = 'uncertainty' | ||
else: | ||
data_column = 'data_bin_center' | ||
uncertainty_column = 'uncertainty_bin_center' | ||
in_background = data_to_fit['in_background'] | ||
in_background = np.logical_and(in_background, data_to_fit[data_column] != 0) | ||
polynomial = Legendre.fit(data_to_fit['y_profile'][in_background], data_to_fit[data_column][in_background], | ||
background_order, | ||
domain=[np.min(data_to_fit['y_profile']), np.max(data_to_fit['y_profile'])], | ||
w=1/data_to_fit[uncertainty_column][in_background]**2) | ||
|
||
background_bin_center.append(polynomial(data_to_fit['y_profile'])) | ||
|
||
data['background_bin_center'] = 0.0 | ||
data['background_bin_center'][data['wavelength_bin'] != 0] = np.hstack(background_bin_center) | ||
|
||
results = Table({'x': [], 'y': [], 'background': []}) | ||
for order in [1, 2]: | ||
in_order = np.logical_and(data['order'] == order, data['wavelength_bin'] != 0) | ||
background_interpolator = CloughTocher2DInterpolator(np.array([data['wavelength_bin'][in_order], | ||
data['y_profile'][in_order]]).T, | ||
data['background_bin_center'][in_order], fill_value=0) | ||
background = background_interpolator(data['wavelength'][in_order], data['y_profile'][in_order]) | ||
# Deal with the funniness at the wavelength bin edges | ||
upper_edge = data['wavelength'][in_order] > np.max(data['wavelength_bin'][in_order]) | ||
background[upper_edge] = data[in_order]['background_bin_center'][upper_edge] | ||
lower_edge = data['wavelength'][in_order] < np.min(data['wavelength_bin'][in_order]) | ||
background[lower_edge] = data['background_bin_center'][in_order][lower_edge] | ||
order_results = Table({'x': data['x'][in_order], 'y': data['y'][in_order], 'background': background}) | ||
results = vstack([results, order_results]) | ||
# Clean up our intermediate columns for now | ||
data.remove_columns(['data_bin_center', 'uncertainty_bin_center', 'background_bin_center']) | ||
return results | ||
|
||
|
||
def set_background_region(image): | ||
if 'in_background' in image.binned_data.colnames: | ||
return | ||
|
||
image.binned_data['in_background'] = False | ||
for order_id in [2, 1]: | ||
in_order = image.binned_data['order'] == order_id | ||
this_background = np.zeros(in_order.sum(), dtype=bool) | ||
data = image.binned_data[in_order] | ||
for background_region in image.background_windows[order_id - 1]: | ||
in_background_reg = data['y_profile'] >= (background_region[0] * data['profile_sigma']) | ||
in_background_reg = np.logical_and(data['y_profile'] <= (background_region[1] * data['profile_sigma']), | ||
in_background_reg) | ||
this_background = np.logical_or(this_background, in_background_reg) | ||
image.binned_data['in_background'][in_order] = this_background | ||
for order in [1, 2]: | ||
for reg_num, region in enumerate(image.background_windows[order - 1]): | ||
image.meta[f'BKWO{order}{reg_num}0'] = ( | ||
region[0], f'Background region {reg_num} for order:{order} minimum in profile sigma' | ||
) | ||
image.meta[f'BKWO{order}{reg_num}1'] = ( | ||
region[1], f'Background region {reg_num} for order:{order} maximum in profile sigma' | ||
) | ||
|
||
|
||
class BackgroundFitter(Stage): | ||
DEFAULT_BACKGROUND_WINDOW = (7.5, 15) | ||
|
||
def do_stage(self, image): | ||
if not image.background_windows: | ||
background_window = [[-self.DEFAULT_BACKGROUND_WINDOW[1], -self.DEFAULT_BACKGROUND_WINDOW[0]], | ||
[self.DEFAULT_BACKGROUND_WINDOW[0], self.DEFAULT_BACKGROUND_WINDOW[1]]] | ||
image.background_windows = [background_window, background_window] | ||
set_background_region(image) | ||
background = fit_background(image.binned_data) | ||
image.background = background | ||
return image |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
from banzai.stages import Stage | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I broke the data binning into its own stage to make testing easier and more modular. |
||
from banzai_floyds.utils.binning_utils import bin_data | ||
|
||
|
||
class Binning(Stage): | ||
def do_stage(self, image): | ||
image.binned_data = bin_data(image.data, image.uncertainty, image.wavelengths, | ||
image.orders, image.mask) | ||
return image |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed some lines in the wavelength solution that were blended. It's fine to include blended lines when we are doing the simultaneous fit of all the lines, but the initial centroiding per line to get the initial wavelength solution parameters can have catastrophic failures. I would like to add a category of lines in the future that doesn't centroid initially but does include the line in the full solution fit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to recognize a catastrophic failure reliably without human intervention?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not so far. I would like to wait on that as a future improvement as we get some experience with how the data quality looks.