Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactoring coreas to include interpolation #688

Open
wants to merge 53 commits into
base: develop
Choose a base branch
from

Conversation

cg-laser
Copy link
Collaborator

@cg-laser cg-laser commented Jun 5, 2024

Major refactoring of the coreas interface to read the electric field from coreas hdf5 files.
A new class readCoREASDetector.py can be used to read an array and get the interpolated position.

This is largely a copy of #685 which is an undefined state because of an accidental force push. Please never force-push unless you know what you are doing.

@cg-laser
Copy link
Collaborator Author

cg-laser commented Jun 5, 2024

@MijnheerD @lpyras I'm getting these warning from the interpolator module all the time:

Trace arrival times were not set during init, only relative timings are returned!
warning: negative values in abs_spectrum found: 6 times. Setting to zero.

are we not using the interpolator correctly?

@cg-laser
Copy link
Collaborator Author

@MijnheerD @lpyras I added an example of a "LOFAR"-style Xmax reco. The example is using a vertical shower. It would probably be good to also test with an inclined shower as some potential bugs might only appear then.

Before merging this PR, it would be good if the module would be tested a bit more. Is my understanding correct, that this module has not been used by anyone and is essentially untested?

@cg-laser
Copy link
Collaborator Author

The output of the reconstruction looks like this.
A remaining ToDo is a correct uncertainty estimation of the energy fluence. This should be implemented into the efieldSignalReconstructor

Xmax_fit_0
footprint_0_core0 000000_0 000000

@lpyras
Copy link
Collaborator

lpyras commented Aug 5, 2024

@MijnheerD @lpyras I'm getting these warning from the interpolator module all the time:

Trace arrival times were not set during init, only relative timings are returned!
warning: negative values in abs_spectrum found: 6 times. Setting to zero.

are we not using the interpolator correctly?

@MijnheerD I dont really know what that means. I always thought we are anyway only interested in the relative timing. I implemented access now the timing information provided by the cr-pulse-interpolator. Is there something else to do?

@lpyras
Copy link
Collaborator

lpyras commented Aug 5, 2024

@MijnheerD @lpyras I added an example of a "LOFAR"-style Xmax reco. The example is using a vertical shower. It would probably be good to also test with an inclined shower as some potential bugs might only appear then.

Before merging this PR, it would be good if the module would be tested a bit more. Is my understanding correct, that this module has not been used by anyone and is essentially untested?

I tested it before the coreas refactoring for effective area calculation. That yielded reasonable results.

@lpyras
Copy link
Collaborator

lpyras commented Aug 6, 2024

I changed everything we discussed so far e.g. change the coreas readers in a way that they only rely on the readCorsika7() function. Unfortunately I didn't have time to test everything. Any volunteers? Otherwise I try to do it in the next weeks. :)

@MijnheerD
Copy link
Collaborator

Sorry for coming in late on this, I thought I already answered some the questions but apparently I didn't.

@MijnheerD @lpyras I'm getting these warning from the interpolator module all the time:

Trace arrival times were not set during init, only relative timings are returned!
warning: negative values in abs_spectrum found: 6 times. Setting to zero.

are we not using the interpolator correctly?

@MijnheerD I dont really know what that means. I always thought we are anyway only interested in the relative timing. I implemented access now the timing information provided by the cr-pulse-interpolator. Is there something else to do?

This is important, because usually the relative timing is not enough. The problem is that for larger starshapes, the CoREAS time traces start at different times. We added the option to pass the start times to the interpolator, which it can then interpolate for you as well. The warning tells you that the start times were not provided to the constructor. On the lofar/interpolator branch we added this on line 65 of "readCoREASInterpolator.py". Then on line 191 we retrieve the start times, which are used in the make_sim_station function give the traces the correct start time.

@MijnheerD @lpyras I added an example of a "LOFAR"-style Xmax reco. The example is using a vertical shower. It would probably be good to also test with an inclined shower as some potential bugs might only appear then.

Before merging this PR, it would be good if the module would be tested a bit more. Is my understanding correct, that this module has not been used by anyone and is essentially untested?

I agree we should test it more, and I think once the LOFAR pipeline is done we will have an excellent playground for that. Although we can also already start testing on simulations, like the Xmax example (thanks @cg-laser for the script!). I am not sure when we will be able to get to it, but in September we should have a master student who will need to work with the interpolator so he might be able to test it more thoroughly.

@anelles
Copy link
Collaborator

anelles commented Aug 7, 2024

I just tested the branch and it actually fails with a pip installed cr_interpolator on:
File "/usr/local/lib/python3.12/site-packages/cr_pulse_interpolator/signal_interpolation_fourier.py", line 9, in
import interpolation_fourier as interpF
ModuleNotFoundError: No module named 'interpolation_fourier
I feel like I have seen this error discussed. Do we need a new cr_interpolator release?

return obs_positions_vxB_vxvxB


def read_CORSIKA7(input_file, declination=None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can someone help me understand why the declination parameter is necessary here? Does it not come from the input_file itself?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without having checked the code and what the function does with the declination: No the declination does not come from the input_file. The normal corsika does not care about the declination at all as the coordinate system is defined by the magnetic north. There is the possibility to store the declination as meta data in an *reas file but it wont effect the simulation at all (its only intended to be stored along a simulation to use it once the simulation is converted to a different coordinate system in an reconstruction framework). Did that help?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. But one could get it from the radiotools helper function, which has the 3D magnetic field, right? Do I understand that correctly?

@lpyras
Copy link
Collaborator

lpyras commented Nov 25, 2024

Should we also update the pyproject.toml file to include the cr-pulse-interpolator library as dependency for the interpolation?

I think Sjoerd added it at some point. But directly the link to the GitHub rather than the pip installation.

Copy link
Collaborator

@MijnheerD MijnheerD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I redid major parts of the code and fixed up (almost) all of the docstings, so type hints should now work better as well. The major things that remain are the following:

  1. The fluence interpolation is difficult to get working, because one has to set a fluence parameter manually. I think we should have an option for the fluences to be calculated from the traces (maybe one provides a function that does the operation, so that you can choose how to do this).
  2. I don't like the fact that parameters such as zenith, azimuth, etc. are stored in multiple places (SimShower, SimStation and the ElectricFields). I think we should make a choice for each of them, and always retrieve it from that place.
  3. I have not checked the calculate_simulation_weights() function, because I have no idea what it is supposed to do.
  4. Should the plotting functions really be part of this module?

I also left some smaller comments in the code itself.

sim_station.set_parameter(stnp.azimuth, azimuth)
sim_station.set_parameter(stnp.zenith, zenith)
sim_station.set_parameter(stnp.cr_energy, energy)
sim_station.set_magnetic_field_vector(magnetic_field_vector)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally I don't like the fact that we are storing this information twice (or maybe even three times), namely on the SimShower and SimStation level. Would it not be better if we just stored everything in the SimShower and use the SimStation only to store the traces?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no strong opinion. I though with read_CORSIKA7() all comes from one place. I guess @cg-laser has to say something about this.

NuRadioReco/modules/io/coreas/coreas.py Outdated Show resolved Hide resolved
"No high-level quantities in HDF5 file, not setting EM energy, this warning will be only printed once")
warning_printed_coreas_py = True
if coreas_sim_station.is_cosmic_ray():
sim_station.set_is_cosmic_ray()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we keep this? This will always be True if the Event is created with read_CORSIKA7

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also legacy. @cg-laser ?

efield_pos = []
for efield in efields:
efield_pos.append(efield.get_position())
efield_pos = np.array(efield_pos)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an array shaped like (nr_of_efields, 3), so it makes no sense to use this for the core. But I don't understand what the core is supposed to be here? Just the absolute position of the station? Would it not be better to remove the detector and station_id parameters and just allow the user to specify the core they want?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is from before we had the interpolation. The idea was, that one uses a shower n_obs times to calculate the effective area. So to avoid interpolation we place the core in a way, that a certain observer is on top of a single detector station. I agree, that this is confusing but it serves its purpose. For this, also the calculate_simulation_weights() function was used. I will follow up in another comment.


return self.fluence_interpolator

def plot_footprint_fluence(self, radius=300):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be part of the interpolation module?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found it useful, but I don't insist on it :)

self.electric_field_on_sky,
signals_start_times=self.efield_times[:, 0],
lowfreq=(interp_lowfreq - 0.01) / units.MHz,
highfreq=(interp_highfreq + 0.01) / units.MHz,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the -/+ 0.01 ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a margin to avoid edge effects when applying filters. But I am not sure if it is really necessary. Feel free to delete it.

"""
fluence_per_position = [
np.sum(efield[quantity]) for efield in self.sim_station.get_electric_fields()
] # the fluence is calculated per polarization, so we need to sum them up
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be an option that this is calculated from the traces?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cg-laser ? I am happy with everything.

else:
return self.geo_star_radius

def initialize_efield_interpolator(self, interp_lowfreq, interp_highfreq):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the LOFAR version we had an optional parameter for the interpolation parameters. I think that was a clean solution to have some flexibility for the user. What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you post a link to the LOFAR interpolation?

return fig, ax

def get_interp_efield_value(self, position_on_ground, core):
"""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am removing the core parameter, because I think it is very confusing. This module is purely about the interpolation,so everything should be from the reference frame of the original starshape IMHO. If one wants to shift the antenna positions, to account for a core shift, I think the should do this higher up and subtract the core from the position_on_ground value. What do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, we can move the function to another place. I placed it here because the interpolator object is stored in that module. Ideally, it would live in readCoREASDetector.py I guess.

@MijnheerD
Copy link
Collaborator

I just added one important TODO item to the coreas.py file: when the shower comes from close to zenith or anywhere on the N-S axis, we should rotate the on-sky traces before interpolation. This avoids issues with the amplitudes becoming really small in some antennas, which the interpolator cannot really handle. So we should implement a method to rotate the traces back and forth.

@lpyras
Copy link
Collaborator

lpyras commented Dec 4, 2024

Thank you Mitja for putting in all the work :)
About the calculate_simulation_weights()

  1. I have not checked the calculate_simulation_weights() function, because I have no idea what it is supposed to do.

We used this function for the effective area calculation. Because of limited resources we only used one detector station and used the simulated showers multiple times. To avoid interpolation, we placed the shower core in a way, that the detector station is at the same position as the observer and determined the trigger efficiency. To figure out which area of the shower we would detect with that station we calculated a "cell" around that station that is represented by that observer position. For the effective area we summed up the cells on which we triggered for each energy and zenith bin. The function calculate_simulation_weights() calculated the area around an observer as shown in the figure below.

fig_simweights

For a whole detector array we used a different method, also to simulated coincidences. But as a first guess this works fine. I think we should keep it, since we are also keeping the readCoREASStation.py.

@@ -267,41 +267,7 @@ def read_CORSIKA7(input_file, declination=None):
zenith, azimuth, magnetic_field_vector = get_angles(corsika, declination)
energy = corsika['inputs'].attrs["ERANGE"][0] * units.GeV # Assume fixed energy

# Create RadioShower to store simulation parameters in Event
Copy link
Collaborator

@lpyras lpyras Dec 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if @cg-laser goes along with that. The idea was, that read_CORSIKA7() is the "mother" function from which all other functions are derived and which can easily be replaced by read_CORSIKA8()

@@ -426,63 +392,62 @@ def create_sim_station(station_id, evt, weight=None):
return sim_station


def make_sim_shower(corsika, observer=None, detector=None, station_id=None):
def make_sim_shower(corsika, declination=0):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function should have another name to allow for a deprecation warning. As mentioned before, I think it should be part of read_Corsika.

Comment on lines +104 to +105
shower_axis_vector = hp.spherical_to_cartesian(zenith / units.rad, azimuth / units.rad)
geomagnetic_angle = hp.get_angle(magnetic_field_vector, shower_axis_vector) * units.rad
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand you want to be "super safe" here. But I think the idea of the unit system of NuRadio is that once we read external data in we do not have to use units ever again until we start plotting or exporting data again. This also means we will never change the default units.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants