Define API for reader and image_mapper #143

Draft

TjarkMiener wants to merge 54 commits into master from move_get_image

Member

TjarkMiener commented Sep 12, 2024

This PR suggests APIs for the reader and image_mapper. The configuration system and component scheme of ctapipe are adopted. In addition, we adopt astropy tables for the batch generation. Reading works in monoscopic and stereoscopic mode for DL1 images and R1 waveforms. The code properly processes data with different array and telescope (divergent) pointings.

A subclass designed for the adv. trigger system processing R0 waveforms will be added in a separate PR.

–––
Closes #31 #104

TjarkMiener added 26 commits

July 19, 2024 21:00


          moved _get_image outside the dl1dh reader

When looping over a given dl1 table and a single dl1 event (charges, peak times and mask), we can now retrieve the 2D images (input of the CNNs) without initializing and running the dl1dh reader.


          move _get_waveform() outside the dl1dh reader as well

144c5f5

also create separate function for the trigger patches on R0 data


          removed flip for LST-1 real because it is not needed

4b2e0ef

also remove the apply IM functions because they can now be replaced by the internal .map_image() function


          renamed get_mapped_trigger_patch to get_mapped_triggerpatch

c4888f6


          remove trigger for stereo example description

e8e930f


          remove prefix support for camera_frame

98821c0

If prefix camera_frame is in the file, the user should add this prefix to the config file.


          rename and join parameter_selection and event_selection to quality_selection

fd5fc84


          allow quality cuts for processing real data

3860d87


          major refactoring for batch generation

f1a6fb1

added batch generator

removed dl1dh transform

dl1dh now provides a static batch, and we get the relevant information about the labels in the data loader of ctlearn


          removed redundant event and subarray info

679db24


          edit path to pointing table

fc55628

now stored in dl0 monitoring tree


          fix shape of trigger patch

a7755a3

last dimension of sample was missing


          fix get trigger features

93a9c09


          remove processor and transforms

b3ca9ed

replaced by new design


          fix stereo reading mode

31d6b53

Astropy table operations are now mainly used to retrieve the example identifiers for the stereo reading mode. The code base is therefore heavily reduced and the operations are more efficient.

Moved parameter settings outside the dl1dh. The user can also request to read the dl1b parameters by passing a list of column names to batch_generation()

Removed init skip when pandas hdf5 with example identifiers is provided. It is not needed anymore since we are now fast and efficient with astropy tables and their operations.

split transformation into sub-functions for better readability.


          keep simulation info in an astropy table

this is removing redundant code

astropy table operations should be used to retrieve sum(), min, max, etc.


          removed v5.0.0 support for real data and images

f174575


          use ctapipe SubarrayDescription for setting up

55bb05a

Everything related to the selection of the subarray is done with ctapipe now

Whenever a new file is processed, its SubarrayDescription is checked for consistency against the reference, which is taken from the first provided file; this ensures that all files have the same subarray.
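The consistency check described above can be sketched in a library-agnostic way. Here `check_subarray_consistency` and `load_subarray` are hypothetical names, and the subarray objects are assumed to support equality comparison; the toy tuples merely stand in for real subarray descriptions:

```python
def check_subarray_consistency(paths, load_subarray):
    """Return the reference subarray after checking every file against it.

    ``load_subarray`` is a hypothetical callable that maps a file path to
    an object supporting ``==`` / ``!=``.
    """
    reference = None
    for path in paths:
        subarray = load_subarray(path)
        if reference is None:
            # The first provided file defines the reference.
            reference = subarray
        elif subarray != reference:
            raise ValueError(
                f"Subarray in '{path}' differs from the reference in '{paths[0]}'."
            )
    return reference


# Toy stand-in for files and their subarrays (hypothetical names).
toy_files = {
    "a.h5": ("LST_1", "LST_2"),
    "b.h5": ("LST_1", "LST_2"),
    "c.h5": ("LST_1",),
}
reference = check_subarray_consistency(["a.h5", "b.h5"], toy_files.get)
```

Passing `["a.h5", "c.h5"]` instead raises a `ValueError`, because the second file describes a different subarray.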


          define reading API

1e3a86a

It defines a reading API with two child classes for reading in mono and stereo mode. The Image and Waveform child classes then inherit from both of them (mono and stereo). Finally, the trigger child class only inherits from the mono child class.
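The hierarchy described in this commit message can be pictured roughly as follows. All class and method names in this sketch are hypothetical placeholders chosen for illustration; the actual classes in the PR may be named and structured differently:

```python
class ReaderBase:
    """Hypothetical common base: stores the mode and dispatches to it."""

    def __init__(self, mode="mono"):
        self.mode = mode

    def generate_batch(self, batch_indices):
        handler = getattr(self, f"_generate_{self.mode}_batch", None)
        if handler is None:
            raise NotImplementedError(
                f"{type(self).__name__} does not support '{self.mode}' mode."
            )
        return handler(batch_indices)


class MonoReading(ReaderBase):
    def _generate_mono_batch(self, batch_indices):
        return {"mode": "mono", "indices": list(batch_indices)}


class StereoReading(ReaderBase):
    def _generate_stereo_batch(self, batch_indices):
        return {"mode": "stereo", "indices": list(batch_indices)}


class ImageReader(MonoReading, StereoReading):
    """Inherits both reading modes."""


class WaveformReader(MonoReading, StereoReading):
    """Inherits both reading modes."""


class TriggerReader(MonoReading):
    """Only the mono reading mode is available."""


stereo_batch = ImageReader(mode="stereo").generate_batch([0, 1])
```

With this layout a `TriggerReader` in stereo mode fails early with `NotImplementedError` instead of silently falling back to mono reading.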


          added classes into __all__

98862ef


          pass batch to _get_features() and return feature dict and batch Table

c7f525e

this is somehow needed because we modify the batch Table for the Trigger subclass

fix trigger subclass


          remove redundant flip

cdb4674


          make image mapper methods as API

7bacf27


          simplify map_image()

d32d503


          make reader as API

0e33f24


          polish docstrings

ce5ee63

TjarkMiener added enhancement ctapipe ready for review labels

TjarkMiener requested a review from maxnoe

September 12, 2024 15:48

TjarkMiener added 3 commits

September 18, 2024 17:17


          get default image shape from the data

4d30599

removed magic numbers


          use f-string

a99bd9b


          make process type an enum

cf12cd0

TjarkMiener requested a review from Pablitinho

September 19, 2024 08:01


          added the data loader with keras api

d68559f

Pablitinho approved these changes

Collaborator

Pablitinho left a comment

LGTM

TjarkMiener mentioned this pull request

(Implementation) ParticleNet model ctlearn-project/ctlearn#207

Open

mexanick reviewed

mexanick left a comment

A few comments inline. It is quite difficult to comment on such a huge PR.
I suggest implementing a proper test suite instead of a test notebook in order to automate testing. Also, it would be great to see a class diagram for the refactored ImageMapper showing what functionality is inherited, what is overloaded and what is extended.

dl1_data_handler/image_mapper.py

+                      self.camera_type = self.geometry.name
+                      self.n_pixels = self.geometry.n_pixels
+                      # Rotate the pixel positions by the pixel to align
+                      self.geometry.rotate(self.geometry.pix_rotation)

mexanick Sep 24, 2024

What is the purpose of this? If I understand correctly, pix_rotation is an angle, at which every pixel is rotated, but not necessarily the camera, as for that one there's cam_rotation. From the comment above this line, perhaps cam_rotation shall be used instead?

dl1_data_handler/image_mapper.py

-                                  self.image_shapes[camtype][1] + self.default_pad * 4,
-                                  self.image_shapes[camtype][2],
-                              )
+                  def _get_virtual_pixels(self, x_ticks, y_ticks, pix_x, pix_y):

mexanick Sep 24, 2024

While you only need the public functions' docstrings for the user documentation, the dev documentation will greatly benefit from having private members documented. I'm with @Pablitinho on this. Just use copilot, you will be surprised how smart it can be ;)

dl1_data_handler/image_mapper.py

+                          self.geometry.pix_y.value, decimals=constants.decimal_precision
+                      )
+                      self.x_ticks = np.unique(self.pix_x).tolist()

mexanick Sep 24, 2024

Am I right that you assume regularly spaced pixels for any kind of camera geometry? If not, did you test any geometry with a shuffle step (e.g. square pixels where rows are shifted by 25%)?
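To make the question concrete, here is a small NumPy sketch of what `np.unique` returns as ticks for a regular square grid compared with one whose odd rows are shifted by 25% of the pixel pitch. The 4x4 toy geometry and the rounding precision are assumptions for illustration, not a camera handled by the PR:

```python
import numpy as np

pitch = 1.0
n_side = 4
cols, rows = np.meshgrid(np.arange(n_side), np.arange(n_side))

# Regular square grid: every row shares the same x positions.
x_regular = (cols * pitch).ravel()

# Staggered grid: odd rows are shifted by 25% of the pixel pitch.
x_shifted = (cols * pitch + 0.25 * pitch * (rows % 2)).ravel()

ticks_regular = np.unique(np.around(x_regular, decimals=3)).tolist()
ticks_shifted = np.unique(np.around(x_shifted, decimals=3)).tolist()

# Spacing between neighbouring ticks for the staggered grid.
spacing_shifted = np.around(np.diff(ticks_shifted), decimals=3)
```

The regular grid yields one tick per column, while the staggered grid yields twice as many ticks with alternating spacings of 0.25 and 0.75, so any code assuming equidistant ticks would need to handle this case explicitly.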

dl1_data_handler/image_mapper.py

+                          self.pix_x, self.x_ticks = self._smooth_ticks(self.pix_x, self.x_ticks)
+                          self.pix_y, self.y_ticks = self._smooth_ticks(self.pix_y, self.y_ticks)
+                      # At the edges of the cameras some mapping methods run into issues.

mexanick Sep 24, 2024

Is this because your "ticks" max out at the maxima of pix_x, pix_y and do not take into account the pixel's area (border)?

dl1_data_handler/image_mapper.py

+                  def _create_virtual_hex_pixels(
+                      self, first_ticks, second_ticks, first_pos, second_pos
+                  ):
+                      """Create virtual hexagonal pixels outside of the camera."""

mexanick Sep 24, 2024

(even inline) will help

dl1_data_handler/image_mapper.py

+                          **kwargs,
+                      )
+                      if geometry.pix_type != PixelShape.HEXAGON:

mexanick Sep 24, 2024

why one can't oversample a square pixel grid?

dl1_data_handler/image_mapper.py

+                          **kwargs,
+                      )
+                      if geometry.pix_type != PixelShape.HEXAGON:

mexanick Sep 24, 2024

same question as above

dl1_data_handler/reader.py

    
                  ----------
                  quality_query : TableQualityQuery
                      An instance of TableQualityQuery to apply quality criteria to the data.
                  files : OrderedDict

mexanick Sep 24, 2024

Why do you need an OrderedDict here? I don't see any use of its specific features in the code. On the other hand, a standard dict retains the original order of items since Python 3.6.
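A quick illustration of the point about ordering, using hypothetical file names. Insertion order is an implementation detail of CPython 3.6 and a language guarantee from Python 3.7 onwards; the remaining differences are order-sensitive equality and helpers such as `move_to_end()`:

```python
from collections import OrderedDict

paths = ["run3.h5", "run1.h5", "run2.h5"]  # hypothetical file names

plain = {path: None for path in paths}
ordered = OrderedDict((path, None) for path in paths)

# Both containers iterate in insertion order.
same_order = list(plain) == list(ordered) == paths

# Equality is order-insensitive for dict but order-sensitive
# when comparing two OrderedDict instances.
dict_equal = {"a": 1, "b": 2} == {"b": 2, "a": 1}
ordered_equal = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
```

If neither order-sensitive comparison nor `move_to_end()` is needed, a plain `dict` behaves the same for iterating over the files.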

dl1_data_handler/reader.py

    
                                  }
                              )
                              def _multiplicity_cut_tel_type(table, key_colnames):
                                  self.min_telescopes_of_type.attach_subarray(self.subarray)

mexanick Sep 24, 2024

Bad design: here you modify objects that are out of scope for this function. Also, you don't use the key_colnames local variable.

Member Author

TjarkMiener Oct 4, 2024

Astropy filter functionality requires a function with those two arguments. See here.

dl1_data_handler/reader.py

    
                          events = events.group_by(["obs_id", "event_id"])
                          def _multiplicity_cut_subarray(table, key_colnames):
                              return len(table) >= self.min_telescopes

mexanick Sep 24, 2024

why do you need key_colnames?

TjarkMiener marked this pull request as draft

October 16, 2024 07:29

TjarkMiener added 21 commits

October 16, 2024 13:33


          remove files from config and pass them in the constructor

8dd7241


          update to ctapipe v0.22.0

d92c906


          temp fix keras/TF bug

6d37efa


          bug fix

911d482


          added skeleton for on-the-fly waveform cleaning with digital sum and DBSCAN

43547b2


          support keras2 & keras3

a953e4e


          bug fix; import keras

f6777b5


          log number of particles per type

83abe3d


          fix logger

36e7c4b


          set BilinearMapper as default

31167e0


          comment out qual cuts

ea22625


          add a sort and remove the shuffle in the loader for prediction

8cb4721


          fix proper treatment of channels with relative and cleaned options

b3543f8


          fix ToDo string

23eb274


          fix help for channels


          upgrade to ctapipe v0.23.0

3f1c07a


          retrieve trigger table also for MCs

ab5255a


          only apply transformation to sims data

473f82e


          add tel_id in sort

2a928f0


          fix on_epoch_end

2c0419c


          polish docstring

3dad0f9

Reviewers

mexanick left review comments

Pablitinho approved these changes

Awaiting requested review from maxnoe

Awaiting requested review from kosack

Awaiting requested review from nietootein

Awaiting requested review from BastienLacave

Labels

ctapipe enhancement ready for review