Updated pargen.utils.load_data to use LH5Iterator and field_mask to be more memory efficient #589

Open
wants to merge 5 commits into
base: main


iguinn
Collaborator

@iguinn iguinn commented Aug 25, 2024

load_data was using too much memory (at least for P08, which is a larger dataset) and crashing my attempts to process it on NERSC. This PR switches it to pass a field mask when reading from the input files, so only the needed fields are loaded, and to use the LH5Iterator to limit the number of entries in memory at once. It also improves the comments and docstring.

Note: this should not be merged until legend-exp/legend-pydataobj#100 is also merged.
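
For readers unfamiliar with the approach, here is a minimal sketch of the general idea: read only the requested fields, a fixed number of entries at a time, and keep only the rows that survive a selection. This is not the pygama or lgdo code. The `iter_chunks` and `load_selected` helpers, the dict-of-lists table, the column names, and the threshold cut are all hypothetical stand-ins chosen for illustration.

```python
from typing import Dict, Iterator, List, Sequence

# Hypothetical stand-in for an on-disk table: column name -> list of values.
Table = Dict[str, List[float]]


def iter_chunks(
    table: Table, field_mask: Sequence[str], buffer_len: int
) -> Iterator[Table]:
    """Yield successive chunks holding only the columns named in field_mask."""
    if buffer_len <= 0:
        raise ValueError("buffer_len must be positive")
    n_rows = len(next(iter(table.values()))) if table else 0
    for start in range(0, n_rows, buffer_len):
        stop = start + buffer_len
        # Only the requested columns are sliced; the others are never copied.
        yield {name: table[name][start:stop] for name in field_mask}


def load_selected(
    table: Table,
    fields: Sequence[str],
    cut_field: str,
    threshold: float,
    buffer_len: int = 1000,
) -> Table:
    """Collect `fields` for rows where cut_field > threshold, chunk by chunk."""
    # Read the cut column too, without duplicating it if already requested.
    needed = list(dict.fromkeys([*fields, cut_field]))
    out: Table = {name: [] for name in fields}
    for chunk in iter_chunks(table, needed, buffer_len):
        keep = [i for i, v in enumerate(chunk[cut_field]) if v > threshold]
        for name in fields:
            out[name].extend(chunk[name][i] for i in keep)
    return out


# Illustrative data only; "waveform_max" is never read by the call below.
table = {
    "energy": [10.0, 250.0, 30.0, 400.0, 55.0],
    "timestamp": [0.0, 1.0, 2.0, 3.0, 4.0],
    "waveform_max": [1.0, 2.0, 3.0, 4.0, 5.0],
}
selected = load_selected(table, ["timestamp"], "energy", 100.0, buffer_len=2)
print(selected)  # {'timestamp': [1.0, 3.0]}
```

In this sketch, peak memory scales with `buffer_len` times the number of masked fields plus the rows that pass the cut, rather than with the full table.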


codecov bot commented Aug 25, 2024

Codecov Report

Attention: Patch coverage is 0% with 29 lines in your changes missing coverage. Please review.

Project coverage is 48.91%. Comparing base (981877e) to head (ee73d2a).

| Files | Patch % | Lines |
|---|---|---|
| src/pygama/pargen/utils.py | 0.00% | 29 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #589      +/-   ##
==========================================
+ Coverage   48.80%   48.91%   +0.11%     
==========================================
  Files          59       59              
  Lines        7846     7821      -25     
==========================================
- Hits         3829     3826       -3     
+ Misses       4017     3995      -22     


@gipert
Member

gipert commented Oct 21, 2024

Is this tested, @ggmarshall? @iguinn, can you bump the pydataobj version in pyproject.toml if this is not backward compatible?

@ggmarshall
Collaborator

Not on my end but I can have a look later this week

3 participants