Add aggregation CLI #1952
Codecov Report
All modified lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #1952 +/- ##
=======================================
Coverage 98.39% 98.39%
=======================================
Files 123 123
Lines 11808 11814 +6
=======================================
+ Hits 11618 11624 +6
Misses 190 190
This PR adds functionality to enable aggregation of specified dimension coordinates by suitably wrapping the `collapsed` function contained in `utilities/cube_manipulation.py`.
The code has been implemented in a general fashion so as to allow aggregation of any dimension(s), as well as broadcasting back to the original dimension coordinates. However, some issues around `threshold` coordinate based data and the metadata for broadcasted datasets need further consideration.
Hopefully the changes to address these issues shouldn't require too much work. In the case of the `threshold` dimension, I'm not advocating that quantities should be evaluated via integration of the probability distribution, but rather pointing out that the quantities evaluated using the current approach are not the analogues of those evaluated using "realization" or "percentile" based data.
@benowen-bom Thanks for pointing out that Improver already does broadcasting. This allows us to simplify this change considerably. The code now just slightly extends the existing |
Thanks for making the updates from the previous review. The changes made have simplified the usage cases significantly and avoid the problematic cases I raised previously. I've got a few suggestions, but these are relatively minor.
Re unit tests, given no new tests are added (at the module level), I think it's fine to use `unittest` rather than `pytest`.
Thanks for making those final changes. This PR looks good to me and provides a means to evaluate aggregate values from realization data, as needed for rainforests calibration.
I think rolling back the generalised elements was a good call. I think there is value in these, but they can be added when the need arises and we have the time to think through them carefully, particularly around threshold based data.
Thanks @btrotta-bom 👍
I've added some minor comments.
Thanks for the updates, @btrotta-bom 👍
* add aggregation functionality
* Handle case where std_dev is undefined
* all dimensions to be a string
* update cli and acceptance tests
* Undo testing change
* formatting
* reverse testing change
* update
* change parameter name
* fix docstring
* Check dimension exists
* Refactor to use collapse-realizations cli
* Remove old file
* Handle nans
* fix bug
* Remove check
* simplify code
* Check that realization is a dimension in cli
* fix docstring
* fix docstring
* Add test for invalid method error
* Add acceptance test for when new name not specified
* formatting
* update docstring
* update docstring
* update docstring
* Use pytest
* simplify code
* formatting
* update checksum
* Remove argument.
---------
Co-authored-by: Belinda Trotta <[email protected]>
Co-authored-by: Gavin Evans <[email protected]>
Add CLI for iris aggregations sum, mean, median, std_dev, min, max. The code wraps the existing function `utilities.cube_manipulation.collapse_realizations`, adding the option to specify different aggregation methods (with "mean" as the default, so that existing code will continue to work).
Testing:
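As a rough illustration of the method option described in this PR, the sketch below dispatches on a method name, keeps "mean" as the default, and rejects unknown methods. It is a minimal pure-Python stand-in and not the IMPROVER implementation: the names `collapse_realizations_sketch` and `AGGREGATORS` are hypothetical, the data are plain nested lists rather than iris cubes, and returning NaN when the standard deviation is undefined is an assumption, not confirmed behaviour.

```python
import math
import statistics


def _std_dev(values):
    # Sample standard deviation needs at least two members.
    # Returning NaN for a single member is an assumption of this sketch.
    return statistics.stdev(values) if len(values) > 1 else math.nan


# Hypothetical mapping from method name to aggregation function.
AGGREGATORS = {
    "sum": sum,
    "mean": statistics.fmean,
    "median": statistics.median,
    "std_dev": _std_dev,
    "min": min,
    "max": max,
}


def collapse_realizations_sketch(realizations, method="mean"):
    """Aggregate across the leading (realization) axis.

    `realizations` is a list of equal-length lists, one per ensemble member.
    """
    if method not in AGGREGATORS:
        raise ValueError(
            f"method must be one of {sorted(AGGREGATORS)}, got {method!r}"
        )
    aggregate = AGGREGATORS[method]
    # zip(*...) groups the values of all members at each grid point.
    return [aggregate(point) for point in zip(*realizations)]


members = [[1.0, 2.0], [3.0, 6.0], [5.0, 7.0]]
default_result = collapse_realizations_sketch(members)            # uses "mean"
max_result = collapse_realizations_sketch(members, method="max")
```

Because the default stays "mean", a caller that never passes `method` gets the same result as before the option existed, which is the backward-compatibility point made in the description.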