Weighted Average #833

philipc2 · 2024-07-02T16:57:19Z

Closes #826

Overview

Implements weighted average functionality for a UxDataArray

review-notebook-app · 2024-07-02T16:57:24Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

philipc2 · 2024-07-02T16:58:03Z

@rytam2

I've set up the boilerplate for the weighted mean functionality. This should be a good place to get started. We can run over this during today's meeting.

philipc2 · 2024-07-05T18:52:28Z

@rytam2

We have fixed the issue with the quad-hexagon grid. I've added it back to the test case.

github-actions · 2024-07-17T16:49:35Z

ASV Benchmarking

Benchmark Comparison Results

Benchmarks that have improved:

Change	Before [`b11d011`]	After [`a0be1c1`]	Ratio	Benchmark (Parameter)
-	445M	375M	0.84	face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/quad-hexagon/grid.nc'))
-	467M	373M	0.80	mpas_ocean.Integrate.peakmem_integrate('480km')
	failed	412±7μs	n/a	mpas_ocean.WeightedMean.time_weighted_mean_face_centered('120km')
	failed	341±6μs	n/a	mpas_ocean.WeightedMean.time_weighted_mean_face_centered('480km')

Benchmarks that have stayed the same:

Before [`b11d011`]	After [`a0be1c1`]	Ratio	Benchmark (Parameter)
375M	376M	1.00	face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/mpas/QU/oQU480.231010.nc'))
375M	375M	1.00	face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/scrip/outCSne8/outCSne8.nc'))
400M	379M	0.95	face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/geoflow-small/grid.nc'))
1.59±0.02s	1.58±0.01s	1.00	face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/mpas/QU/oQU480.231010.nc'))
224±0.9ms	223±4ms	0.99	face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/scrip/outCSne8/outCSne8.nc'))
2.04±0.02s	2.01±0.02s	0.99	face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/geoflow-small/grid.nc'))
7.90±0.3ms	8.07±0.2ms	1.02	face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/quad-hexagon/grid.nc'))
3.02±0.03s	3.08±0.03s	1.02	import.Imports.timeraw_import_uxarray
674±20ms	669±7ms	0.99	mpas_ocean.ConnectivityConstruction.time_face_face_connectivity('120km')
41.9±0.6ms	42.1±0.5ms	1.01	mpas_ocean.ConnectivityConstruction.time_face_face_connectivity('480km')
1.83±0.03ms	1.82±0.03ms	0.99	mpas_ocean.ConnectivityConstruction.time_n_nodes_per_face('120km')
538±10μs	557±20μs	1.03	mpas_ocean.ConnectivityConstruction.time_n_nodes_per_face('480km')
1.12±0μs	1.06±0μs	0.95	mpas_ocean.ConstructTreeStructures.time_ball_tree('120km')
280±1ns	270±2ns	0.96	mpas_ocean.ConstructTreeStructures.time_ball_tree('480km')
770±4ns	759±6ns	0.99	mpas_ocean.ConstructTreeStructures.time_kd_tree('120km')
270±1ns	257±2ns	0.95	mpas_ocean.ConstructTreeStructures.time_kd_tree('480km')
432M	432M	1.00	mpas_ocean.GeoDataFrame.peakmem_to_geodataframe('120km', False)
407M	407M	1.00	mpas_ocean.GeoDataFrame.peakmem_to_geodataframe('120km', True)
379M	379M	1.00	mpas_ocean.GeoDataFrame.peakmem_to_geodataframe('480km', False)
393M	377M	0.96	mpas_ocean.GeoDataFrame.peakmem_to_geodataframe('480km', True)
1.02±0.01s	1.03±0.01s	1.01	mpas_ocean.GeoDataFrame.time_to_geodataframe('120km', False)
53.2±0.4ms	52.7±0.4ms	0.99	mpas_ocean.GeoDataFrame.time_to_geodataframe('120km', True)
78.0±0.3ms	79.3±1ms	1.02	mpas_ocean.GeoDataFrame.time_to_geodataframe('480km', False)
5.50±0.2ms	5.51±0.08ms	1.00	mpas_ocean.GeoDataFrame.time_to_geodataframe('480km', True)
319M	321M	1.01	mpas_ocean.Gradient.peakmem_gradient('120km')
296M	296M	1.00	mpas_ocean.Gradient.peakmem_gradient('480km')
2.79±0.02ms	2.79±0.06ms	1.00	mpas_ocean.Gradient.time_gradient('120km')
308±5μs	320±6μs	1.04	mpas_ocean.Gradient.time_gradient('480km')
389M	389M	1.00	mpas_ocean.Integrate.peakmem_integrate('120km')
182±5ms	177±1ms	0.97	mpas_ocean.Integrate.time_integrate('120km')
12.0±0.04ms	12.1±0.05ms	1.00	mpas_ocean.Integrate.time_integrate('480km')
342±7ms	347±4ms	1.01	mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'exclude')
348±4ms	348±2ms	1.00	mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'include')
343±3ms	344±4ms	1.00	mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'split')
22.8±0.6ms	22.7±0.1ms	0.99	mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'exclude')
22.8±0.4ms	23.0±0.2ms	1.01	mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'include')
22.6±0.3ms	23.0±0.2ms	1.01	mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'split')
56.0±0.1ms	56.5±0.5ms	1.01	mpas_ocean.RemapDownsample.time_inverse_distance_weighted_remapping
45.7±0.2ms	45.9±0.2ms	1.01	mpas_ocean.RemapDownsample.time_nearest_neighbor_remapping
360±0.8ms	361±1ms	1.00	mpas_ocean.RemapUpsample.time_inverse_distance_weighted_remapping
266±2ms	264±0.2ms	0.99	mpas_ocean.RemapUpsample.time_nearest_neighbor_remapping
294M	294M	1.00	quad_hexagon.QuadHexagon.peakmem_open_dataset
291M	291M	1.00	quad_hexagon.QuadHexagon.peakmem_open_grid
5.58±0.2ms	6.24±0.5ms	~1.12	quad_hexagon.QuadHexagon.time_open_grid

Benchmarks that have got worse:

Change	Before [`b11d011`]	After [`a0be1c1`]	Ratio	Benchmark (Parameter)
+	6.78±0.4ms	7.93±1ms	1.17	quad_hexagon.QuadHexagon.time_open_dataset
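For context on the two `WeightedMean` rows above: the `failed` entries in the Before column are presumably because the benchmark did not yet exist on the base commit. Below is a minimal sketch of what an ASV benchmark with that class and method name could look like. The synthetic arrays and face counts are assumptions standing in for the MPAS ocean meshes; this is not the benchmark file from the PR.

```python
import numpy as np


class WeightedMean:
    """ASV-style benchmark sketch; synthetic arrays stand in for the real meshes."""

    # ASV runs every time_* method once per entry in params
    params = ["480km", "120km"]
    param_names = ["resolution"]

    # Hypothetical face counts, chosen only so the two cases differ in size
    _n_face = {"480km": 2_000, "120km": 30_000}

    def setup(self, resolution):
        # setup() runs before timing, so data creation is not measured
        rng = np.random.default_rng(0)
        n = self._n_face[resolution]
        self.data = rng.random(n)
        self.face_areas = rng.random(n)

    def time_weighted_mean_face_centered(self, resolution):
        # Only the body of this method is timed
        (self.data * self.face_areas).sum() / self.face_areas.sum()
```

ASV passes each parameter value to both `setup` and the timed method, which is why the results table lists one row per resolution.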

…rray/weighted-mean (#866)

* updated mean function with weighted arg
* updated weighted-mean functionality in dataarray.py
* edited weights to dask array

Co-authored-by: Rachel Yuen Sum Tam

uxarray/core/dataarray.py

philipc2 · 2024-10-22T18:24:02Z

test/test_weighted_mean.py

+    nt.assert_equal(result, expected_weighted_mean)
+
+
+def test_csne30_equal_area():


@rytam2

Can you write a test case for this using Dask, where both the face areas and the data are Dask arrays?
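One possible shape for such a test, as a rough sketch: it wraps random NumPy arrays in Dask arrays and checks that the result stays lazy and matches the eager NumPy computation. The `weighted_mean` helper, the array sizes, and the chunking are all assumptions made for illustration; a real test would go through the `UxDataArray` method and the grid's face areas.

```python
import importlib.util

import numpy as np


def weighted_mean(data, weights):
    # Same expression as in the PR diff: the last axis is the geometry dimension
    return (data * weights).sum(axis=-1) / weights.sum()


def test_weighted_mean_dask_matches_numpy():
    if importlib.util.find_spec("dask") is None:
        return  # Dask not installed; under pytest, pytest.importorskip("dask") would skip
    import dask.array as da

    rng = np.random.default_rng(0)
    data_np = rng.random((3, 12))   # hypothetical (n_time, n_face) data
    areas_np = rng.random(12)       # hypothetical face areas

    data_da = da.from_array(data_np, chunks=(3, 4))
    areas_da = da.from_array(areas_np, chunks=4)

    result = weighted_mean(data_da, areas_da)
    assert isinstance(result, da.Array)  # nothing has been computed yet
    np.testing.assert_allclose(result.compute(), weighted_mean(data_np, areas_np))
```

Checking `isinstance(result, da.Array)` before calling `.compute()` guards against an implementation that silently loads the data into memory.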

philipc2 · 2024-10-22T18:26:29Z

uxarray/core/dataarray.py

+        weighted_mean = (self * weights).sum(axis=-1) / total_weight
+
+        # create a UxDataArray and return it
+        return UxDataArray(weighted_mean, uxgrid=self.uxgrid)


Need to preserve other parameters:

Coordinates

rytam2 · 2024-11-22T21:37:43Z

UxDataset support

aaronzedwick

Looks good! Just a few suggestions.

aaronzedwick · 2024-12-06T21:31:18Z

uxarray/core/dataarray.py

+        # compute the total weight
+        total_weight = weights.sum()
+
+        # compute weighted mean #assumption on index of dimension (last one is geometry)


Suggested change

-        # compute weighted mean #assumption on index of dimension (last one is geometry)
+        # compute the weighted mean, with an assumption on the index of dimension (last one is geometry)
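To make the "last one is geometry" assumption concrete, here is a small NumPy sketch. It shows one way the assumption could be relaxed by moving the geometry axis to the end before multiplying by the weights. The function name and the `axis` parameter are hypothetical and not part of the PR.

```python
import numpy as np


def weighted_mean_along(data, weights, axis=-1):
    """Weighted mean over `axis`, without requiring it to be the last one."""
    # Move the geometry axis to the end so the 1-D weights broadcast against it
    moved = np.moveaxis(data, axis, -1)
    return (moved * weights).sum(axis=-1) / weights.sum()


weights = np.array([1.0, 1.0, 2.0])             # one weight per face
time_by_face = np.array([[1.0, 2.0, 3.0],
                         [4.0, 5.0, 6.0]])      # shape (n_time, n_face)

last = weighted_mean_along(time_by_face, weights)              # geometry is last
first = weighted_mean_along(time_by_face.T, weights, axis=0)   # geometry is first
```

With the expression in the diff, the transposed input would fail to broadcast (or, for a square array, silently weight the wrong dimension), so the comment is documenting a real constraint.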

aaronzedwick · 2024-12-06T21:32:18Z

uxarray/core/dataarray.py

+        This function calculates the weighted mean of a variable,
+        using the specified `weights`. If no weights are provided, it will automatically select
+        appropriate weights based on whether the variable is face-centered or edge-centered. If
+        the variable is neither face nor edge-centered.


Suggested change

-        the variable is neither face nor edge-centered.
+        the variable is neither face nor edge-centered, a warning is raised, and an unweighted mean is computed instead.
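The behavior described in this docstring could be sketched roughly as follows. This is a standalone illustration, not the method from the PR: the `location` argument is hypothetical, and using edge lengths as the edge-centered weights is an assumption, since the thread only mentions face areas explicitly.

```python
import warnings

import numpy as np


def weighted_mean(data, location, face_areas, edge_lengths, weights=None):
    """Weighted mean over the last axis, with default weights chosen by data location."""
    if weights is None:
        if location == "face":
            weights = face_areas
        elif location == "edge":
            weights = edge_lengths  # assumption: the source does not name the edge weights
        else:
            warnings.warn(
                "Data is neither face- nor edge-centered; computing an unweighted mean.",
                UserWarning,
                stacklevel=2,
            )
            # Equal weights reduce the formula below to a plain mean
            weights = np.ones(data.shape[-1])
    return (data * weights).sum(axis=-1) / weights.sum()
```

Falling back to equal weights keeps a single code path for the final division, so the unweighted case needs no separate branch.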

aaronzedwick · 2024-12-06T21:40:14Z

test/test_weighted_mean.py

+    uxds = ux.open_dataset(quad_hex_grid_path, quad_hex_data_path_face_centered)
+
+    # expected weighted average computed by hand
+    expected_weighted_mean = 297.55


Can you compute this within Python? Avoiding a hard-coded answer is ideal, although I know that isn't always possible.

erogluorhan · Dec 7, 2024

+1 to this! Even if constants are unavoidable, they shouldn't show up as magic numbers within the code; they should go into some kind of test constants instead.
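A sketch of how both suggestions might be combined: the inputs live in named module-level constants, and the expected value is derived inside the test through an independent code path (`np.average`) rather than typed in by hand. The array values and the `weighted_mean` stand-in are hypothetical; they are not the quad-hexagon grid's actual face areas or data.

```python
import numpy as np
import numpy.testing as nt

# Hypothetical stand-ins for the grid's face areas and the face-centered data
FACE_AREAS = np.array([0.5, 1.0, 1.5, 1.0])
FACE_DATA = np.array([297.0, 298.0, 299.0, 296.0])


def weighted_mean(data, weights):
    """Stand-in for the method under test."""
    return (data * weights).sum(axis=-1) / weights.sum()


def test_weighted_mean_matches_independent_computation():
    # Derive the expectation independently instead of hard coding it
    expected = np.average(FACE_DATA, weights=FACE_AREAS)
    nt.assert_allclose(weighted_mean(FACE_DATA, FACE_AREAS), expected)
```

`assert_allclose` is used here instead of the `assert_equal` in the diff because two floating-point code paths may differ in the last bits.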

boilerplate

e496054

philipc2 assigned rytam2 Jul 2, 2024

philipc2 added 5 commits July 2, 2024 13:42

update boilerplate

23c7a7b

fix boilerplate

3094cfc

Merge branch 'main' into weighted-mean

af8b865

add quad hexagon to tests

23a6422

Merge branch 'weighted-mean' of https://github.com/UXARRAY/uxarray in…

bd3248c

…to weighted-mean

philipc2 added 6 commits July 5, 2024 13:52

Merge branch 'main' into weighted-mean

98c1c90

Merge branch 'main' into weighted-mean

1ecb633

Merge branch 'main' into weighted-mean

ae4a13e

Merge branch 'main' into weighted-mean

e8e01a5

write tests, work on api design

5dbaf5d

asv benchmark

4c3290b

philipc2 added the run-benchmark (Run ASV benchmark workflow) label Jul 17, 2024

philipc2 added 3 commits July 17, 2024 09:28

Merge branch 'main' into weighted-mean

7e5ceca

update asv benchmark

6553163

Merge branch 'weighted-mean' of https://github.com/UXARRAY/uxarray in…

659cfe9

…to weighted-mean

Merge branch 'main' into weighted-mean

70bf239

rytam2 mentioned this pull request Jul 26, 2024

Committing weighted mean modifications from rtam/weighted mean to uxarray/weighted-mean #866

Merged

14 tasks

rytam2 and others added 2 commits July 26, 2024 17:43

Merge branch 'main' into weighted-mean

8a2ecb3

philipc2 linked an issue Aug 12, 2024 that may be closed by this pull request

Weighted Mean Functionality #826

Open

philipc2 commented Sep 13, 2024

philipc2 removed the run-benchmark (Run ASV benchmark workflow) label Sep 15, 2024

philipc2 added 2 commits October 10, 2024 09:17

merge main

02dc504

some cleanup

f9a4d19

philipc2 added the run-benchmark (Run ASV benchmark workflow) label Oct 10, 2024

philipc2 and others added 2 commits October 10, 2024 09:35

fix tests

efce996

Merge branch 'main' into weighted-mean

87e6543

philipc2 commented Oct 22, 2024

Merge branch 'main' into weighted-mean

7999c43

philipc2 removed the run-benchmark (Run ASV benchmark workflow) label Nov 19, 2024

philipc2 and others added 6 commits November 19, 2024 12:05

Merge branch 'main' into weighted-mean

7722698

add initial dask test cases

8c36fa0

use parametrize

a3c87b8

add boilerplate for example in docstring

7056451

update docstring

944b42b

Merge branch 'weighted-mean' of https://github.com/UXARRAY/uxarray in…

9a1d746

…to weighted-mean reset local files

rytam2 and others added 2 commits November 22, 2024 16:08

added examples to weighted mean API, userguide and test case

48273fb

Merge branch 'main' into weighted-mean

c16f192

rytam2 marked this pull request as ready for review December 3, 2024 17:55

rytam2 and others added 5 commits December 3, 2024 12:04

cleaned userguide output

a5f0d3c

restarted kernel

97cdf92

removed duplicate notebook

70a2b3f

run pre-commit

d7a6d18

re-run notebook

1613a2d

rytam2 changed the title ~~DRAFT: Weighted Average~~ Weighted Average Dec 6, 2024

rytam2 requested review from aaronzedwick and erogluorhan December 6, 2024 20:58

aaronzedwick requested changes Dec 6, 2024

Merge branch 'main' into weighted-mean

b5ae01a

erogluorhan Dec 7, 2024
