add zonal_stats based on exactextract #62

Closed
martinfleis opened this issue Mar 27, 2024 · 3 comments · Fixed by #68

Comments

@martinfleis
Member

A dev version of exactextract is now on PyPI. We can try to wrap it in our zonal_stats as another method alongside the rasterio-based rasterize and iterate methods.

@masawdah
Contributor

Hi, I took a look at the exactextract package available on PyPI to explore the possibility of integrating it into zonal statistics. Before diving into implementation, I'd like to share some initial thoughts:

  1. Regarding datacubes backed by xarray, it's important to note that exactextract supports only DataArray, not Dataset.

  2. The order of dimensions is important (longitude, latitude).

  3. Most importantly, exactextract supports only 2D or 3D cubes. Higher-dimensional cubes (4D or more) need alternative handling. I proposed stacking the additional dimensions into a single dimension, e.g. going from (long, lat, time, level1, level2) to (long, lat, stack_dim), applying exactextract, then unstacking the result to the original dimensions. This way we avoid iterating over the variables and dimensions.
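The stacking idea in point 3 can be sketched with xarray alone. This is a minimal illustration, not the eventual implementation: `spatial_mean_3d` is a hypothetical stand-in for a backend that rejects cubes with more than three dimensions, and the dimension names and toy data are assumptions chosen to mirror the example above.

```python
import numpy as np
import xarray as xr


def spatial_mean_3d(cube: xr.DataArray) -> xr.DataArray:
    """Hypothetical stand-in for a backend that accepts at most 3 dimensions."""
    if cube.ndim > 3:
        raise ValueError("this backend only handles 2D or 3D cubes")
    # Reduce the two spatial dimensions, keeping the stacked one.
    return cube.mean(dim=["long", "lat"])


# A toy 5D cube: (long, lat, time, level1, level2).
cube = xr.DataArray(
    np.arange(4 * 3 * 2 * 2 * 2, dtype=float).reshape(4, 3, 2, 2, 2),
    dims=("long", "lat", "time", "level1", "level2"),
    coords={"time": [0, 1], "level1": [10, 20], "level2": ["a", "b"]},
)

extra_dims = [d for d in cube.dims if d not in ("long", "lat")]

# Collapse all non-spatial dimensions into one: (long, lat, stack_dim).
stacked = cube.stack(stack_dim=extra_dims)

# Run the 3D-only step once, without looping over time or levels.
result = spatial_mean_3d(stacked)

# Restore the original non-spatial dimensions from the MultiIndex.
unstacked = result.unstack("stack_dim")
```

Because `stack` records the original coordinates in a MultiIndex, `unstack` can rebuild `time`, `level1` and `level2` without any extra bookkeeping.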

@martinfleis
Member Author

> Regarding datacubes backed by xarray, it's important to note that exactextract supports only DataArray, not Dataset.

Would that mean a loop over DataArrays within the Dataset if we wanted to do it all?

> The order of dimensions is important (longitude, latitude).

We should be able to check for that.
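One way such a check could look, as a rough sketch: the helper name `with_spatial_dims_first` and the default dimension names are assumptions for illustration, and the expected order (longitude, latitude) is taken from the comment above rather than verified against exactextract.

```python
import numpy as np
import xarray as xr


def with_spatial_dims_first(da, x_dim="longitude", y_dim="latitude"):
    """Return `da` with (x_dim, y_dim) as its leading dimensions."""
    missing = [d for d in (x_dim, y_dim) if d not in da.dims]
    if missing:
        raise ValueError(f"missing spatial dimensions: {missing}")
    if da.dims[:2] == (x_dim, y_dim):
        return da  # already in the expected order
    # `...` keeps the remaining dimensions in their current relative order.
    return da.transpose(x_dim, y_dim, ...)


da = xr.DataArray(np.zeros((2, 3, 4)), dims=("time", "latitude", "longitude"))
ordered = with_spatial_dims_first(da)
```

Transposing instead of raising keeps the check transparent to users, at the cost of silently reordering their data; raising an error would be the stricter alternative.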

> I proposed stacking the additional dimensions into a single dimension, [...] then unstacking the result to the original dimensions

That sounds reasonable.

Do you have any sense on performance compared to our existing methods?

Thanks for looking into that!

@masawdah
Contributor

> Would that mean a loop over DataArrays within the Dataset if we wanted to do it all?

No, converting the Dataset into an xarray.DataArray once would be enough.
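A minimal sketch of that conversion, assuming all data variables share the same dimensions. The variable names and values are made up for illustration; `Dataset.to_array` is the long-standing xarray method (newer releases also expose it as `to_dataarray`).

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    {
        "temperature": (("longitude", "latitude"), np.ones((4, 3))),
        "precipitation": (("longitude", "latitude"), np.zeros((4, 3))),
    }
)

# Combine the data variables along a new "variable" dimension.
da = ds.to_array(dim="variable")

# Move the new dimension behind the spatial ones: (longitude, latitude, variable).
da = da.transpose("longitude", "latitude", "variable")
```

The resulting array is 3D, so the variables can be processed in one pass and split apart again afterwards by selecting along `variable`.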

> Do you have any sense on performance compared to our existing methods?

Not yet. I'll see if we can compare them using a high-dimensional datacube or a large spatio-temporal extent. I might use the last use case from openEO where we struggled with memory issues.

@martinfleis linked a pull request on Jun 27, 2024 that will close this issue