iris <-> xarray conversion issues #379

Open
w-k-jones opened this issue Dec 4, 2023 · 4 comments
Labels
xarray transition Part of the transition to xarray support

@w-k-jones
Member

w-k-jones commented Dec 4, 2023

Keeping track of issues encountered when converting between iris Cubes and xarray DataArrays:

  • Round trip conversion (xarray -> iris -> xarray) of integer DataArrays raises a TypeError with xarray < v2023.06 or emits a RuntimeWarning with xarray >= v2023.06. This is because the core data is converted to a masked array when converting to an iris Cube, and xarray then tries to fill that array with np.nan when converting back
    • Workaround: copy cube with array-like core_data when converting from xarray to iris:
      cube = da.to_iris().copy(da.data)
  • dims without coords get converted to default unnamed dim names during round trip conversion (xarray -> iris -> xarray)
    • e.g. ("x", "y") -> ("dim_0", "dim_1")
    • Need to save the original dim names and remap them back after output
  • xarray uses coord var_name, whereas iris uses standard_name
    • Need to map between the two to match input standard for pandas dataframe output column names
    • standard_name is not necessarily unique, which can cause problems in tobac. e.g. GOES ABI data
  • cftime 😡
    • Conversion of Dataframe cftime column to np.datetime64:
      xr.CFTimeIndex(features["time"].to_numpy()).to_datetimeindex()
  • iris cannot handle sub-second (e.g. ns) time values
    • Need to convert times to np.datetime64[s] before converting to iris
    • Need to map back to the original time values in output DataArrays/Dataframes to ensure functions like bulk_statistics work correctly
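
For the sub-second time item above, a minimal sketch of one way to truncate times to second resolution while keeping a lookup back to the original values. The helper names `truncate_to_seconds` and `restore_original_times` are hypothetical and are not part of tobac, iris or xarray; the sketch also assumes that no two input times fall within the same second.

```python
import numpy as np


def truncate_to_seconds(times):
    """Return (times at second resolution, lookup from truncated -> original).

    Hypothetical helper for illustration only.
    """
    original = np.asarray(times, dtype="datetime64[ns]")
    truncated = original.astype("datetime64[s]")
    lookup = {}
    for t_s, t_ns in zip(truncated, original):
        # Assumption: one original value per second; the first one seen wins
        lookup.setdefault(t_s, t_ns)
    return truncated, lookup


def restore_original_times(truncated, lookup):
    """Map second-resolution times back to their original values."""
    return np.array([lookup[t] for t in truncated], dtype="datetime64[ns]")


times = ["2023-12-04T00:00:00.123456789", "2023-12-04T00:05:00.5"]
truncated, lookup = truncate_to_seconds(times)
restored = restore_original_times(truncated, lookup)
```

The truncated array would be the one passed to iris, and the lookup would be applied to the time values of the outputs afterwards.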

Please add any more issues that crop up
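
For the dims-without-coords item in the list above, a rough sketch of how the original dim names could be saved and remapped. `build_dim_rename` is a hypothetical helper, and the sketch assumes that the round trip preserves the order of the dimensions, which has not been checked here.

```python
def build_dim_rename(original_dims, roundtrip_dims):
    """Build a mapping from round-tripped dim names back to the originals.

    Hypothetical helper; assumes both tuples list the dims in the same order.
    """
    if len(original_dims) != len(roundtrip_dims):
        raise ValueError("dimension count changed during conversion")
    # Only include dims whose name actually changed
    return {
        new: old
        for old, new in zip(original_dims, roundtrip_dims)
        if new != old
    }


mapping = build_dim_rename(("x", "y"), ("dim_0", "dim_1"))
# With xarray this mapping could then be applied to the output, e.g.
# da_out = da_out.rename(mapping)
```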

@w-k-jones w-k-jones added the xarray transition Part of the transition to xarray support label Dec 4, 2023
@freemansw1
Member

For #354, I'm working on a modification to the decorators that passes an optional hidden kwarg that notes whether an iris->xarray conversion occurred, to hopefully help with some of these issues.
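
As a rough illustration of the idea only (this is not the code from #354), a decorator could convert the input, tell the wrapped function whether a conversion happened through a hidden keyword argument, and convert the result back. The classes `FakeCube` and `FakeDataArray`, the decorator name `accepts_iris` and the kwarg name `_converted_from_iris` are all hypothetical stand-ins so that the sketch runs without iris or xarray.

```python
import functools


class FakeCube:
    """Stand-in for an iris Cube (illustration only)."""

    def __init__(self, data):
        self.data = data


class FakeDataArray:
    """Stand-in for an xarray DataArray (illustration only)."""

    def __init__(self, data):
        self.data = data


seen_flags = []  # records the flag value seen by the wrapped function


def accepts_iris(func):
    """Convert cube input to the array type and flag that it happened."""

    @functools.wraps(func)
    def wrapper(field, *args, **kwargs):
        converted = isinstance(field, FakeCube)
        if converted:
            field = FakeDataArray(field.data)
        # Hidden kwarg (hypothetical name) tells func a conversion occurred
        result = func(field, *args, _converted_from_iris=converted, **kwargs)
        if converted and isinstance(result, FakeDataArray):
            result = FakeCube(result.data)
        return result

    return wrapper


@accepts_iris
def double(field, _converted_from_iris=False):
    seen_flags.append(_converted_from_iris)
    return FakeDataArray([2 * v for v in field.data])


from_cube = double(FakeCube([1, 2]))
from_array = double(FakeDataArray([3]))
```

Inside the wrapped function, the flag could be used to decide whether fixes such as restoring dim names or original time values are needed.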

@w-k-jones
Member Author

Sounds like a good plan. Would it make sense to split the decorator changes into a separate branch/PR so they can be worked on concurrently with converting other sections of the code to xarray?

Also, most of the conversion issues seem to affect converting from xarray to iris and then back again, so in the long term these should hopefully be less of an issue once the internals are switched to xarray.

@freemansw1
Member

freemansw1 commented Dec 5, 2023

> Sounds like a good plan. Would it make sense to split the decorator changes into a separate branch/PR so they can be worked on concurrently with converting other sections of the code to xarray?

I can do that once it's in a working condition.

This is now available in #380

@freemansw1 freemansw1 added this to the Version 1.6 milestone Sep 13, 2024