Unable to save `model.results` directly #684

irm-codebase · 2024-09-11T13:16:58Z

What happened?

Problem

This is related to other problems (#619), but it's closer to an actual use case, so I'll document it here.

Because we store everything in the .attrs of the xarray dataset, model results cannot be saved directly with xarray's own to_netcdf.
This is a reasonable use case, as saving results directly is semantically intuitive.

Solution

Not saving dictionaries or arrays in the .attrs

Which operating systems have you used?

macOS
Windows
Linux

Version

v0.7.0.dev4

Relevant log output

model.results.to_netcdf(results_path / "results.nc")
Traceback (most recent call last):
  File "/home/ivanruizmanuel/Documents/git/euro-calliope-modular/build/model/run_save.py", line 14, in <module>
    model.results.to_netcdf(results_path / "results.nc")
  File "/home/ivanruizmanuel/miniforge3/envs/ec-model-v07/lib/python3.12/site-packages/xarray/core/dataset.py", line 2298, in to_netcdf
    return to_netcdf(  # type: ignore  # mypy cannot resolve the overloads:(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ivanruizmanuel/miniforge3/envs/ec-model-v07/lib/python3.12/site-packages/xarray/backends/api.py", line 1292, in to_netcdf
    _validate_attrs(dataset, invalid_netcdf=invalid_netcdf and engine == "h5netcdf")
  File "/home/ivanruizmanuel/miniforge3/envs/ec-model-v07/lib/python3.12/site-packages/xarray/backends/api.py", line 203, in _validate_attrs
    check_attr(k, v, valid_types)
  File "/home/ivanruizmanuel/miniforge3/envs/ec-model-v07/lib/python3.12/site-packages/xarray/backends/api.py", line 195, in check_attr
    raise TypeError(
TypeError: Invalid value for attr 'scenario': None. For serialization to netCDF files, its value must be of one of the following types: str, Number, ndarray, number, list, tuple
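
A possible user-side workaround for the error above, as a minimal sketch rather than anything calliope provides: copy the attributes and turn every value that netCDF cannot store (such as the `None` under `scenario`, or nested dictionaries) into a JSON string. The helper name `sanitize_attrs` and the example attribute values are hypothetical, and the sketch assumes the attributes hold only built-in Python types (no NumPy arrays).

```python
import json
from numbers import Number


def _is_number(value):
    """True for real numbers; bool is excluded and gets JSON-encoded instead."""
    return isinstance(value, Number) and not isinstance(value, bool)


def _is_flat_sequence(value):
    """True for a list/tuple made only of strings or only of numbers."""
    if not isinstance(value, (list, tuple)):
        return False
    return all(isinstance(v, str) for v in value) or all(_is_number(v) for v in value)


def sanitize_attrs(attrs):
    """Return a copy of `attrs` with netCDF-unfriendly values JSON-encoded.

    Hypothetical helper: strings, numbers and flat sequences pass through,
    everything else (None, dict, bool, nested lists) becomes a JSON string.
    """
    clean = {}
    for key, value in attrs.items():
        if isinstance(value, str) or _is_number(value):
            clean[key] = value
        elif _is_flat_sequence(value):
            clean[key] = list(value)
        else:
            clean[key] = json.dumps(value)
    return clean


# Made-up attributes, loosely mirroring the failing 'scenario': None case.
example_attrs = {"scenario": None, "name": "demo", "config": {"mode": "plan"}, "scale": 2.5}
cleaned = sanitize_attrs(example_attrs)
print(cleaned)
```

With xarray this could be applied to a copy before saving, e.g. `results = model.results.copy()`, then `results.attrs = sanitize_attrs(results.attrs)`, then `results.to_netcdf(...)`. Attributes on individual variables may need the same treatment, and a file written this way is not expected to load back through calliope.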


sjpfenninger · 2024-09-12T06:52:39Z

What is the use case that makes this a bug? In other words, why can't you use model.to_netcdf()?

irm-codebase · 2024-09-12T08:57:22Z

In this case, I wanted to just save the results to have a smaller file size. I did save with model.to_netcdf().

If you have a stable model build and run many scenarios (e.g., SPORES), saving only the model results, and some metadata on the side, would reduce data use.

Generally, though, just saving the results or inputs makes parsing the data a bit easier. Depending on the mode, some things might be inputs or results (e.g., flow_cap in operate mode). Users might not have the math fully memorized, so just saving the results is a bit more 'human friendly'.
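
To illustrate the "metadata on the side" idea, here is a rough sketch that writes run metadata to a JSON file next to the results. The function names, the file name and the metadata values are all made up for the example; nothing here is part of calliope.

```python
import json
import tempfile
from pathlib import Path


def write_metadata_sidecar(metadata, path):
    """Write `metadata` as JSON; values JSON cannot encode fall back to str()."""
    path = Path(path)
    path.write_text(json.dumps(metadata, indent=2, sort_keys=True, default=str))
    return path


def read_metadata_sidecar(path):
    """Load the metadata written by write_metadata_sidecar."""
    return json.loads(Path(path).read_text())


# Hypothetical per-run metadata for one scenario run.
run_metadata = {"scenario": None, "spores_run": 3, "config": {"mode": "spores"}}

with tempfile.TemporaryDirectory() as tmp:
    sidecar = write_metadata_sidecar(run_metadata, Path(tmp) / "results_meta.json")
    restored = read_metadata_sidecar(sidecar)

# None, integers and nested dicts survive the JSON round trip unchanged.
print(restored == run_metadata)
```

The results themselves could then be written from a copy with its dataset-level attributes cleared (`results = model.results.copy()`, `results.attrs = {}`, `results.to_netcdf(...)`), assuming the per-variable attributes do not hit the same validation error.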

brynpickering · 2024-09-17T14:03:33Z

Maybe this is why we should simply pickle our data - then it would be clear that you need to load it back in using calliope 😅.

But in all seriousness, our preferred method is that data is always loaded in using calliope (read_netcdf). This way, you automatically get to see what is an input/result, as they are attached as respective attributes of the model object. If we made it easy to save results to file, we'd only run into the opposite problem in future: users trying to load a netCDF file using calliope and then finding it doesn't contain any of the necessary metadata to actually instantiate the model object.

Generally, input data has a much smaller memory footprint than results. Still, in the use-case you have given (individual SPORES runs), it's clearly preferable not to save the input data every single time a new set of results is generated. How about calliope.to_netcdf(include_inputs=False)?

irm-codebase · 2024-09-17T14:33:38Z

@brynpickering I would not pickle the data. Pickle files are not always portable between Python or library versions.
It would decrease stability in the long run 😅😅...

I'd actually prefer to just keep calliope.to_netcdf() without anything. Adding extra things would just decrease maintainability instead of fixing the root of the issue, which is that we are not following netCDF4 specifications.

I suppose it would introduce the chance of someone trying to re-load results... but that is user misconception, not calliope code breaking compatibility with data formats. I do not see it as a big loss, but I get the point.

Should I close this as 'not planned'?

brynpickering · 2024-09-17T14:53:14Z

> I'd actually prefer to just keep calliope.to_netcdf() without anything

Except you also want to add calliope.results.to_netcdf, right?

irm-codebase · 2024-09-17T14:57:49Z

I meant that I'm ok with just using calliope.to_netcdf() if it means we do not sideline the spec issue with calliope.to_netcdf(include_inputs=False) 🙈.

I can get around it in other ways on my side, and if it has not popped up yet it means most people do not do it.

irm-codebase · 2024-09-17T15:10:06Z

Just to clarify: we already have calliope.results.to_netcdf(), since results is an xarray dataset.
So I was not really requesting that we add any new functions.

This issue was just about making sure using xarray functionality with it works when saving to netCDF.

irm-codebase · 2024-09-30T15:26:16Z

I'm setting this as 'not planned'.
Perhaps the math cleanup turns this into a non-issue.

irm-codebase added the bug label Sep 11, 2024

irm-codebase closed this as not planned Sep 30, 2024
