Fix sync issues in model math / config / defaults attributes #610

Closed
wants to merge 1 commit into from

Conversation

irm-codebase
Contributor

@irm-codebase irm-codebase commented Jun 19, 2024

Fixes #608 (and is a prerequisite for #606).

This PR is meant to fix potential issues due to duplication of model data at the model.attribute level and at the model._model_data.attrs level.

Instead of copying data, @property methods are used to read and modify specific attributes (math, config, defaults).
model.attributes should be used for data we want to lose between runs (instance-specific timestamps, temp flags, etc).

By drawing this distinction, the code should become easier to maintain down the line.

Summary of changes in this pull request

  • Turned model.math, model.config and model.defaults into properties that refer to model._model_data.attrs directly to avoid double instancing and potential desyncs
  • Removed deprecated methods to sync model.math and model.config with values in model._model_data.
  • Updated some tests.
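
As a rough illustration of the first bullet, here is a minimal sketch of properties that delegate to the dataset attributes instead of holding a second copy. `FakeDataset` is a stand-in for the xarray dataset and `Model` is a hypothetical, stripped-down class; neither is the actual Calliope implementation.

```python
class FakeDataset:
    """Stand-in for xarray.Dataset; only its `attrs` dict matters here."""

    def __init__(self):
        self.attrs = {}


class Model:
    """Hypothetical, stripped-down model class (not the real calliope.Model)."""

    def __init__(self, config, math):
        self._model_data = FakeDataset()
        # Store each item once, in the dataset attributes.
        self._model_data.attrs["config"] = config
        self._model_data.attrs["math"] = math

    @property
    def config(self):
        # Always read from the single source of truth.
        return self._model_data.attrs["config"]

    @property
    def math(self):
        return self._model_data.attrs["math"]

    @math.setter
    def math(self, value):
        # Writes go to the same place, so no second copy can drift.
        self._model_data.attrs["math"] = value


model = Model(config={"build": {}}, math={"constraints": {}})
model.math = {"constraints": {"custom": {}}}
assert model.math is model._model_data.attrs["math"]
```

Because nothing is stored on the model object itself, there is no second object that could fall out of step with `_model_data.attrs`.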

Reviewer checklist

  • Test(s) added to cover contribution
  • Documentation updated
  • Changelog updated
  • Coverage maintained or improved

@irm-codebase irm-codebase marked this pull request as draft June 19, 2024 13:12
@irm-codebase irm-codebase self-assigned this Jun 19, 2024
@irm-codebase irm-codebase marked this pull request as ready for review June 19, 2024 15:01
@irm-codebase
Contributor Author

@sjpfenninger @brynpickering
Here is a proposed improvement to avoid sync issues in the model object and its underlying xarray data.

I tried to update relevant tests, let me know if I missed something.

@irm-codebase irm-codebase changed the base branch from feature-base-math-override to main June 19, 2024 15:55
@irm-codebase irm-codebase added bug and removed bug labels Jun 19, 2024
@irm-codebase
Contributor Author

Changed this PR to merge directly into main instead of the schema override feature.
I figured that getting this in will benefit other things beyond that.

codecov bot commented Jun 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.93%. Comparing base (872978d) to head (8c92ba5).
Report is 44 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #610      +/-   ##
==========================================
+ Coverage   95.85%   95.93%   +0.07%     
==========================================
  Files          24       24              
  Lines        3619     3638      +19     
  Branches      788      736      -52     
==========================================
+ Hits         3469     3490      +21     
+ Misses         86       84       -2     
  Partials       64       64              
Files with missing lines Coverage Δ
src/calliope/model.py 93.41% <100.00%> (-0.24%) ⬇️
src/calliope/postprocess/postprocess.py 98.21% <100.00%> (+4.66%) ⬆️

... and 12 files with indirect coverage changes

@irm-codebase
Contributor Author

My initial setup for testing the new linked properties was ugly and hard to maintain. I improved it following better testing practices.

@irm-codebase irm-codebase changed the title from "add property methods for math, config and defaults" to "add linked property methods for math, config and defaults" Jun 19, 2024
Member

@brynpickering brynpickering left a comment

Although I added a slight clean-up of repetition, I'm not sure this is actually the way we should go with these properties.

config items should be completely frozen - you can view them using e.g. config.build[...] but you cannot set any values in them. Any runtime overrides should be kept to the model calls (model.build(**config_override_kwargs)).

defaults is one that a user could feasibly edit, but I'm aware that it is quite brittle at the moment. If you update a value in model.defaults for a parameter available in model._model_data, that change won't propagate (because those parameters have a default attribute assigned to them earlier on, which takes precedence). So it should also probably be frozen for the time being.

model.math is the one I know you want to be able to replace as part of #609 and is the only one that may benefit from the ability to set it as an entirely new dictionary. However, inconsistencies can arise. If you build and solve a model and then update model.math and save your model, the model math is then out of sync with the model results and you'll be sharing potentially misleading datasets.

For all these properties, the question remains: if we let users provide on-the-fly updates (e.g. the kwargs in model.build(**kwargs)), should they propagate back to the properties and be visible in those frozen dicts? Or should those stay frozen based on what was loaded from file, with some additional build/solve-dependent property giving the config overrides that were applied?

@brynpickering
Member

I'd prefer just two ways to define the config/defaults/math for a model: 1. in the YAML files, 2. at the point of method calls (Model(**kwargs), model.build(**kwargs), model.solve(**kwargs)). Model.config, Model.defaults, Model.math are then just handy references for the current state of the configuration.
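
One way the "overrides only at method calls" idea could look is sketched below. This is an assumption-laden illustration, not Calliope code: the `Model` class, the `applied_overrides` attribute and the config keys are all hypothetical. The loaded configuration stays untouched, and the overrides used for a given build are recorded separately.

```python
import copy


class Model:
    """Hypothetical sketch; not the real calliope.Model."""

    def __init__(self, config):
        self._config = config        # as loaded from the YAML files
        self.applied_overrides = {}  # hypothetical: overrides of the last build

    @property
    def config(self):
        return self._config

    def build(self, **overrides):
        # Merge into a copy so the loaded config is left untouched.
        build_config = copy.deepcopy(self._config["build"])
        build_config.update(overrides)
        self.applied_overrides = dict(overrides)
        return build_config


# Illustrative keys and values only.
model = Model({"build": {"mode": "plan", "ensure_feasibility": False}})
effective = model.build(ensure_feasibility=True)
```

Here `effective` reflects the override, while `model.config` still shows what was loaded from file. Whether the overrides should instead propagate back into `model.config` is exactly the open question raised above.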

@irm-codebase
Contributor Author

irm-codebase commented Jun 20, 2024

@brynpickering 👍
Alright, I'll try to summarize the points. From what I understand, there are two issues:

  • Issue 1 is whether or not we want model._model_data to be frozen. Particularly in these situations
    • Should kwargs be saved to it, or only affect the object temporarily?
    • Should we offer interfaces to modify this beyond YAML files and kwargs?
  • Issue 2 (this PR) is that model.attrs have inconsistent access to model._model_data.attrs. Python names hold references to objects, so in most cases modifying model.config will also modify model._model_data.attrs["config"]... but in our current approach this is not guaranteed!
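
The failure mode described in Issue 2 can be reproduced with plain dictionaries. In this minimal sketch, `attrs` stands in for `model._model_data.attrs` and `config` for a copied reference such as `model.config`; the values are illustrative only.

```python
attrs = {"config": {"build": {"mode": "plan"}}}

# A copied reference: both names point at the same dict object.
config = attrs["config"]

# In-place mutation is visible through both names.
config["build"]["mode"] = "operate"
assert attrs["config"]["build"]["mode"] == "operate"

# Rebinding one side silently breaks the link.
attrs["config"] = {"build": {"mode": "spores"}}
assert config is not attrs["config"]

print(config["build"]["mode"])           # operate
print(attrs["config"]["build"]["mode"])  # spores
```

As long as code only mutates the shared dictionary, the two stay in sync. Any code path that assigns a new dictionary to one side leaves the other pointing at stale data.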

My fix was not meant to be user-facing. It was meant to streamline the way we access data within our code. But I see why it could be seen that way...

Here are my proposed fixes to this PR in order to address Issue 2:

  • remove model.defaults: I included it because it was declared in model.__init__, but unless we ensure updates propagate it should definitely not be there.
  • rename model.math to model._math to signal that users should not modify it. We keep the setter and getter using @property to ensure consistency.
  • ditto for model.config
  • modify our code accordingly

With this we no longer have unclear attributes in model, and we have methods that streamline our code for the future. What do you think?

@brynpickering
Member

I don't think they should be private properties (hence why they are public right now), but they should be immutable. Providing a setter goes against that idea as you essentially say "you can replace this entire dictionary if you like". It probably just makes sense to make the change you've implemented except for the setters. I.e., you don't provide a way for a user to completely replace those properties, although we still have no way of stopping them replacing the equivalent dictionary in model._model_data.attrs, which would have the same effect.
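
A small sketch of what a getter-only property does and does not protect against. `FakeDataset` and `Model` are hypothetical stand-ins, not the actual classes.

```python
class FakeDataset:
    """Stand-in for xarray.Dataset; only its `attrs` dict matters here."""

    def __init__(self, attrs):
        self.attrs = attrs


class Model:
    """Hypothetical sketch; not the real calliope.Model."""

    def __init__(self, math):
        self._model_data = FakeDataset({"math": math})

    @property
    def math(self):
        # No setter is defined, so `model.math = ...` raises AttributeError.
        return self._model_data.attrs["math"]


model = Model({"constraints": {}})

try:
    model.math = {}
    replaced = True
except AttributeError:
    replaced = False

# The property cannot stop the backing entry from being swapped out.
model._model_data.attrs["math"] = {"constraints": {"custom": {}}}
```

After this runs, `replaced` is `False`, yet `model.math` returns the new dictionary. The property also does nothing to prevent in-place edits such as `model.math["constraints"]["x"] = ...`, which is why the dictionaries themselves would need freezing.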

@brynpickering
Member

Options for making these properties read-only:

  1. Use something like frozendict to make the dictionaries linked to the properties un-settable.
  2. Pretty-print the dictionary instead of providing it as a dictionary object (e.g. print(yaml.dump(m.config.init.as_dict()))).
  3. Add some switch in AttrDict to allow freezing (i.e. just suppressing the dictionary setter in there somehow).
  4. Turn the AttrDict into a set of nested ModelConfig objects that you can get (model.config.init.name) but cannot set.
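
For a feel of option 1 without a third-party dependency, the sketch below uses the standard library's `types.MappingProxyType` in place of frozendict. The `freeze` helper is hypothetical and the config keys are illustrative; it is not how AttrDict works.

```python
from types import MappingProxyType


def freeze(obj):
    """Recursively wrap dicts in read-only views and turn lists into tuples."""
    if isinstance(obj, dict):
        return MappingProxyType({k: freeze(v) for k, v in obj.items()})
    if isinstance(obj, list):
        return tuple(freeze(v) for v in obj)
    return obj


config = freeze({"init": {"name": "demo"}, "build": {"mode": "plan"}})

print(config["init"]["name"])  # demo

blocked = False
try:
    config["init"]["name"] = "other"
except TypeError:
    # mappingproxy objects do not support item assignment.
    blocked = True
```

Since `freeze` builds new inner dictionaries, later changes to the original input do not leak into the frozen view. Attribute-style access (`config.init.name`) would still need something like options 3 or 4.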

@irm-codebase
Contributor Author

Python handles this automatically by creating a getter, but no setter. We already do this for model.is_solved.
I plan to follow a similar approach in the re-try.

@irm-codebase irm-codebase reopened this Jun 20, 2024
@irm-codebase
Contributor Author

irm-codebase commented Jun 20, 2024

@brynpickering Retry!
This time I kept things simple: I've added getter-only @property methods that let users see the model math, defaults and configuration, but not modify them.

In theory, we could define "protected" @property versions of these to make our code leaner (i.e., not always having to use self._model_data.attrs["yaddayadda"] for everything). For example, model._math could have a setter and a getter, while model.math only has a getter.

I've decided against this to keep the PR simple, and because it's a nice-to-have. Let me know if you want this included.
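
For reference, the "protected" variant mentioned above might look roughly like this. It is a sketch under the assumption that the math lives in `_model_data.attrs["math"]`; `FakeDataset` and `Model` are hypothetical stand-ins.

```python
class FakeDataset:
    """Stand-in for xarray.Dataset; only its `attrs` dict matters here."""

    def __init__(self):
        self.attrs = {}


class Model:
    """Hypothetical sketch; not the real calliope.Model."""

    def __init__(self, math):
        self._model_data = FakeDataset()
        self._math = math  # goes through the protected setter

    @property
    def _math(self):
        return self._model_data.attrs["math"]

    @_math.setter
    def _math(self, value):
        # Internal code writes here instead of spelling out the attrs key.
        self._model_data.attrs["math"] = value

    @property
    def math(self):
        # Public, getter-only view of the same data.
        return self._math


model = Model({"constraints": {}})
model._math = {"constraints": {"custom": {}}}  # internal code path
```

Internal code gets a short handle for reads and writes, while `model.math` stays getter-only for users.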

@irm-codebase irm-codebase changed the title from "add linked property methods for math, config and defaults" to "Fix sync and memory duplication issues in model data and math / config / defaults attributes" Jun 24, 2024
@irm-codebase irm-codebase marked this pull request as draft June 24, 2024 17:43
@irm-codebase irm-codebase changed the title from "Fix sync and memory duplication issues in model data and math / config / defaults attributes" to "Fix sync issues in model math / config / defaults attributes" Jun 24, 2024
@irm-codebase irm-codebase marked this pull request as ready for review June 24, 2024 18:03
@irm-codebase
Contributor Author

This PR is related to #617 too. Unfortunately, fixing that is currently breaking operate mode. It's likely that fixing that issue involves too many changes, so this PR will be kept as-is to make the review easier.

@irm-codebase irm-codebase added the v0.7 (upcoming) version 0.7 label Jun 24, 2024
@irm-codebase
Contributor Author

I've just realized that we are putting the model configuration in ANOTHER attribute: model._model_def_dict

This PR is going back to the drawing board, unfortunately.

@irm-codebase irm-codebase marked this pull request as draft June 25, 2024 09:27
@irm-codebase
Contributor Author

This PR is replaced by #625.

@irm-codebase irm-codebase deleted the fix-config-desync branch July 8, 2024 08:32
Labels
v0.7 (upcoming) version 0.7
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Potential desync in model configuration at the model.py level
2 participants