Use pydantic for the whole model definition #726
Comments
@sjpfenninger @brynpickering In general, I think the changes on the user side should be minimal. The relevant ones relate to math definition and its usage, but I believe they would be easy to understand. |
The main benefits I see to moving to pydantic are:
|
This all looks fine to me and the benefits are clear. Based on some of the detailed discussions, I had the impression that this might require undoing some of the dict flattening - but it seems like that's not really the case? |
@sjpfenninger in the proposal above, not many. There are two cases that warrant a bit more discussion, though.

Isolating 'mutable' YAML data portions

This is something I noticed while writing this summary proposal. I think it is a small, but sensible, change. At the moment the definition of 'dimension' data like

    config: # structural, contains stuff that affects code behaviour
    math: # structural, contains stuff that affects backend mathematics
    techs: # data for a dimension
    nodes: # data for a dimension
    data_tables: # structural, contains stuff that defines extraction from files

This will become troublesome once we start allowing users to define dimensions of their own, because we do not know the names they might use! Over time we might want to allow users to define parameters in the YAML for these dimensions too. Ideally, we want to avoid mixing known and unknown names at the same level of the schema. My suggestion is to define this type of 'YAML data' under a known key, so our schema / math flexibility can develop sensibly over time without further changes. Even if we do not allow new dimensions, this change ensures our schema is 'future proof' against this mutability.

    config: # structural, contains stuff that affects code behaviour
    math: # structural, contains stuff that affects backend mathematics
    data: # structural, contains data definition in YAML
      techs: # data for a dimension
      nodes: # data for a dimension
      vintages: # data for a future 'vintages' dimension
    data_tables: # structural, contains stuff that defines extraction from files

Isolating 'mutable' user passthrough data

This has already been discussed here #717 (comment) and is similar to the case above. I'm only mentioning it for completeness. Something similar happens when allowing 'passthrough' user data (#709). Some of it relates to numerical parameters, some to 'logic' in the math, and some to data users just want to specify. Ideally, you do not want to mix them, both to reduce possible user mistakes and to avoid complex logic on our side. This might be particularly important for the

    techs:
      pv:
        flow_out_eff: 0.3 # numerical parameter in the opt. backend
        base_tech: supply # used in math 'logic', like where:
        model_name: Vertex S+ # user provided stuff

Allowing this 'mix' is possible (@brynpickering already gave a good approach here #717 (comment)), but this 'mixing' is still kind of dangerous from a software robustness perspective. It's OK to keep our 'flat' structure as long as we are conscious that it could cause trouble in the future.
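To make the 'known key' suggestion above more concrete, here is a minimal sketch of how it could look on the pydantic side. It assumes pydantic v2. The class name `ModelDefinition` and the loose `dict[str, Any]` field types are placeholders made up for illustration, not the design agreed in this issue. The point is only that the top level can forbid unknown keys while `data` accepts arbitrary dimension names.

```python
from typing import Any

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class ModelDefinition(BaseModel):
    """Hypothetical top level: only known, structural keys are accepted."""

    # Reject any top-level key that is not declared below.
    model_config = ConfigDict(extra="forbid")

    config: dict[str, Any] = Field(default_factory=dict)
    math: dict[str, Any] = Field(default_factory=dict)
    # Dimension names (techs, nodes, vintages, ...) are not known in advance,
    # so they live one level down, as free-form keys of `data`.
    data: dict[str, dict[str, dict[str, Any]]] = Field(default_factory=dict)
    data_tables: dict[str, Any] = Field(default_factory=dict)


# A user-defined dimension under `data` validates without schema changes.
definition = ModelDefinition.model_validate(
    {
        "data": {
            "techs": {"pv": {"flow_out_eff": 0.3, "base_tech": "supply"}},
            "vintages": {"2030": {}},
        }
    }
)

# The same dimension at the top level is rejected as an unknown key.
try:
    ModelDefinition.model_validate({"vintages": {"2030": {}}})
    top_level_rejected = False
except ValidationError:
    top_level_rejected = True
```

With this layout, a typo in a structural key (say `data_table:`) fails validation instead of being silently treated as a new dimension.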
|
Alright. With that, I guess we can use the design in this document (with the small change above) as the goal for all If something comes up, let's try to put it here. There is enough spam in the issues (of which I'm the culprit in many cases). |
Agree that a Keep in mind that top-level parameters need to be defined somewhere. The most viable place is under your proposed |
@brynpickering What I meant to convey is that maybe in the future our YAML logic is mature enough to describe any kind of LP / MILP problem without hardcoded code for
tl;dr: I think we generally agree. The |
What can be improved?
This feature request summarizes what I think the 'ideal' pydantic implementation should look like. The idea is to collate all discussions in issues (#662, #709, #642, #637, #626, #619) and PRs (#717, #712, #704) related to model definition in YAML files into one design and to discuss it in one place.

Goals
Principles behind this proposal:
calliope models are defined post v0.7 release.
pydantic structure.
model.init is impossible).

General design
Thanks to PR #719, we now have a single place that outputs the model_definition calliope will use to build a model.

The idea is to use pydantic to validate the structure around this point: after scenarios: / overrides: / templates: are resolved, and right before large data files are brought in. This means the validated structure only relates to this specific model instance, avoiding overly complicated structures like overrides:, template: and templates:.

The pydantic model would look something like this (see here for initial discussion). I detail each component in a subsection.

AttrStr
This just ensures strings follow the same REGEX pattern rules previously outlined in our YAML schemas. It has been proved to work in #717.
CalliopeConfig
This reflects the recent update by @brynpickering in #704. It basically contains all our possible model configuration values for the init, build and solve stages. You can read more about it in that PR.

CalliopeMath
The goal of this section is to get around several limitations of our current approaches, and build upon #639, #642 and #712. Namely:
dimensions and parameters in the YAML files, not in YAML schemas.

For simplicity, I assume that math files are still separate from the main YAML. There are two important changes, however:
init(), but they are not combined.
config.init: build() (order matters).

pydantic then validates each math file, replacing math_schema.yaml. The goal is not to validate if the math is correct (that's the job of the backend), just that the structure of a given math file is valid.

Tech and Node
The goal of these two is to replace their definitions in config_schema.yaml, as well as to add support for descriptive/passthrough data (#709). The idea is to implement the suggestion @brynpickering gave here.

Here is an excerpt from his suggestion for Tech (something similar could be used for Node).

DataTable
This pydantic schema would replace data_table_schema.yaml. PR #717 already proved this works here.

Here is an excerpt.
Version
v0.7.0