-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adjust the processing order of tfm tables #251
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good from a high level, just had some questions. 🙏
@@ -485,6 +485,20 @@ def run(args: argparse.Namespace) -> str | None: | |||
|
|||
model.files.update([Path(path).stem for path in input_files]) | |||
|
|||
processing_order = ["base", "subres", "trade", "demand", "scen", "syssettings"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be defined inside the DataModule class? Just so that if we add another enum value to it, we might be reminded to also add it to this list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds like a good idea! Would you be up for including this in the refactoring of the class in a subsequent PR or do you think we should change it already?
for item in { | ||
DataModule.module_name(path) | ||
for path in input_files | ||
if DataModule.module_type(path) == data_module |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we raise a warning/error if there are any files in input_files
that don't match this if condition for any data_module in processing_order
? I'm wondering if there might be a bug in the future where we add a new item to DataModule
but forget to add it to processing_order
, and then we skip processing some input files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea!
updates.iterrows(), | ||
total=len(updates), | ||
desc=f"Applying transformations from {Tag.tfm_ins_txt.value}", | ||
for data_module in model.data_modules: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't review this loop in detail, because it looked like you just took the old code and moved it under the loop and changed the order in which different tags are processed? Was there any other logical change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it was just that. :-) Hopefully we could process the rows in parallel in one of the next PRs!
This PR adjusts the processing order of tfm tables to bring it closer to Veda and adds
model.data_modules
to enable #240 .Contribute to #41