Include optimization parameter #94
Let me quickly explain why I raised the minimum pandas version here: since we're using pandas for the upsert-functionality of

    # Gather all indexsets/columns
    index_list = [column.name for column in parameter.columns]
    # Convert existing data to DataFrame
    existing_data = pd.DataFrame(parameter.data)
    # Set the index to be constructed as each "key", i.e. row of indexset-values
    # This allows quick comparison of keys for existence and only updates what
    # we actually want to update, i.e. values and units.
    # Can't do that for empty dataframes, though (i.e. first addition of data
    # to a parameter)
    if not existing_data.empty:
        existing_data.set_index(index_list, inplace=True)
    # Upsert the new data
    parameter.data = (
        data.set_index(index_list).combine_first(existing_data).reset_index()
    ).to_dict(orient="list")

I don't see any issue with that code, but pandas < 2.1.0 does: if

(Going to make exactly the same commit on #97 and #98, too.)
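For reviewers who want to see what the upsert does on concrete data, here is a minimal standalone sketch. The function name `upsert`, the column names, and the sample rows are made up for illustration and are not part of the PR; the body simply mirrors the snippet above and assumes pandas >= 2.1.0.

```python
from __future__ import annotations

import pandas as pd


def upsert(existing: dict, data: pd.DataFrame, index_list: list[str]) -> dict:
    """Hypothetical helper mirroring the add_data snippet above."""
    existing_data = pd.DataFrame(existing)
    # An empty frame has no index columns to set (first addition of data)
    if not existing_data.empty:
        existing_data.set_index(index_list, inplace=True)
    # Rows in `data` win; keys only present in `existing` are kept
    return (
        data.set_index(index_list).combine_first(existing_data).reset_index()
    ).to_dict(orient="list")


# Illustrative data only: two indexsets form the key of each row
existing = {
    "Indexset 1": ["a", "b"],
    "Indexset 2": ["x", "y"],
    "values": [1.0, 2.0],
    "units": ["kg", "kg"],
}
new = pd.DataFrame(
    {
        "Indexset 1": ["b", "c"],
        "Indexset 2": ["y", "x"],
        "values": [20.0, 3.0],
        "units": ["t", "t"],
    }
)
result = upsert(existing, new, ["Indexset 1", "Indexset 2"])
```

In this sketch, the row keyed `("b", "y")` is overwritten with the new value and unit, `("a", "x")` is kept untouched, and `("c", "x")` is added. Note that `combine_first` returns the rows sorted by key and the non-index columns in sorted order, which is in line with the commit note about the default column order changing.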
Alright, so this LGTM from a code point of view. I noticed I'm starting to have trouble discerning all the boilerplate code from the important stuff.
I hope I can get to some bigger refactoring soon; the tests now take very long to run, and it's probably smart to encapsulate some stuff or move it to scse-toolkit instead of testing it for each data type (I'm thinking filters and docs, maybe).
Thanks for the review :) I would likely merge this PR then, if you change your review to "approve", as @danielhuppmann doesn't seem to have enough time to review it. No one will immediately start using it, anyway, so I think the most time-efficient version for @danielhuppmann is to review #101 and/or #108, which showcase most of the added syntax. If this is not to our liking, we can still introduce clean-up PRs before merging the tutorials to

Finally, @meksor, may I recommend reviewing #97 and #98 soon, too, if time permits? Both PRs are similar to this one and even more so to one another, so doing it soon would keep you from having to chew through all the boilerplate again, as you probably still remember the parts that are important.
* Covers:
  * run__id, data, name, uniqueness of name together with run__id
* Adapts tests since default order of columns changes
* Make Column generic enough for multiple parents
* Introduce optimization.Parameter
* Add tests for add_data
* Enable remaining parameter tests (#86)
* Enable remaining parameter tests
* Include optimization parameter api layer (#89)
* Bump several dependency versions
* Let api/column handle both tables and parameters
* Make api-layer tests pass
* Include optimization parameter core layer (#90)
* Enable parameter core layer and test it
* Fix things after rebase
* Ensure all intended changes survive the rebase
* Adapt data validation function for parameters
* Allow tests to pass again
* Include optimization parameter basis (#79)
* Fix references to DB filters in docs
* Streamline naming in tests
* Fix and test parameter list and tabulate for specific runs
* Make indexset-creation a test utility
* Incorporate changes from #110
* Use pandas for updated add_data behaviour
* Raise minimum pandas version to enable add_data upsert
* Generalize UsageError for more optimization items
* Use generalized UsageError for Table
* Use own errors for Parameter
This is the final PR bringing all changes regarding `optimization.Parameter` together. Originally, the goal was to have this PR be very quick because all precursors had been reviewed, but this didn't happen. After merging #82, I had to rework all these branches because, for some reason, the propagation of changes upon rebasing did not happen as I expected. During that reworking phase, some errors were introduced, and I didn't want to fix these errors individually on all precursor PRs, which is why I collapsed all of them into this one PR, which is now quite large (again), sorry for that.

To spare you the trip to the old PRs, let me summarize some discussions here:

* `DeprecationWarning` arising from the httpx code, it seems. When I discovered it, I thought it should be fixed by `starlette 0.37.2`, which was allowed by `fastapi 0.111.0`, but this doesn't seem to be the case. I'm not entirely sure why or how urgent it is to remove this warning.
* `ixmp.__init__` gains some imports, which @danielhuppmann did not like too much. We can potentially remove all except for `Platform` and `Datapoint`, but this is probably going into subsequent removal-PRs (see also Clean up top-level namespace #84).
* `Table` and `Parameter` are very similar, as will be `Variable` and `Equation`. For every single layer (but @danielhuppmann asked specifically for the Core layer first), these could inherit from each other (or be composed elegantly) to avoid repetition of very similar or identical code. I agree that this is a very useful step and I already have an idea for the Core layer. I would like to implement this pretty much right away, but in a new PR, to keep it manageable.

Working on the transport tutorial, I noticed that I had completely forgotten about adding `parameters = db.relationship()` in the `Run` DB-table. I have even run migrations for these other PRs, but no error came up. I'm wondering if we even need these `relationship()` declarations at all.
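Coming back to the point about `Table` and `Parameter` sharing code: one possible shape for this, purely as a sketch and not the idea I have in mind for the Core layer, is a common base class whose subclasses only declare which extra columns they need. Everything below (`OptimizationItem`, `extra_columns`, the local `UsageError` stand-in, and the simplified append-only `add_data`) is hypothetical and only illustrates how the repetition could be avoided.

```python
from __future__ import annotations


class UsageError(Exception):
    """Hypothetical stand-in for the project's own error type."""


class OptimizationItem:
    """Hypothetical shared base: data is a dict of equally long columns."""

    # Columns a subclass requires in addition to its indexset columns
    extra_columns: tuple[str, ...] = ()

    def __init__(self, name: str, index_columns: list[str]) -> None:
        self.name = name
        self.index_columns = index_columns
        self.data: dict[str, list] = {}

    def add_data(self, data: dict[str, list]) -> None:
        # Shared validation: exactly the expected columns, all the same length
        expected = set(self.index_columns) | set(self.extra_columns)
        if set(data) != expected:
            raise UsageError(
                f"{type(self).__name__} '{self.name}' expects columns "
                f"{sorted(expected)}, got {sorted(data)}"
            )
        if len({len(column) for column in data.values()}) > 1:
            raise UsageError("All columns must have the same length")
        # Simplification: plain append instead of the upsert used in this PR
        for key, column in data.items():
            self.data.setdefault(key, []).extend(column)


class Table(OptimizationItem):
    """Holds indexset columns only."""


class Parameter(OptimizationItem):
    """Additionally requires values and units for every key."""

    extra_columns = ("values", "units")


table = Table("Table 1", ["Indexset 1"])
table.add_data({"Indexset 1": ["a", "b"]})

parameter = Parameter("Parameter 1", ["Indexset 1"])
parameter.add_data({"Indexset 1": ["a"], "values": [1.0], "units": ["kg"]})
```

With this layout, `Variable` and `Equation` would presumably only need their own `extra_columns`, while validation and error handling live in one place. Composition instead of inheritance would work just as well for this sketch.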