Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Address failing nightly workflow #591

Closed
LauWien opened this issue Apr 27, 2022 · 20 comments · Fixed by #613
Closed

Address failing nightly workflow #591

LauWien opened this issue Apr 27, 2022 · 20 comments · Fixed by #613
Labels
ci Continuous integration

Comments

@LauWien
Copy link
Contributor

LauWien commented Apr 27, 2022

As first mentioned by @khaeru in the #message_general slack channel, the nightly workflow is failing possible after merging #561, please see here.
Please find further information under the thread in #message_general here.

Taking the liberty to copy @behnam-zakeri tip to check weather the failing nightly is related to the merged PR:

Before this PR, if there was capacity_factor of zero in the input data, the formulation would convert it to 1, because GAMS couldn't understand if this is a zero or a missing value. Our GAMS formulation makes missing capacity_factor values equal to 1, and still does that. But after this PR, zero values are not considered missing values, and should remain zero. What we can check is that if there is any input capacity_factor equal to zero in this scenario or not. This can be checked like:

df = sc.par("capacity_factor")
assert df.loc[df["value"] == 0].empty
@LauWien LauWien added the ci Continuous integration label Apr 27, 2022
@behnam-zakeri
Copy link
Contributor

I tested this on one of our recent ENGAGE scenarios (ixmp_dev\ENGAGE_SSP2_v4.1.7\baseline) and found interesting results. I ran the scenario (A) without the changes introduced in #561 and (B) once after those changes.

  1. The objective value, which was the cause of the error in nightly tests, remains almost the same between A and B scenarios (3095978 versus 3095977, respectively).
  2. There are indeed differences, as expected, in the parameter capacity_factor calculated in GAMS. So, these may be the cause of nightly tests not passing. These changes are as expected those technologies with capacity_factor=0 in the input file, which will remain the same value in the new formulation, while they were converted to 1 in the old formulation. New formulation is correct here.
  3. Another observation, which is not related to this issue, is that the GAMS formulation in both cases omits some technologies from the final calculated capacity_factor, which do have capacity_factor defined by the user. Examples are "csp_sm1_ppl/AFR". These technologies do not have input, output and relation_activity, and that's why they are not in the set map_tec_time, and get omitted from capacity_factor in this equation. I believe this is a bug and needs a separate issue, as these technologies at the moment don't appear in capacity equations. I will open the issue.
  4. We need to check and perhaps the nightly scenarios. But this PR Resolve the issue of zero capacity factor #561 is doing the right thing for what it was designed.

@LauWien
Copy link
Contributor Author

LauWien commented Apr 27, 2022

Thank you for checking @behnam-zakeri!
As for 1. and 2., if I understand correctly, we can update the objective of the nightly accordingly, since it is expected behavior and we will tackle 3. in a different issue / PR?

@behnam-zakeri
Copy link
Contributor

behnam-zakeri commented Apr 27, 2022

As for 1 and 2, we need to update the scenarios in the nightly tests and then update the values in checks. For 3, I opened this issue: #592

@khaeru
Copy link
Member

khaeru commented Apr 28, 2022

One point of information: the nightly scenarios were exported some time ago (would need to check to see precisely when). So in a way, loading them and running them is a check that newer versions of message_ix can handle scenarios created with older versions of the code: both loading & running without error and producing numerically identical output.

This requirement is not actually super well defined, e.g. I wouldn't expect scenarios created with versions as old as message_ix v1.x or even 2.x to pass (since we have fixed several bugs in the GAMS code since those). So this is also an opportunity to consider more carefully what precisely we expect and maybe adjust the tests to reflect that.

@khaeru
Copy link
Member

khaeru commented Apr 28, 2022

Per Behnam's point (1), this tells us there is something specific about the particular scenario we use for the nightly tests that makes its objective value sensitive to this recent change. This characteristic is not in the other, ENGAGE scenario.

It would be nice to try to identify what this is, so we could warn users: “if your scenarios have such-and-such characteristics, outputs may change” etc. However I also realize that might take quite some detective work that we may not have bandwidth for.

@behnam-zakeri
Copy link
Contributor

Thanks Paul. Good points. I agree that we should test some old scenarios for backward compatibility. Unless we change the code or resolve a bug as this PR does. Then we can update the old check values if the scenario itself is still valid. Especially in this case, as the change of the objective function is significant, I agree that a more detailed investigation can be useful.

@khaeru khaeru mentioned this issue May 5, 2022
4 tasks
khaeru pushed a commit that referenced this issue May 6, 2022
@OFR-IIASA
Copy link
Contributor

As for point 3, the map_tec_time may need to be extended to also include relation_new_capacity and relation_total_capacity. This is where we see entries for the technology "csp_sm1_ppl".

@gidden
Copy link
Member

gidden commented May 10, 2022

I opened #603 as one suggestion of how to deal with these issues in the future. To discuss at our message meeting.

@LauWien
Copy link
Contributor Author

LauWien commented May 10, 2022

I opened #603 as one suggestion of how to deal with these issues in the future. To discuss at our message meeting.

Thank you for this :)
If this PR is still to be discussed / WIP, please could you convert it to a draft and remove me as a reviewer or should I approve this PR to see if the on-approve workflow runs?

@gidden
Copy link
Member

gidden commented May 10, 2022

should I approve this PR to see if the on-approve workflow runs?

That was my thinking, I should have said so explicitly, sorry!

@LauWien
Copy link
Contributor Author

LauWien commented May 10, 2022

It looks like it works (Link).

@LauWien
Copy link
Contributor Author

LauWien commented May 11, 2022

should I approve this PR to see if the on-approve workflow runs?

That was my thinking, I should have said so explicitly, sorry!

Ah not at all :)

@OFR-IIASA
Copy link
Contributor

OFR-IIASA commented May 24, 2022

I have run some tests comparing the latest version and the commit 2045b07
The tests are run for the ENGAGE_SSP2_v4.1.7 baseline and two mitigation cases.
For each scenario, 4 tests are conducted: for each MESSAGEix version, the scenarios are run with and without macro. The objective value and the number of macro iterations are provided.

"baseline":

  • 2034b07 wo macro: 3,095,977.07472038
  • 2034b07 w macro: 3,095,977.07472038 (1)
  • Latest wo macro: 3,095,977.07472038
  • Latest w macro: first iteration solved -> infeasible after 3.5hours and 2nd iteration.

Differences in output gdx (womacro) -> capacity factor entries in LAM for loil_ppl for activity years 1985.
Reporting (womacro) -> no observable differences.

"NPi2020_1000-con-prim-dir-ncr":

  • 2034b07 wo macro: 2,907,578.85136185
  • 2034b07 w macro: 3,469,719.73511341 (11)
  • Latest wo macro: 2,907,578.85136185
  • Latest w macro: 2,672,530.52296623 (15)

"NPi2020_400-con-prim-dir-ncr":

  • 2034b07 wo macro: 5,725,710.5601454
  • 2034b07 w macro: 4,701,228.85651967 (10)
  • Latest wo macro: 5,725,710.5601454
  • Latest w macro: 3,766,013.00718128 (10)

For both mitigation scenarios, with macro, while overall trends are similar between the version 2034b07, there are some significant differences in indicators. For some regions e.g. in 2100 global:

  • industrial-specific demand is 78 EJ (latest-version) instead of 67 EJ.
  • res-com specific demand is 20EJ higher using the latest message_ix version.

@OFR-IIASA
Copy link
Contributor

@khearu and @behnam-zakeri I have noticed that we may have the same issue here as already highlighted in an earlier PR. The issue is that enforce() in models.py is only called when the solving with solve(model="MESSAGE"), it seems to be missing here.

@khaeru
Copy link
Member

khaeru commented May 25, 2022

Good catch. We need to be careful with inheritance of those classes (i.e. the MESSAGE_MACRO Python class needs to do MESSAGE things and MACRO things), and I think this slipped through.

If you copy MESSAGE.enforce to MACRO.enforce (this will not be the solution, only a check) does that help?

@OFR-IIASA
Copy link
Contributor

OFR-IIASA commented May 25, 2022

yes this solves the issue. In the gdx file, is_capacity_factor is now populated. The baseline objective value is the same and converges after 1 iteration. The fact that it wasnt populated, also explains why the gdx-input files showed no differences between and older and a newer version of the MESSAGEix.

@khaeru
Copy link
Member

khaeru commented May 25, 2022

Great, thanks for confirming. The fix should be fairly simple then (ensure the MESSAGE_MACRO class gets the enforce() method, which should be possible by multiple inheritance), but we would need to add a simple test that checks it's so.

@OFR-IIASA
Copy link
Contributor

I have added the quickfix to a test PR and the nightly tests have also passed

@OFR-IIASA
Copy link
Contributor

Upon resolving this issue, I would suggest to make a new mini-release v3.5.1?

@khaeru
Copy link
Member

khaeru commented May 25, 2022

Upon resolving this issue, I would suggest to make a new mini-release v3.5.1?

Agreed we should release quickly to fix the bug.

Ordinarily this would be v3.5.1, yes, but a release will include the changes from #572 that are now on main. That marked some deprecations and—as mentioned at https://github.com/iiasa/message_ix/pull/572#issuecomment-1130400005—the Semantic Versioning specification #7 says “[Minor version Y (x.Y.z)] MUST be incremented if any public API functionality is marked as deprecated.”

Following this logic, the next release will be v3.6.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ci Continuous integration
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants