Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend feature vintage_and_active_years() #572

Merged
merged 31 commits into from
May 25, 2022
Merged

Extend feature vintage_and_active_years() #572

merged 31 commits into from
May 25, 2022

Conversation

OFR-IIASA
Copy link
Contributor

@OFR-IIASA OFR-IIASA commented Mar 24, 2022

This PR extends the features of the function vintage_and_active_years() based on issue #571

How to review

  • Read the diff and note that the CI checks all pass.
  • Build the documentation and look at a description of vintage_and_active_years()
  • Ensure that changes/additions are self-documenting, i.e. that another
    developer (someone like the reviewer) will be able to understand what the code
    does in the future.

PR checklist

  • Continuous integration checks all ✅
  • Add or expand tests; coverage checks both ✅
  • Add, expand, or update documentation.
  • Update release notes.

@OFR-IIASA OFR-IIASA added the enh New features & functionality label Mar 24, 2022
@OFR-IIASA OFR-IIASA linked an issue Mar 24, 2022 that may be closed by this pull request
@OFR-IIASA
Copy link
Contributor Author

@khaeru I have revised the function vintage_and_active_years() to meet the requirements of existing and new tests. Note that the existing tests for this function have been (temporarily?) carried over to test_feature_vintage_and_active_years.py. Hence it is not unexpected for the core tests to fail until the revisions are complete. I would wait for your feedback on the function revision itself before finalizing the clean-up of tests and adding the remaining documentation to the the function.

@adrivinca
Copy link
Contributor

Thanks for this PR, I was actually looking at this issue and almost developing something along the lines of Option 2 mentioned in Issue #571
I wonder if the function that also loads the 'technical_lifetime' also works if no single year is given as input. I do not find it particularly useful to retrieve the vtg-act year combination for a single vtg year (e.g. 2000), I rather need the whole mapping.

but looking at the current code it is not so clear to me if it takes into account the above-mentioned case. So would it be possible to run it with ya_args=("foo", "bar") only?

@OFR-IIASA
Copy link
Contributor Author

OFR-IIASA commented May 11, 2022

@adrivinca yes, the tests if have this use-case as a test.
you can see scen.vintage_and_active_years(ya_args=("foo", "bar")) which will return all vintages and act_years if they are IN the model-horizon (i.e. >= firstmodelyear), scen.vintage_and_active_years(ya_args=("foo", "bar"), in_horizon=False) will provide all vtg and active years irrespective of the firstmodelyear, scen.vintage_and_active_years(ya_args=("foo", "bar"), vtg_lower=2010) will return all vintages/active years AS OF vtg_lower and scen.vintage_and_active_years(ya_args=("foo", "bar"), act_lower=2020) will then filter for all active_years >= act_lower. This covers I hope most usecases.

@adrivinca
Copy link
Contributor

Thanks! I cloned the branch and tried it, it seems ok and the tests as well.
is there any particular review process needed? or could you @OFR-IIASA please rebase the branch and see if tests on Github pass?

@adrivinca
Copy link
Contributor

It seems the tests fail on mac, do you understand why? I don't really

@OFR-IIASA
Copy link
Contributor Author

@adrivinca the reason for these test failing was because this was a test carried out in test_core and based on the original function. The tests have been moved, adjusted and are included in a separate file. I have now removed the obsolete tests from test_core.

@codecov
Copy link

codecov bot commented May 17, 2022

Codecov Report

Merging #572 (0fedf74) into main (b572cce) will increase coverage by 0.1%.
The diff coverage is n/a.

@@           Coverage Diff           @@
##            main    #572     +/-   ##
=======================================
+ Coverage   92.8%   93.0%   +0.1%     
=======================================
  Files         42      43      +1     
  Lines       3287    3375     +88     
=======================================
+ Hits        3052    3139     +87     
- Misses       235     236      +1     
Impacted Files Coverage Δ
message_ix/message_ix/core.py 95.1% <0.0%> (-0.1%) ⬇️
message_ix/message_ix/tests/test_core.py 100.0% <0.0%> (ø)
..._ix/tests/test_feature_vintage_and_active_years.py 100.0% <0.0%> (ø)

@OFR-IIASA
Copy link
Contributor Author

@adrivinca and @khaeru I have made one last adjustment to the PR and added a test for it. Ideally, this function will replicate the behvior of map_tec_lifetime in that the returned activity years for any given vintage wont exceed the maximum technical_lifetime defined for a technology. The documentation has also been updated as have the release notes.

@OFR-IIASA OFR-IIASA requested a review from adrivinca May 18, 2022 12:47
@khaeru
Copy link
Member

khaeru commented May 18, 2022

@adrivinca asked me to take a look at this. I can check more extensively perhaps tomorrow or Friday, but at a skim of the thread I'm concerned about this statement:

the reason for these test failing was because this was a test carried out in test_core and based on the original function. The tests have been moved, adjusted and are included in a separate file. I have now removed the obsolete tests from test_core.

Please read:

  • pandas' deprecation policy, which we essentially follow.
  • Item 7 in the semantic versioning spec: “[Minor version Y (x.Y.z)] MUST be incremented if any public API functionality is marked as deprecated.”
  • …which is linked from our own versioning policy, in which we again restate “A minor version increment may fix bugs or add new features, but does not change existing functionality.”

To expand:

  • A user with code that calls any message_ix function (e.g. vintage_and_active_years()) must experience no surprises when upgrading from a ‘current’ version (e.g. 3.5.0) to the ‘next’ (e.g. 3.6.0 or 4.0.0). That is, their code must not start to produce different results, without prior warning, simply because of an upgrade.
  • If we plan that the (default) behaviour of a function will change, then the next version must warn about the upcoming change, but not change the behaviour.
    • This gives the user a clear signal and time (at least one release cycle) to adjust their code. The release notes must describe what the upcoming change will be, and should give tips on how to handle it.
  • Then, in a subsequent major version (e.g. 4.0.0), the change can become effective/the default, and the warning is removed. (SemVer spec item 8: “Major version X (X.y.z) MUST be incremented if any backwards incompatible changes are introduced to the public API.”)

See some examples in Pandas: https://pandas.pydata.org/pandas-docs/stable/whatsnew/v1.4.0.html#deprecations

Sometimes it takes a little thinking, but it is always possible to introduce new functionality in a non-breaking way; often with an added optional argument. If existing tests fail, that means we are introducing breaking changes, and we can't respond by simply removing the tests.

@OFR-IIASA
Copy link
Contributor Author

@khaeru thanks for the comments. The old tests have been moved from the core to a separate file. If you look at the differences, then the check_dtype=False was added to one check and the tests which looks at "# Exception is raised for incorrect arguments" naturally needs to be changed because the message is changed. All the old tests still exist/pass.

@khaeru
Copy link
Member

khaeru commented May 23, 2022

Okay, I've streamlined the tests here; about 2:45 of work. Some points plus comments inline.

  • When using add_horizon(), it's not necessary to manually set duration_period since Set duration_period in add_horizon()  #286 (August 2020).
  • The general principle is “DRY” (don't repeat yourself):
    A. Some code differs from one test to the next. This represents crucial differences in the invocation and expected outcomes.
    B. A lot of the code does not.
    The more (B) there is, the harder it is to see (A). So it really helps to hide all the repeated bits away in one or a few utility functions—this makes the tests more legible to oneself and other developers.
  • In this case the length of the new file is cut by about half, plus see the comments.

I'll next start to look at the actual implementation and the documentation of the method.

@khaeru
Copy link
Member

khaeru commented May 23, 2022

@OFR-IIASA in 292edf4 I add some tests/examples showing how the functionality of the added act_lower and vtg_lower arguments can be handled using pandas built-ins.

Kind of like make_df() in #415 (“helpful messages pointing users to existing Python/pandas functions that perform identical functions”), I think it is better to teach users how to use pandas more fully than to give ourselves the burden of maintaining/supporting new functionality. Another advantage is that the user doesn't need to memorize a new argument name; they can just .query() using the usual column names.

Is this acceptable? If so, I'll remove the added arguments.

@khaeru
Copy link
Member

khaeru commented May 24, 2022

@OFR-IIASA aside from the persistent dtype issues on Windows, which I will fix, this is complete per our discussion. Please have a look and comment if you agree.

@khaeru khaeru force-pushed the issue/571 branch 2 times, most recently from d90a27d to 4b980e9 Compare May 24, 2022 22:03
@OFR-IIASA
Copy link
Contributor Author

@khaeru thanks for the formidable rework. Everything is as we agreed.

@khaeru khaeru merged commit 9566826 into main May 25, 2022
@khaeru khaeru deleted the issue/571 branch May 25, 2022 09:35
@khaeru khaeru added this to the 3.6 milestone May 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enh New features & functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Return vintage_and_active_years() for a specific technology
3 participants