Update the nexus module #106

awais307 · 2023-06-19T09:22:28Z

This PR addresses issue#93 and updates data to the nexus module, addresses the tests issue

The code has been restructured to be more test-friendly

two tests has been marked with @pytest.mark.xfail(reason="Temporary, for #106"), namely
test_report_full in test_report.py
and test_build in test_build.py
as they both require a fully parametrized scenario as a starting point, which is not the case for the "bare" scenario for tests.

Since we are about to do major revision and update to the module, we decided to postpone investing further time in those tests.

all tests seem to pass for py3.12, not for py3.11

How to review

PR checklist

Continuous integration checks all ✅
Add or expand tests; coverage checks both ✅
Add, expand, or update documentation.
Update doc/whatsnew.

glatterf42 · 2023-06-19T10:10:39Z

Currently, the linting test is failing (among others). Please see our guide for contributing code to see what code style we expect. Please also note that black, flake8, mypy, and isort can all be configured to be applied automatically whenever you save your code (the details of which can vary, but we can work it out together if you want to). This helps passing theses tests tremendously.

glatterf42 · 2023-06-19T10:13:38Z

The current issue with the other tests seems to be that they still rely on code in message_data (which they shouldn't, as explained before, since all code here should be open source):

message_ix_models/tests/model/water/test_nexus.py:5: in
from message_data.model.water import build
E ModuleNotFoundError: No module named 'message_data'

glatterf42 · 2023-06-19T10:17:12Z

Also to make sure: how large are these data files that you are updating here? If they are too large (and there are lots of them), it might be better to upload them to Zenodo or so (see the second point in the comment by @khaeru).

khaeru · 2023-06-21T08:45:58Z

Changed to "draft" status; prefer this over putting "[WIP] " in the title, as it means the same thing.
I would recommend to update the branch by "Update with rebase" and then git pull --rebase on your local.

glatterf42 · 2023-09-22T08:52:41Z

The tests are mostly working, despite what it looks like here. There are some general issues however that we need to address to fully see their potential:

The most generic problems arise from the Code quality check. This tells us that the code is not formatted correctly and that there are some issues like unsorted import blocks and unused variables. As mentioned before, tools like ruff (superseding isort and flake8) and black can be set up to automatically run whenever you save a file, eliminating most of these problems.
Some of the Code quality problems are not as easy to resolve. Things like unused variables need manual attention, even though they are simple: if you truly don't need the return value of a function, you can use an underscore: _ = some_function(). The mypy errors can be more tedious to resolve as we need to understand why the mentioned modules don't have the missing attributes and whether they should have or where else they should get them from.
As mentioned before, the tests only contain a limited number of problems:

ERROR message_ix_models/tests/model/water/test_cooling.py
ERROR message_ix_models/tests/model/water/test_utils.py
FAILED message_ix_models/tests/model/water/test_irrigation.py::test_add_irr_structure - AttributeError: 'dict' object has no attribute 'regions'
FAILED message_ix_models/tests/model/water/test_water_supply.py::test_map_basin_region_wat - AttributeError: 'dict' object has no attribute 'time'
FAILED message_ix_models/tests/model/water/test_water_supply.py::test_add_water_supply - ModuleNotFoundError: No module named 'your_module'
FAILED message_ix_models/tests/model/water/test_water_supply.py::test_add_e_flow - ModuleNotFoundError: No module named 'your_module'

For the first two FAILED, the issue is the same: the actual function is using the context object incorrectly:

    info = context["water build info"]

    if "year" in context.time:

Are the lines from the actual code. I don't know if this should work, but it's treating context in two different ways: the first reminds me of a dictionary, the second of class. In your tests, you set up a mock Context object (which would normally be a class) as a dictionary, which is why the info = ... line works, but the second one complains: you cannot access dictionary data that way. Since I doubt you want to rewrite your actual code function to use both dict and Context for the context argument, I suggest to study how other tests in this repo solve this problem. You should be able to use a test_context out of the box which you can then adapt.

The last two FAILED result from trying to call a module called your_module, which doesn't exist. You probably want to replace your_module with message_ix_model.model.water.data in both cases.
Finally, both ERRORs are due to missing imports. Both of these issues can probably be solved by addressing all mypy complaints, although one of them is trying to import message_data. This should not happen in message-ix-models and might be a sign that the clean-up process is still incomplete.

adrivinca · 2023-11-23T07:51:09Z

@awais307 can you please update on what your commits did, if anything is still missing and how you expect others to review? thanks

glatterf42 · 2023-11-29T13:44:25Z

@awais307, @adrivinca: the tests are almost working, you're on a good path there! Almost all errors seem to originate from the same problem: that you are not using a 'real' Context object. As pointed out above, using those is preferable since it makes the tests easier to understand and maintain and since you won't have to rewrite your actual code to deal with both Contexts and dicts.
With the latest commit, I adapted the test in tests/model/water/test_cooling.py to show you what I mean by that. We do have an object called test_context and we can use it just like that our test functions. Please take a look and try to adapt your other tests in a similar way, I suspect it will solve most of the remaining errors :)
If you don't want to write the same setup of the text_context all the time, you could introduce another fixture like it e.g. in the tests/model/water/__init__.py: you can define a function that takes the test_context, does whatever setup all your tests frequently share, and return the modified test_context. You can then use the pytest decorator @pytest.fixture(scope="function") to make this function work identical to test_context. Then, you can use it in all your other tests as well :)

One test error might remain, though, and that is this one:

___________________________________________ test_map_yv_ya_lt ___________________________________________

    def test_map_yv_ya_lt():
        periods = (2010, 2020, 2030, 2040)
        lt = 20
        ya = 2020
    
        expected = pd.DataFrame(
            {"year_vtg": [2010, 2020, 2020, 2030], "year_act": [2020, 2020, 2030, 2040]}
        )
    
        result = map_yv_ya_lt(periods, lt, ya)
    
>       pd.testing.assert_frame_equal(result, expected)
E       AssertionError: DataFrame are different
E       
E       DataFrame shape mismatch
E       [left]:  (8, 2)
E       [right]: (4, 2)

message_ix_models/tests/model/water/test_utils.py:124: AssertionError

The shape of the compared DataFrames won't be fixed by the different context, I assume. Please check that expected is really what we expect, e.g. year_vtg contains 2020 twice, is that on purpose?

One final note: I removed a few markers that told rust to ignore functions being too complex. Please take another look at those and try to reduce their complexity! In most cases, it is enough to take a part of the function that includes a lot of if ... elif ... else ... and move that to a new function that takes some input data and returns different output depending on the condition.

Please let me know if you need help with any of the above :)

glatterf42 · 2024-02-27T13:44:54Z

Hi @adrivinca and @awais307, I've taken another look at this PR. I started by addressing all code quality issues: applying some formatting and reducing code complexity. The latter worked quite well in parts because you already had all the puzzle pieces present: e.g. in water_for_ppl.py, your immensely complex cool_tech() function was in part only complex because you defined various helper functions inside that same function. So by moving the definitions outside, the code quickly became less complex.
Unfortunately, the same approach did not work as well for infrastructure.py. While moving stuff outside the main function also resolved the issue, there's still much room for improvements. Almost all steps in the main function that work with the input DataFrame seem to be very similar except for tiny variations, so according to the DRY principle (Don't Repeat Yourself), this code is most likely generalisable.

Next, I took a look at the failing tests. For some of them, I was able to fix them myself, though even in those cases, there are some open questions. Please check e.g. the TODOs in test_map_yv_ya_lt().
For some cases, trying to use the test_context or otherwise fixing the tests led to problems to such a degree that I didn't know how to fix them. Please search for FIXME to find descriptions of these errors with probable causes and determine how you want to fix them.

Finally, your reporting.py contains these lines:

message-ix-models/message_ix_models/model/water/reporting.py

Lines 14 to 36 in abcd614

    
           try: 
        
               from message_data.tools.post_processing.iamc_report_hackathon import ( 
        
                   report as legacy_reporting, 
        
               ) 
        
           except ImportError:  # message_data not installed 
        
               legacy_reporting = None 
        
           log = logging.getLogger(__name__) 
        
           def run_old_reporting(sc=False): 
        
               mp2 = sc.platform 
        
               log.info( 
        
                   " Start reporting of the global energy system (old reporting scheme)" 
        
                   f"for the scenario {sc.model}.{sc.scenario}" 
        
               ) 
        
               legacy_reporting( 
        
                   mp=mp2, 
        
                   scen=sc, 
        
                   merge_hist=True, 
        
                   merge_ts=False, 
        
               )

run_old_reporting() is part of your main workflow (via report_full()), so legacy_reporting is actually required to be there. My version of mypy is already marking this as an error because legacy_reporting really cannot be None, so your code still requires message_data to be present, although no code in this repo should. Please find a solution for that, e.g. by not using the legacy_reporting at all and instead using the reporting system that message_ix_models provides. At the very least, you should add a clause/flag to report_full so that run_old_reporting() is only run if message_data is present. For this, you could reuse the HAS_MESSAGE_DATA flag that already exists in the code:

message-ix-models/message_ix_models/util/common.py

Line 21 in abcd614

HAS_MESSAGE_DATA = True

adrivinca · 2024-03-18T15:59:26Z

Thanks a lot @glatterf42 for spending time on this and for the thorough overview.
I am going through the tests and the code, here the main changes I did

Taking from another commit I did in the leap-re-wks branch, I replaced all the .append() with pd.concat([]) and also another couple of parts where the code was obsolete
concerning test_map_yv the test and functions work as intended. I just don't know why, when running locally the reset_index() in the main function does not seem to work when called from another module.
I used if HAS_MESSAGE_DATA: in the reporting.py script to use the legacy reporting. Not sure the the reporting features in message_ix_models are outputting the same as legacy

Some FIXME are related to making a more general configuration for the module, and also maybe a testing config.
I would like to work on it for the SSP update, where I need anyway to change the code and for instance add the SSP config.
Therefore, I left some FIXME for the future, and I would merge them with the FIXME memo for now
I refer to the comments in test_irrigation, test_water_supply.

Other points are also open questions to me and I would like @awais307 to please intervene and fix.
in test_water_supply:

test_add_e_flow function does not make sense to me, how is the function supposed to work with fictitious regions if the files that get loaded are data files?

currently, the function is checking to match this df

   Region  value  year  time  Unnamed: 0  BCU_name
0  test_Region      1  2020  year           0  test_BCU

and df_sw in the function read_water_availability() before it breaks

     Region     years        value
0  test_BCU    Region  test_Region
1  test_BCU     value            1
2  test_BCU      year         2020
3  test_BCU      time         year
4  test_BCU  BCU_name     test_BCU

I think these functions need to be tested with actual existing data

similar for test_add_water_supply(), please look at it, because it breaks, ut not sure if it is supposed to work with the input data that you give
same for test_map_basin_region_wat()

unfortunately I am not familiar with the with patch and mocking approach @awais307 have you tried those tests you wrote?
When running pytest in the folder the al break
please also have a look to the annotation issue in test_utils.test_add_commodity_and_level(). and test_utils.test_read_config

glatterf42 · 2024-04-02T05:53:47Z

@awais307 I rebased this PR on top of the latest main branch to keep the linear commit history and used this opportunity to make two functions less complex. You should now be able to use the latest commit here, make changes and merge to main without 'out-of-date' issues; please let me know if you can't.

glatterf42 · 2024-04-03T06:41:08Z

@awais307, I'm trying to keep the commit history linear by using git rebase main. A linear history means it will be very easy to understand how and why which changes were introduced and the changes will integrate with the rest of the code base neatly. So if you see that this branch is out-of-date with the main branch, I'd appreciate if you could also use git rebase instead of git merge :)
If you don't feel comfortable using git rebase, please check out e.g. this tutorial.

awais307 · 2024-04-07T19:00:14Z

@awais307, I'm trying to keep the commit history linear by using git rebase main. A linear history means it will be very easy to understand how and why which changes were introduced and the changes will integrate with the rest of the code base neatly. So if you see that this branch is out-of-date with the main branch, I'd appreciate if you could also use git rebase instead of git merge :) If you don't feel comfortable using git rebase, please check out e.g. this tutorial.

okay, for some reason, my local tries to merge it whenever I fetch the changes.

glatterf42 · 2024-04-08T06:34:49Z

@awais307, I'm trying to keep the commit history linear by using git rebase main. A linear history means it will be very easy to understand how and why which changes were introduced and the changes will integrate with the rest of the code base neatly. So if you see that this branch is out-of-date with the main branch, I'd appreciate if you could also use git rebase instead of git merge :) If you don't feel comfortable using git rebase, please check out e.g. this tutorial.

okay, for some reason, my local tries to merge it whenever I fetch the changes.

Could you please try to configure your local git interaction not to do that, then? If you tell me how you interact with git locally, I might be able to help you. Do you use Github Desktop? GitKraken? The command line?

To get the changes from up here to your system and overwrite the changes you have locally, you might need to perform something like a "force pull", which you can do following these instructions. Please let me know if you need any help with that :)

…version

awais307 self-assigned this Jun 19, 2023

awais307 added the water MESSAGEix-Nexus (water) variant label Jun 19, 2023

khaeru marked this pull request as draft June 21, 2023 08:44

khaeru changed the title ~~[WIP] Updates to the nexus module~~ Update the nexus module Jun 21, 2023

awais307 force-pushed the nexus-upd branch from 0b4d7c6 to 8400155 Compare June 23, 2023 18:58

awais307 force-pushed the nexus-upd branch from 8400155 to 1bb96ec Compare July 26, 2023 22:34

adrivinca mentioned this pull request Aug 24, 2023

Updates to the water module, water avaialbility and country sub-annual features #115

Merged

1 task

glatterf42 force-pushed the nexus-upd branch from 43ce2a9 to 21daf50 Compare September 22, 2023 07:34

awais307 marked this pull request as ready for review November 16, 2023 21:42

awais307 requested a review from adrivinca November 16, 2023 21:42

glatterf42 force-pushed the nexus-upd branch from a1877f4 to cc95334 Compare November 29, 2023 07:53

awais307 force-pushed the nexus-upd branch from 9d89c97 to 169e34c Compare January 31, 2024 21:02

glatterf42 force-pushed the nexus-upd branch from 63aad7c to dbc9e6c Compare February 27, 2024 13:18

glatterf42 force-pushed the nexus-upd branch from 32f5405 to 462d569 Compare April 2, 2024 05:52

glatterf42 force-pushed the nexus-upd branch 2 times, most recently from 18b3648 to c3701f5 Compare April 3, 2024 06:36

glatterf42 force-pushed the nexus-upd branch from 8c74fa1 to b97b2c9 Compare April 8, 2024 06:35

glatterf42 and others added 26 commits September 11, 2024 11:35

Adjust for pandas CoW

acf9af5

Use test_context properly

701dd83

Add information for currently failing test

a755cde

Mark minimum version for map_basin_region_wat and all dependent tests

e402916

Test on actual file, not monkeypatch

db124a7

Add superficial tests for infrastructure

35d50de

Add superficial tests for water/reporting

3f17571

Rename water/reporting to water/report for consistency

b3e6c6f

Remove superfluous comments

d1e025c

Use test_context directly

70e8fd8

Migrate water/data tests to appropriate location

c67b4a9

Use same name for test as for code module

d928088

Add type hints, structure docstrings

8627f74

Add superficial tests for water/data/demands

229f530

Fixes for texts in demand and cool_tec

1375083

Cut length of comments

1bfb186

Remove test exclusions for now-tested modules

18b3350

Run add_sectoral_demands only for versions that use a fitting pandas …

49fbcf8

…version

Assign only requested columns

3109995

Update data file generated automatically through tests

693b07d

Fix column names in helper function

51b9362

Address various mypy complaints

ffd44f5

Resolve errors from mypy fixes

2cae567

marking xfail for test_build and report as they required full scenarios

5f69016

Handle KeyError in cooling_fr to avoid minimum version tag

db297cf

Apply minimum version to cool_tec

a8d9136

glatterf42 force-pushed the nexus-upd branch from 806adc2 to a8d9136 Compare September 11, 2024 09:36

Updating whatsnew

5b616f0

adrivinca merged commit ae7b63f into main Sep 12, 2024
28 checks passed

adrivinca deleted the nexus-upd branch September 12, 2024 13:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update the nexus module #106

Update the nexus module #106

awais307 commented Jun 19, 2023 •

edited by adrivinca

Loading

glatterf42 commented Jun 19, 2023

glatterf42 commented Jun 19, 2023

glatterf42 commented Jun 19, 2023

khaeru commented Jun 21, 2023

glatterf42 commented Sep 22, 2023

adrivinca commented Nov 23, 2023

glatterf42 commented Nov 29, 2023

glatterf42 commented Feb 27, 2024

adrivinca commented Mar 18, 2024 •

edited

Loading

glatterf42 commented Apr 2, 2024 •

edited

Loading

glatterf42 commented Apr 3, 2024 •

edited

Loading

awais307 commented Apr 7, 2024

glatterf42 commented Apr 8, 2024

Update the nexus module #106

Update the nexus module #106

Conversation

awais307 commented Jun 19, 2023 • edited by adrivinca Loading

How to review

PR checklist

glatterf42 commented Jun 19, 2023

glatterf42 commented Jun 19, 2023

glatterf42 commented Jun 19, 2023

khaeru commented Jun 21, 2023

glatterf42 commented Sep 22, 2023

adrivinca commented Nov 23, 2023

glatterf42 commented Nov 29, 2023

glatterf42 commented Feb 27, 2024

adrivinca commented Mar 18, 2024 • edited Loading

glatterf42 commented Apr 2, 2024 • edited Loading

glatterf42 commented Apr 3, 2024 • edited Loading

awais307 commented Apr 7, 2024

glatterf42 commented Apr 8, 2024

awais307 commented Jun 19, 2023 •

edited by adrivinca

Loading

adrivinca commented Mar 18, 2024 •

edited

Loading

glatterf42 commented Apr 2, 2024 •

edited

Loading

glatterf42 commented Apr 3, 2024 •

edited

Loading