Updating all base shapes (country_shapes, europe_shape, nuts3_shapes) #1479

Open
wants to merge 42 commits into
base: master

Conversation

Contributor

@bobbyxng bobbyxng commented Jan 3, 2025

Closes # (if applicable).

Changes proposed in this Pull Request

  • Updating all base shapes (country_shapes, europe_shape, nuts3_shapes)
  • The workflow has been modified to use higher-resolution and more harmonised shapes (NUTS3 and OSM administrative level 1, ADM1):
    • build_shapes: Uses NUTS3 2021 01M data. nuts3_shapes.geojson now includes all NUTS levels (0/country, 1-3) in preparation for another PR that will allow country-specific settings for regional clustering (based on NUTS and ADM1). Population and GDP p.c. data are now included for all PyPSA-Eur regions.
    • retrieve_osm_boundaries: Added to retrieve country shapes directly from OpenStreetMap using Overpass turbo. Used only for the countries not covered by NUTS3 2021 (BA, MD, UA, XK). Well-known shape datasets such as geoboundaries and gadm each have their own issues (offset country borders, large gaps, or the wrong ADM level in the files).
    • build_osm_boundaries: Builds ADM1-level shapes/boundaries (BA, MD, UA, XK).
    • Data sources updated to JRC ARDECO (https://urban.jrc.ec.europa.eu/ardeco/), as the dataset is harmonised with NUTS3 2021 and still includes UK regions as well as RS and CH.
    • build_gdp_pop_non_nuts3: Previously needed for MD and UA; now integrated into build_shapes to create standardised data at the very beginning for all four non-NUTS3 countries. For this purpose, the cutouts of the datasets GDP_per_capita_PPP_1990_2015_v2.nc and ppp_2019_1km_Aggregated.tif have been updated.
    • Population and GDP now refer to the year 2019.
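
The grouping of NUTS3 shapes into coarser levels relies on the hierarchical structure of NUTS codes: the first two characters identify the country (level 0), and each further character adds one level, down to the five-character NUTS3 code. A minimal sketch of aggregating NUTS3 attributes to a coarser level follows. The function names (`nuts_parent`, `aggregate_to_level`), the field names (`pop`, `gdp`) and all numbers are assumptions for illustration and are not taken from this PR.

```python
from collections import defaultdict


def nuts_parent(nuts3_id: str, level: int) -> str:
    """Return the NUTS code at `level` (0-3) that contains a NUTS3 region."""
    if level not in (0, 1, 2, 3):
        raise ValueError("NUTS level must be 0, 1, 2 or 3")
    if len(nuts3_id) != 5:
        raise ValueError(f"expected a 5-character NUTS3 code, got {nuts3_id!r}")
    # Level 0 is the 2-letter country prefix; each level adds one character.
    return nuts3_id[: 2 + level]


def aggregate_to_level(regions: dict, level: int) -> dict:
    """Sum population and total GDP per parent region, then derive GDP p.c."""
    totals = defaultdict(lambda: {"pop": 0.0, "gdp": 0.0})
    for nuts3_id, attrs in regions.items():
        parent = nuts_parent(nuts3_id, level)
        totals[parent]["pop"] += attrs["pop"]
        totals[parent]["gdp"] += attrs["gdp"]
    return {
        code: {**vals, "gdp_pc": vals["gdp"] / vals["pop"]}
        for code, vals in totals.items()
    }


# Made-up values for illustration only
# (population in thousands, total GDP in million EUR).
nuts3 = {
    "DE111": {"pop": 600.0, "gdp": 50_000.0},
    "DE112": {"pop": 400.0, "gdp": 20_000.0},
    "DE211": {"pop": 100.0, "gdp": 5_000.0},
}
nuts1 = aggregate_to_level(nuts3, level=1)
```

GDP per capita is derived from the summed totals rather than averaged across regions, so the result is implicitly population-weighted.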

image

Open to dos:

  • Update GDP_per_capita_PPP_1990_2015_v2.nc and ppp_2019_1km_Aggregated.tif in the databundle (--> @fneum)
  • Remove sandbox data and rule after updating databundle
  • Points in checklist
  • Quick comparison of previous model runs vs. new ones. Distributions largely remain the same.

Checklist

  • I tested my contribution locally and it works as intended.
  • Code and workflow changes are sufficiently documented.
  • Changed dependencies are added to envs/environment.yaml.
  • Sources of newly added data are documented in doc/data_sources.rst.
  • A release note doc/release_notes.rst is added.

Validation/Comparison

Country level

Note that there can be small differences because regional borders "move" with the switch from Natural Earth country borders to Eurostat NUTS3 01M (and OSM).
image

Nodal load distribution

For country plots see: #1479 (comment)
Explanation for changes:

  • Changes can be due to NUTS borders shifting, changes in GDP p.c. and/or population from 2014 to 2019.
  • Further, countries like AL, BA, RS and XK were previously only assigned a load at the national level.
  • For UA and MD, load is now already determined in build_shapes and mapped at ADM1 level, as opposed to the onshore-regions level in build_gdp_pop_non_nuts3 previously. The latter had a "pseudo" higher resolution: while the number of regions was higher, the underlying dataset for population and GDP was at lower resolution.
    image
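
To illustrate how a national load can be mapped to regions from GDP and population, here is a minimal sketch. It assumes a distribution key built from normalised GDP and population shares. The 0.6/0.4 weights, the function names (`normed`, `distribute_load`), the region names and all numbers are assumptions for illustration, not values taken from this PR.

```python
def normed(values: dict) -> dict:
    """Scale values so that they sum to one."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("values must sum to a positive number")
    return {k: v / total for k, v in values.items()}


def distribute_load(national_load, gdp, pop, w_gdp=0.6, w_pop=0.4):
    """Split a national load across regions using a GDP/population key."""
    gdp_share, pop_share = normed(gdp), normed(pop)
    key = normed({r: w_gdp * gdp_share[r] + w_pop * pop_share[r] for r in gdp})
    return {r: national_load * share for r, share in key.items()}


# Hypothetical ADM1 regions with made-up total GDP and population.
gdp = {"region_a": 30.0, "region_b": 10.0}
pop = {"region_a": 1.0, "region_b": 1.0}
regional_load = distribute_load(100.0, gdp, pop)
```

Because the key is normalised, the regional loads always sum to the national load. A shift in a region's GDP or population share between 2014 and 2019 therefore moves load between regions without changing the country total.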

Biomass distribution (for 128 buses)

[4 images]

@bobbyxng bobbyxng requested a review from fneum January 3, 2025 16:05
@bobbyxng bobbyxng self-assigned this Jan 3, 2025
@bobbyxng bobbyxng requested a review from Irieo as a code owner January 3, 2025 16:05
Contributor

github-actions bot commented Jan 3, 2025

Validator Report

I am the Validator. Download all artifacts here.
I'll be back and edit this comment for each new commit.

❗ Run failed!

Download 'logs' artifact to see more details.

  • master failed in: no_logs_found
  • shapes failed in: no_logs_found

Model Metrics

Benchmarks: images not available.

Comparing shapes (a8f9a1b) with master (b6b18ad).
Branch is 42 commits ahead and 0 commits behind.
Last updated on 2025-01-17 14:55:42 CET.

@bobbyxng
Contributor Author

@lkstrp @finozzifa I removed the test_country_cover part from unit_testing (#1466), as countries() no longer exists in this PR. The proposed workflow for building shapes is now based entirely on NUTS3 shapes (grouped to different levels to allow for regional clustering in a later PR).

@fneum fneum added this to the v0.14.0 milestone Jan 15, 2025
@bobbyxng
Contributor Author

[36 images]

@bobbyxng
Contributor Author

@fneum @Irieo: Done with the comparisons. As a remaining to-do, GDP_per_capita_PPP_1990_2015_v2.nc and ppp_2019_1km_Aggregated.tif in the data bundle need to be updated (see sandbox URL), and the sandbox code needs to be removed/adapted accordingly.

@FabianHofmann
Contributor

Impressive work! About the unit tests, do you think it would make sense to add new ones for the new functions? Perhaps our unit test master @finozzifa could help out?
(For background, I would love to come to the point where we can quickly detect deprecation and ease bug fixing; it would make the overall code base more stable)

@bobbyxng
Contributor Author

> Impressive work! About the unit tests, do you think it would make sense to add new ones for the new functions? Perhaps our unit test master @finozzifa could help out? (For background, I would love to come to the point where we can quickly detect deprecation and ease bug fixing; it would make the overall code base more stable)

Thanks @FabianHofmann! Some of the functions are quite specific to the dataset, so unit tests would only make sense if we broke the functions down further. Even then, we would need to generate dummy data for testing purposes. While I generally support any move towards detecting deprecations and bugs quickly, I am not sure this specific case is a good example for showing the benefits. Open to discussing this further :)

@finozzifa
Collaborator

hey @FabianHofmann and @bobbyxng,

I believe that we can unit test functions even if we do not refactor them immediately. In other words, unit testing and code refactoring can happen at different stages. My feeling is that it is safer to write unit tests for new or existing functions while developing new code (as we did, for example, at PyPSA/technology-data#160).

I am of course very happy to support :)

@FabianHofmann
Contributor

You are right @finozzifa, perhaps we can say it would be a very-nice-to-have for this PR, but should not be a blocker. Up to @bobbyxng, I would say :)
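
As an illustration of the kind of small, dataset-independent unit test discussed in this thread, here is a minimal sketch. It assumes a hypothetical helper, `nuts_country_to_iso2`, that is not part of this PR, and it uses the standard-library `unittest` module only to stay self-contained. The one fact it relies on is that Eurostat NUTS codes use "EL" and "UK" where ISO 3166-1 alpha-2 uses "GR" and "GB".

```python
import unittest

# Eurostat prefixes that differ from ISO 3166-1 alpha-2 country codes.
EUROSTAT_TO_ISO2 = {"EL": "GR", "UK": "GB"}


def nuts_country_to_iso2(nuts_id: str) -> str:
    """Hypothetical helper: ISO 3166-1 alpha-2 country code for a NUTS ID."""
    prefix = nuts_id[:2].upper()
    return EUROSTAT_TO_ISO2.get(prefix, prefix)


class TestNutsCountryToIso2(unittest.TestCase):
    def test_eurostat_exceptions_are_mapped(self):
        self.assertEqual(nuts_country_to_iso2("EL301"), "GR")
        self.assertEqual(nuts_country_to_iso2("UKI31"), "GB")

    def test_other_codes_pass_through(self):
        self.assertEqual(nuts_country_to_iso2("DE111"), "DE")
```

Tests of this shape need no dummy shapefiles, since they only exercise string handling. They can be run with `python -m unittest` or collected by pytest.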

4 participants