Migrate existing local scenarios to derived datasets #2319

ChaelKruip · 2017-01-10T08:57:31Z

@AlexanderWirtz can you identify and list the important local scenarios in this issue please?

@grdw @ploh please reserve some time in your project to migrate the relevant local scenarios so that they are based on their own datasets rather than relying on the national one.

AlexanderWirtz · 2017-01-11T12:46:21Z

@ChaelKruip I will do my best to track these down asap.

I am curious how this 'migration' will work and what you mean by it. Please have a look at the questions below.

How will said migrated scenario refer to the 'dataset'? Will this be a local dataset or just an NL_2013 dataset?
Will all input statements that were used to 'tweak' things (all present and both input statements at the very least) be moved elsewhere to work their magic, or will they still be in the original scaled scenario 'user_values'?
Will people who base their scenario on such a (publicly available) scaled scenario use the same independent dataset?

grdw · 2017-01-11T13:55:37Z

How will said migrated scenario refer to the 'dataset'? Will this be a local dataset or just an NL_2013 dataset?

The area code for those scenarios can hopefully be set to the local dataset; however, this might depend on how customized all these local scenarios have become. If there's a single 'best' 'Groningen' or a single 'best' 'Ameland' then that would be great.

Will all input statements that were used to 'tweak' things (all present and both input statements at the very least) be moved elsewhere to work their magic, or will they still be in the original scaled scenario 'user_values'?

They can be set either from inside ETSource or through the user_values. Preferably they would be put in the .ad file that accompanies a local dataset.

Will people who base their scenario on such a (publicly available) scaled scenario use the same independent dataset?

How do you mean exactly?

ploh · 2017-01-11T15:12:53Z

Will people who base their scenario on such a (publicly available) scaled scenario use the same independent dataset?

Yes, they will! From a scenario's perspective a local/derived dataset is no different from a full dataset. In particular, it has to be created in ETSource (at least for stage 0), i.e. there will not be too many of them and they will not be created automatically when a scenario is created.

ploh · 2017-01-16T15:18:35Z

Things to keep in mind:

Beware of %y type inputs: since they depend on the end year, they must not occur in init. inputs.
Technology shares do not work with both: Update statements for BOTH don't (always?) work with shares and share groups #2324

ploh · 2017-01-26T13:48:53Z

From @AlexanderWirtz:
These are the scenarios on etengine staging that I would like to have migrated (for Paddepoel):
607975
607980
607984
607505

ploh · 2017-01-26T14:33:08Z

Talked to @ChaelKruip about making the older scaled scenarios (IABR, GEA, Ameland) independent of the nl dataset: we can create an nl2013 dataset. No objections 😄

jorisberkhout · 2017-01-26T14:51:45Z

We can create an nl2013 dataset. No objections 😄

My minor objection to creating an nl2013 dataset is that there is a flaw in the dataflow going from ETDataset to ETSource. This has to do with the fact that each nl dataset on ETDataset (i.e. the ones for 2011, 2012 and 2013) contains a file called nl.ad (obviously). Exporting one of the older datasets is not automated. Currently you have to export an older dataset to ETSource, rename the dataset folder to nl20xx, rename the corresponding nl.ad to nl20xx.ad and update the attribute area in this nl20xx.ad to nl20xx. I have forgotten the last step a number of times, leading to very frustrating debugging. As discussed this morning, we do not update the energy data of older datasets, but we do change the structure such that it is compatible with changes to the graph.
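The manual steps described above (renaming the folder, renaming the .ad file, updating the area attribute) could be guarded by a small consistency check. The following is only a sketch under stated assumptions: the function name `check_dataset_folder` is hypothetical, and it assumes the area attribute appears in the .ad file as a line of the form `- area = nl2012`, which may not match the real file format exactly.

```python
import re
from pathlib import Path

# ASSUMPTION: the area attribute is written as a line like "- area = nl2012".
AREA_LINE = re.compile(r"^-\s*area\s*=\s*(\S+)\s*$", re.MULTILINE)


def check_dataset_folder(folder: Path) -> list[str]:
    """Return a list of problems found in one dataset folder (empty if none).

    Checks that a folder named e.g. "nl2012" contains "nl2012.ad" and that
    the area attribute inside that file is also "nl2012".
    """
    key = folder.name
    ad_file = folder / f"{key}.ad"
    if not ad_file.is_file():
        return [f"{key}: expected {ad_file.name} in {folder}"]
    match = AREA_LINE.search(ad_file.read_text(encoding="utf-8"))
    if match is None:
        return [f"{key}: no 'area' attribute in {ad_file.name}"]
    if match.group(1) != key:
        return [f"{key}: area is '{match.group(1)}', expected '{key}'"]
    return []
```

Running such a check over every folder under datasets after an export would flag a forgotten rename before it turns into a debugging session.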

Long story short, would it be possible to come up with a more robust solution to maintaining older datasets on ETSource? Shooting from the hip I can imagine a structure like this:

datasets
|
-- nl
   |
   -- 2012
   -- 2013
   -- 2014

Where only the latest dataset can be selected from the front-end and older datasets are there to support derived datasets that rely on these older datasets.

What do you think, @ploh , @ChaelKruip , @antw ?

grdw · 2017-01-26T15:42:21Z

What do you think, @ploh , @ChaelKruip , @antw ?

If I can throw in my 2 cents: can't we solve this with git tags? This will obviously take some changes for ETEngine, i.e. if you select a different start year for NL, the correct git tag needs to correspond to the correct packed datafile 🤔 but it might be a cool project to do, though.

The reason I'm suggesting this is because all the files for an nl dataset will be the same and they'll persist. If the structure of the graph changes, for instance, then that wouldn't be noticeable in the old nl dataset because the git tag points to the correct commit in the git history.

So you'd have (much like you'd have now):

datasets
-- nl

Except you can do a git checkout tags/nl2012 or something like that.

It might take some people a lesson in advanced git, which is a downside to this approach.

antw · 2017-01-26T16:29:44Z

If I can throw in my 2 cents: can't we solve this with git tags? [...] If the structure of the graph changes, for instance, then that wouldn't be noticeable in the old nl dataset because the git tag points to the correct commit in the git history.

This sounds confusing to me. If the structure of the graph were to change, what would the workflow look like to update the NL2012 dataset?

grdw · 2017-01-26T16:46:33Z

This sounds confusing to me. If the structure of the graph were to change, what would the workflow look like to update the NL2012 dataset?

Good question. Isn't it an issue now that the graph 'does' change? If it should change then it's going to be annoying. Not impossible, but annoying (checking out the branch with the tag, updating the graph, moving the tag...).

In Joris's setup you don't have that problem.

antw · 2017-01-27T13:26:59Z

Shooting from the hip I can imagine a structure like this:
datasets
-- nl
   -- 2012
   -- 2013
   -- 2014

I quite like this, provided it is applied consistently to all datasets. i.e. the directory structure for datasets becomes: :dataset_key/:analysis_year. I imagine it is fairly easy to support this in the VBA scripts?

datasets
├ de
│ └ 2013
├ nl
│ ├ 2012
│ ├ 2013
│ └ 2014
└ uk
  └ 2013

In an ideal world, ETEngine's API would not differentiate between "nl" and "nl2012", but would instead take an area code and an optional start/analysis year, and would map that to the correct dataset in the backend.
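The mapping from an area code plus optional year to a dataset could look roughly like the sketch below. It is not ETEngine code: the function name `resolve_dataset` is hypothetical, it assumes the :dataset_key/:analysis_year directory layout proposed above, and falling back to the most recent year when none is given is an assumption rather than agreed behaviour.

```python
from pathlib import Path
from typing import Optional


def resolve_dataset(root: Path, area_code: str, year: Optional[int] = None) -> Path:
    """Map an area code and an optional analysis year to a dataset directory.

    ASSUMPTION: datasets live at <root>/<area_code>/<year>. When no year is
    given, the most recent year available for that area is used.
    """
    area_dir = root / area_code
    years = []
    if area_dir.is_dir():
        # Only purely numeric sub-directory names count as analysis years.
        years = sorted(
            int(p.name) for p in area_dir.iterdir() if p.is_dir() and p.name.isdigit()
        )
    if not years:
        raise LookupError(f"no datasets found for area '{area_code}'")
    chosen = years[-1] if year is None else year
    if chosen not in years:
        raise LookupError(f"no {chosen} dataset for area '{area_code}'")
    return area_dir / str(chosen)
```

With the example tree above, `resolve_dataset(root, "nl")` would point at nl/2014 and `resolve_dataset(root, "nl", 2012)` at nl/2012, so callers never need to know about keys such as "nl2012".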

jorisberkhout · 2017-01-27T13:42:25Z

I quite like this, provided it is applied consistently to all datasets. i.e. the directory structure for datasets becomes: :dataset_key/:analysis_year. I imagine it is fairly easy to support this in the VBA scripts?

I think no changes to the VBA scripts are required. On ETDataset, the very same directory structure already exists. The only thing that needs to be changed is the rake import task, which currently only exports the datasets defined in datasets.yml.

In an ideal world, ETEngine's API would not differentiate between "nl" and "nl2012", but would instead take an area code and an optional start/analysis year, and would map that to the correct dataset in the backend.

Love it! All in favour of this.

grdw · 2017-01-27T14:23:15Z

[..] provided it is applied consistently to all datasets.

I have one question: what if, for example, the onshore_suitable_for_wind is going to change for the nl dataset for all start years? Would you then need to update all the .ad files (so 2012/nl.ad, 2013/nl.ad, etc.) individually? Not that it is my concern at all, but I'm just wondering how that would be achieved. Would that be a VBA script in an Excel workbook just replacing all those values in all the loose .ad files in each start year folder?

If that is the case then this sounds fine to me. 👍

jorisberkhout · 2017-01-27T14:54:45Z

I have one question: what if, for example, the onshore_suitable_for_wind is going to change for the nl dataset for all start years? Would you then need to update all the .ad files (so 2012/nl.ad, 2013/nl.ad, etc.) individually?

This would never happen. As said before, we never update any data for old datasets (onshore_suitable_for_wind is data), but only maintain them to be in line with changes to the graph. For this, ETDataset, or more specifically analysis_manager.xlsm, has some nice features to make the user's life easier.

ploh · 2017-02-01T11:30:58Z

Note to self:

Remember the problems found when trial-migrating Gea.

ChaelKruip · 2017-02-02T08:24:37Z

@ploh what is the status here?

ploh · 2017-02-02T08:44:47Z

@ploh what is the status here?

Yesterday, I tried re-running my Gea trial-migration from last week - now that we have made the initializer inputs separate from the normal inputs. I ran into some trouble with non-existing init. inputs. I will try the same for a Paddepoel scenario next and see how hard it is to fix the potential problems there.

ploh · 2017-02-04T09:58:39Z

Update: Even before trying to migrate user_values to init. inputs, I encountered problems when trying to base Paddepoel scenario 607984 on a new derived dataset (without ETE scenario scaling) instead of on nl (with ETE scenario scaling): #2343

ploh · 2017-02-04T19:39:28Z

Like I described in #2343 (comment), there are some issues that should probably be addressed before the Paddepoel migration is continued.

@AlexanderWirtz @grdw @ChaelKruip Alternatively, you could just try to execute the migration rake task and inside of it tinker around with the user_values / init. inputs manually (as described in #2343 (comment)) until you think that the remaining gquery differences are small enough.
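Judging whether the remaining gquery differences are "small enough" could be supported by a simple comparison like the one sketched here. This is not part of the migration rake task: the function name `gquery_differences`, the idea of passing results as plain dictionaries of gquery key to numeric value, and the 1% relative tolerance are all assumptions made for illustration.

```python
import math


def gquery_differences(before: dict, after: dict, rel_tol: float = 0.01) -> dict:
    """Return {key: (before, after)} for gqueries that differ noticeably.

    ASSUMPTION: both arguments map gquery keys to numeric results. A key
    missing on one side is reported with None for the missing value.
    """
    diffs = {}
    for key in sorted(set(before) | set(after)):
        old, new = before.get(key), after.get(key)
        if old is None or new is None:
            diffs[key] = (old, new)
        # abs_tol keeps values that are both (nearly) zero from being flagged.
        elif not math.isclose(old, new, rel_tol=rel_tol, abs_tol=1e-9):
            diffs[key] = (old, new)
    return diffs
```

An empty result would mean the scenario on the derived dataset reproduces the scaled scenario within the chosen tolerance; anything left over shows which user_values or init. inputs still need attention.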

I am handing this over to you now. But I will gladly answer questions and explain my findings or the migration rake task in more detail.

AlexanderWirtz · 2017-06-15T08:04:15Z

This issue is now purely technical and has moved beyond its original scope. I cannot tell if it should remain open. Unassigning myself.

ChaelKruip assigned AlexanderWirtz, ploh, ChaelKruip and grdw Jan 10, 2017

grdw mentioned this issue Jan 11, 2017

Overview issue local datasets quintel/atlas#78

Closed

ploh changed the title ~~Make existing local scenario's independent of national dataset~~ Migrate existing local scenario's to derived datasets Jan 16, 2017

ploh changed the title ~~Migrate existing local scenario's to derived datasets~~ Migrate existing local scenarios to derived datasets Jan 16, 2017

This was referenced Feb 1, 2017

What needs to happen with the old %y update period initializer inputs in the adjust scaling folder. quintel/etsource#1229

Closed

Better support for old datasets quintel/etsource#1231

Closed

ploh mentioned this issue Feb 1, 2017

Derived datasets: Bug in migration of "present" user_values quintel/etengine#904

Closed

ploh mentioned this issue Feb 3, 2017

Possible bug in new local datasets scaling / persistence #2343

Closed

ploh removed their assignment Feb 4, 2017

AlexanderWirtz removed their assignment Jun 15, 2017

ChaelKruip closed this as completed Jun 15, 2017
