Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create trade links #167

Merged
merged 15 commits into from
Feb 9, 2024
Merged

Create trade links #167

merged 15 commits into from
Feb 9, 2024

Conversation

olejandro
Copy link
Member

No description provided.

@olejandro
Copy link
Member Author

@siddharth-krishna, any chance you could take a look at the changed added transforms?

raise ValueError(f"TFM_INS-TS table already has Year column: {table}")
# TODO: can we remove this hacky shortcut? Or should it be also applied to the AT variant?
if set(df.columns) & query_columns == {"cset_cn"} and has_no_wildcards(
df["cset_cn"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@olejandro do you understand why we have this condition? Unless there's a compelling performance reason, it's probably better to have one general way of handling all TFM_INS-TS tables.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's handle this in a subsequent PR.

for table in tables:
if table.tag == datatypes.Tag.tradelinks:
df = table.dataframe
comm = df.columns[0]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure that the first column is always what we want? The danger of this kind of integer indexing is silent failures if someone adds another transform in the middle that changes column orders. Perhaps better to fetch column by name, which will raise an understandable error if that column isn't present.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, yes. We have to rely on the position to retrieve the column name. We could add a check later that the name is on the commodities list.

df = table.dataframe
comm = df.columns[0]
destinations = list(df.columns).remove(comm)
df.rename(columns={comm: "origin"}, inplace=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should try to avoid using the inplace keyword whenever possible (see this proposal for more details).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the commodity is renamed to origin here, but later in the code origin becomes reg1 which sounds like a region? I'm confused.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we should address the inplace issue systematically then? Maybe through a subsequent PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the renaming issue, it is due to the table format. What we have is like this:

elc reg1 reg 2
REG1 1
REG2 1

And what we want to get is this:

origin destination comm1 comm 2 comm process tradelink
REG1 REG2 ELC ELC ELC TB_ELC_REG1_REG2_01 b

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@olejandro : Do you know what is the purpose of the third comm? I have always wondered about that...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see! I misread the comm = df.columns[0] as though comm was the entire column, not just the column label. So this essentially turns an adjacency matrix for one type of edge into an adjacency list with edge labels.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@olejandro : Do you know what is the purpose of the third comm? I have always wondered about that...

No... Tbh, I was planning to ask you. :-)

xl2times/transforms.py Outdated Show resolved Hide resolved
model: datatypes.TimesModel,
) -> List[datatypes.EmbeddedXlTable]:
"""
Transform tradelinks table
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Transform tradelinks table
Transform tradelinks table to tradelinks_dins

df.drop(columns=["tradelink"], inplace=True)
df = df.merge(td_type, how="inner", on="regions")
# Drop tradelink (bidirectional) duplicates
df.drop_duplicates(subset=["regions"], keep="last", inplace=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's funny that we drop bidirectional links here but reconstruct them in generate_trade :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's because we want to get rid of the same bidirectional tradelinks that have different process names. Otherwise we would get 4 tradelinks in generate_trade instead of 2. :-)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's only if you put a b in the tradelinks column. :)

I was wondering if it's possible to simplify the code and leave both directions of links here, if it allows removing the code in generate_trade that "doubles" the rows with a b?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, if we don't put b in the column, the links will be processed as unidirectional. :-) This may have an impact on how some other attributes / constraints are applied.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if it's possible to simplify the code and leave both directions of links here, if it allows removing the code in generate_trade that "doubles" the rows with a b?

Not that I see...

td_type = df.groupby(["regions"])["tradelink"].agg("count").reset_index()
td_type.replace({"tradelink": {1: "u", 2: "b"}}, inplace=True)
df.drop(columns=["tradelink"], inplace=True)
df = df.merge(td_type, how="inner", on="regions")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code block above looks good but I do wonder if this sort of thing (melt + count links + create new DF) would actually be faster in plain Python -- it might also be more understandable! It's okay for now I think but I'd be curious to test it sometime. :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we have other instances of code blocks like this. Maybe worthwhile investigating!

Co-authored-by: Siddharth Krishna <[email protected]>
@olejandro
Copy link
Member Author

@Antti-L, any chance you could share your opinion on this PR? Namely, currently, the changes don't take into account the sheetname a tradelinks table is found on. This leads to e.g. two unidirectional tradelinks specified in opposite directions in uni sheet to be interpreted as one bidirectional tradelink. Would you find it important to keep those tradelinks unidirectional?

@Antti-L
Copy link

Antti-L commented Feb 8, 2024

Would you find it important to keep those tradelinks unidirectional?

I think it is important if one wants that with xl2times one should be able to reproduce a model that has been built under VEDA. I think that was indeed originally the idea, even though options with more flexibility / better functionality could be also supported by xl2times. But if you would treat two unidirectional links as a single bi-directional link, the model would be different, and any process parameters defined for either would become erroneous, no? The model solution would most likely be different and possibly notably incorrect.

BTW: In trade link matrices one should also be able to specify the name of the process in each cell having a link. I am using that heavily in my models. Is that supported already by Xl2times?

@Antti-L
Copy link

Antti-L commented Feb 8, 2024

I just happened to see this statement in the code (part "Do not create trade links for not included regions"):
filter_regions = model.internal_regions.union({"IMPEXP", "MINRNW"})
If this prevents creating trade links with any other external regions, that is incorrect. Only links connecting one external region to another external region may well be excluded.

@olejandro
Copy link
Member Author

I just happened to see this statement in the code (part "Do not create trade links for not included regions"):
filter_regions = model.internal_regions.union({"IMPEXP", "MINRNW"})
If this prevents creating trade links with any other external regions, that is incorrect. Only links connecting one external region to another external region may well be excluded.

Thanks @Antti-L! This does not yet take into account #162. Only when regions are specifically filtered (e.g. similar to specifying regions to include in Veda case manager).

@olejandro
Copy link
Member Author

olejandro commented Feb 8, 2024

BTW: In trade link matrices one should also be able to specify the name of the process in each cell having a link. I am using that heavily in my models. Is that supported already by Xl2times?

Not yet, but it definitely could be. I would probably open a separate issue to clarify all the details though.

@olejandro
Copy link
Member Author

I think it is important if one wants that with xl2times one should be able to reproduce a model that has been built under VEDA. I think that was indeed originally the idea, even though options with more flexibility / better functionality could be also supported by xl2times. But if you would treat two unidirectional links as a single bi-directional link, the model would be different, and any process parameters defined for either would become erroneous, no? The model solution would most likely be different and possibly notably incorrect.

Among the benchmarks available to us, it is only an issue for TIMES-NZ, but you are right...

@olejandro olejandro marked this pull request as ready for review February 9, 2024 03:47
@olejandro
Copy link
Member Author

This doesn't yet allow generating PRC_ACTUNT and PRC_TSL for trade processes created based on tradelinks.
@Antti-L, I would appreciate another advice from you here:

  • should PRC_TSL default to daynite if commodity of type ELC is in TOP_IRE for the process? Should it remain blank otherwise?
  • for CG in PRC_ACTUNT, should it be determined based on DEM, MAT, NRG, ENV, FIN order?
  • for unit in PRC_ACTUNT, should it be based on the commodity unit?

@Antti-L
Copy link

Antti-L commented Feb 9, 2024

should PRC_TSL default to daynite if commodity of type ELC is in TOP_IRE for the process? Should it remain blank otherwise

I am not sure how it is defined in VEDA, if that's what you ask. I have always hoped for a FI_Process table for the trade processes as well, where I would be able to define the PCG and TSLVL, process subtype, units and vintaging. But no, KanOrs has thus far declined it. I suggest you test it under VEDA to see how VEDA works. Possibly the level is taken as the finest among the commodities traded?

for CG in PRC_ACTUNT, should it be determined based on DEM, MAT, NRG, ENV, FIN order?

Again, I am not sure how it is defined in VEDA, if there are several commodity types traded. But I think you could assume that they are all of the same type (I have never used mixed types in trade processes myself). So you could just take the first commodity type of the traded commodities, or if you like, use some priority order as deemed best or based on VEDA testing.

for unit in PRC_ACTUNT, should it be based on the commodity unit?

I guess that is indeed what VEDA does.

@olejandro olejandro merged commit 8d066d0 into main Feb 9, 2024
1 check passed
@olejandro olejandro deleted the create-trade-links branch February 9, 2024 22:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants