Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend application of defaults beyond FI_T tables #240

Merged
merged 27 commits into from
Dec 17, 2024
Merged

Conversation

olejandro
Copy link
Member

No description provided.

@olejandro olejandro marked this pull request as ready for review November 14, 2024 12:31
@olejandro
Copy link
Member Author

olejandro commented Nov 16, 2024

@siddharth-krishna any chance you could try this locally, e.g. on Ireland? I am surprised CI shows no difference.

@siddharth-krishna
Copy link
Collaborator

Oh good catch! Looks like there's something wrong with the CI... I'll investigate!

       Benchmark    Time (s)       GDX Diff     Accuracy       Correct    Additional
----------------  ----------  -------------  -----------  ------------  ------------
   DemoS_001-all   4.5   4.8  --         --  100.0 100.0    121    121      3      3
   DemoS_002-all   5.1   5.0  --         --  100.0 100.0    347    347      3      3
   DemoS_003-all   5.4   5.5  --         --  100.0 100.0    639    639      3      3
       DemoS_004   5.5   5.9  --         --  100.0 100.0    674    674      3      3
      DemoS_004a   5.8   5.9  --         --  100.0 100.0    677    677      3      3
      DemoS_004b   5.6   5.7  --         --  100.0 100.0    677    677      3      3
   DemoS_004-all   5.7   5.4  --         --  100.0 100.0    679    679      3      3
   DemoS_005-all   6.5   5.7  --         --   99.7  99.7   1175   1175      6      6
   DemoS_006-all   7.1   7.3  --         --   99.7  99.7   1273   1273      6      6
   DemoS_007-all   7.6   7.8  --         --   99.8  99.8   2171   2171      6      6
DemoS_007-all-1r   6.8   7.0  --         --   99.8  99.8   1195   1195      3      3
   DemoS_008-all   9.6   9.3  --         --   99.9  99.9   5364   5364      6      6
   DemoS_009-all  10.2  10.1  --         --   99.9  99.9   5838   5838     32     32
   DemoS_010-all  10.4  10.6  --         --   88.4  88.4   6170   6170     32     32
   DemoS_011-all  10.3  10.0  --         --   88.1  88.1   6181   6181     32     32
   DemoS_012-all  10.5  10.5  --         --   88.2  88.2   6349   6349     32     32
         Ireland  23.7  20.5  --         --   92.3  92.9  37789  38013   3216   4935
Total runtime: 136.91s (main: 140.23s)
Change in runtime (negative == faster): -3.31s (-2.4%)
Change in correct rows (higher == better): +224 (+0.3%)
Change in additional rows: +1719 (+50.7%)
additional rows regressed on: Ireland

@olejandro
Copy link
Member Author

olejandro commented Nov 24, 2024

#250 should make it easy to reduce the number of additional rows on Ireland.

@olejandro
Copy link
Member Author

olejandro commented Nov 25, 2024

Using DataModule functionality, a lot of additional rows are now eliminated here @siddharth-krishna. :-)
The number of correct rows is reduced as well, which may be due to the order in which the records appear. I'll look into it to see whether to address it as a separate PR or directly here.
Btw, I suspect resolving #246 would have a profound impact on ACT_EFF for Ireland. ;-)

@siddharth-krishna
Copy link
Collaborator

Great news, Olex, regarding DataModule 🚀

I pushed my quick fix for #246, which is to fix a precision of .8g when converting data & ground truth tables to string, prior to conversion. This removes all the additional rows in this PR!

(There are 2 missing rows from COM_PROJ in Demos 9-12 but perhaps that's what you were talking about w.r.t. order of rows?)

Do you have an idea of what a good precision is for outputting / comparing to ground truth?

@olejandro
Copy link
Member Author

Yei!! 🚀

Do you have an idea of what a good precision is for outputting / comparing to ground truth?

Not really... Maybe we could use a trial and error approach to set it as high as possible?

(There are 2 missing rows from COM_PROJ in Demos 9-12 but perhaps that's what you were talking about w.r.t. order of rows?)

Could be! I'll check it out.

@olejandro
Copy link
Member Author

@siddharth-krishna, somehow there seems to be a issue in the gdx comparison. Could you check?

olejandro pushed a commit that referenced this pull request Dec 7, 2024
An attempted fix for #246, which was first noticed in #240. With some
[trial and
error](#240 (comment))
I set a precision of `.10g` (10 significant figures in
exponential/scientific notation) for floating point numbers.

I also had to change the `gdxdiff` options to additionally use a
tolerance in relative difference when comparing the output DD files to
the ground truth, otherwise the higher precision output above increased
the DD diff. We now use `1e-6` for both absolute `Eps` and relative
`RelEps` difference tolerance. If I understand correctly how `gdxdiff`
[works](https://www.gams.com/latest/docs/T_GDXDIFF.html#GDXDIFF_OPTIONS_CRITERIONEXPLANATION),
this feels like a reasonable tolerance:
```
AbsDiff := Abs(V1 - V2);
if  AbsDiff <= EpsAbsolute
then
  Result := true
else
  if EpsRelative > 0.0
  then
     Result := AbsDiff / (1.0 + DMin(Abs(V1), Abs(V2))) <= EpsRelative
  else
     Result := false;
```

Finally, we were reporting the number of lines in the output of
`gdxdiff`, but that's only the number of sets/parameters that are
different, and doesn't track improvements in number of rows different.
I'm now passing the output of `gdxdiff` to `gdxdump` so that the "GDX
Diff" column is more accurate and tracks number of rows in the diff.
@olejandro
Copy link
Member Author

Looks like #251 would push this over the line!

olejandro added a commit that referenced this pull request Dec 14, 2024
This PR adjusts the processing order of tfm tables to bring it closer to Veda and adds `model.data_modules` to enable #240 .
Contribute to #41
@olejandro
Copy link
Member Author

Looks like at least 4 additional records need #255 to be counted as correct. :-)

@olejandro
Copy link
Member Author

@siddharth-krishna, this is now ready for a review!

"module_name",
]
df.dropna(subset="value", inplace=True)
drop_cols = [col for col in df.columns if col != "value" and col not in keep_cols]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep_cols should be a set so that the not in check is fast. We can use list(keep_cols) below if drop_duplicates requires a list, and do an O(n) conversion to a list once, instead of an O(n) list membership check for each column here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@@ -266,6 +267,20 @@ def model_years(self) -> set[int]:
"""
return self.past_years | set(self.time_periods["m"].values)

@cached_property
def veda_cgs(self) -> dict[tuple[str, str, str], str]:
"""veda_cgs is a dictionary mapping commodities to their Veda commodity groups."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""veda_cgs is a dictionary mapping commodities to their Veda commodity groups."""
"""A dictionary mapping commodities to their Veda commodity groups."""

Just double checking: we don't expect this value to change when other properties/fields of the dataclass changes? Maybe we should add a TODO to check when this cached property is invalidated and have a way to clear the cache / recompute this. I'm wondering if that will be useful once we allow people to work with TimesModel objects directly in e.g. Jupyter Notebooks?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like a good idea!

@olejandro olejandro merged commit 2272b9c into main Dec 17, 2024
2 checks passed
@olejandro olejandro deleted the olex/extend-defaults branch December 17, 2024 18:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants