Extend application of defaults beyond FI_T tables #240

olejandro · 2024-11-11T18:19:23Z

No description provided.

olejandro · 2024-11-16T01:35:37Z

@siddharth-krishna any chance you could try this locally, e.g. on Ireland? I am surprised CI shows no difference.

siddharth-krishna · 2024-11-17T10:50:19Z

Oh good catch! Looks like there's something wrong with the CI... I'll investigate!

       Benchmark    Time (s)       GDX Diff     Accuracy       Correct    Additional
----------------  ----------  -------------  -----------  ------------  ------------
   DemoS_001-all   4.5   4.8  --         --  100.0 100.0    121    121      3      3
   DemoS_002-all   5.1   5.0  --         --  100.0 100.0    347    347      3      3
   DemoS_003-all   5.4   5.5  --         --  100.0 100.0    639    639      3      3
       DemoS_004   5.5   5.9  --         --  100.0 100.0    674    674      3      3
      DemoS_004a   5.8   5.9  --         --  100.0 100.0    677    677      3      3
      DemoS_004b   5.6   5.7  --         --  100.0 100.0    677    677      3      3
   DemoS_004-all   5.7   5.4  --         --  100.0 100.0    679    679      3      3
   DemoS_005-all   6.5   5.7  --         --   99.7  99.7   1175   1175      6      6
   DemoS_006-all   7.1   7.3  --         --   99.7  99.7   1273   1273      6      6
   DemoS_007-all   7.6   7.8  --         --   99.8  99.8   2171   2171      6      6
DemoS_007-all-1r   6.8   7.0  --         --   99.8  99.8   1195   1195      3      3
   DemoS_008-all   9.6   9.3  --         --   99.9  99.9   5364   5364      6      6
   DemoS_009-all  10.2  10.1  --         --   99.9  99.9   5838   5838     32     32
   DemoS_010-all  10.4  10.6  --         --   88.4  88.4   6170   6170     32     32
   DemoS_011-all  10.3  10.0  --         --   88.1  88.1   6181   6181     32     32
   DemoS_012-all  10.5  10.5  --         --   88.2  88.2   6349   6349     32     32
         Ireland  23.7  20.5  --         --   92.3  92.9  37789  38013   3216   4935
Total runtime: 136.91s (main: 140.23s)
Change in runtime (negative == faster): -3.31s (-2.4%)
Change in correct rows (higher == better): +224 (+0.3%)
Change in additional rows: +1719 (+50.7%)
additional rows regressed on: Ireland

olejandro · 2024-11-24T23:01:01Z

#250 should make it easy to reduce the number of additional rows on Ireland.

olejandro · 2024-11-25T17:19:48Z

Using DataModule functionality, a lot of additional rows are now eliminated here @siddharth-krishna. :-)
The number of correct rows is reduced as well, which may be due to the order in which the records appear. I'll look into it to see whether to address it as a separate PR or directly here.
Btw, I suspect resolving #246 would have a profound impact on ACT_EFF for Ireland. ;-)

siddharth-krishna · 2024-11-29T07:15:49Z

Great news, Olex, regarding DataModule 🚀

I pushed my quick fix for #246, which is to fix a precision of .8g when converting data & ground truth tables to string, prior to conversion. This removes all the additional rows in this PR!

(There are 2 missing rows from COM_PROJ in Demos 9-12 but perhaps that's what you were talking about w.r.t. order of rows?)

Do you have an idea of what a good precision is for outputting / comparing to ground truth?

olejandro · 2024-11-29T14:47:52Z

Yei!! 🚀

Do you have an idea of what a good precision is for outputting / comparing to ground truth?

Not really... Maybe we could use a trial and error approach to set it as high as possible?

(There are 2 missing rows from COM_PROJ in Demos 9-12 but perhaps that's what you were talking about w.r.t. order of rows?)

Could be! I'll check it out.

olejandro · 2024-11-29T14:54:27Z

@siddharth-krishna, somehow there seems to be a issue in the gdx comparison. Could you check?

An attempted fix for #246, which was first noticed in #240. With some [trial and error](#240 (comment)) I set a precision of `.10g` (10 significant figures in exponential/scientific notation) for floating point numbers. I also had to change the `gdxdiff` options to additionally use a tolerance in relative difference when comparing the output DD files to the ground truth, otherwise the higher precision output above increased the DD diff. We now use `1e-6` for both absolute `Eps` and relative `RelEps` difference tolerance. If I understand correctly how `gdxdiff` [works](https://www.gams.com/latest/docs/T_GDXDIFF.html#GDXDIFF_OPTIONS_CRITERIONEXPLANATION), this feels like a reasonable tolerance: ``` AbsDiff := Abs(V1 - V2); if AbsDiff <= EpsAbsolute then Result := true else if EpsRelative > 0.0 then Result := AbsDiff / (1.0 + DMin(Abs(V1), Abs(V2))) <= EpsRelative else Result := false; ``` Finally, we were reporting the number of lines in the output of `gdxdiff`, but that's only the number of sets/parameters that are different, and doesn't track improvements in number of rows different. I'm now passing the output of `gdxdiff` to `gdxdump` so that the "GDX Diff" column is more accurate and tracks number of rows in the diff.

olejandro · 2024-12-07T19:56:58Z

Looks like #251 would push this over the line!

This PR adjusts the processing order of tfm tables to bring it closer to Veda and adds `model.data_modules` to enable #240 . Contribute to #41

olejandro · 2024-12-14T09:50:29Z

Looks like at least 4 additional records need #255 to be counted as correct. :-)

olejandro · 2024-12-14T21:57:44Z

@siddharth-krishna, this is now ready for a review!

siddharth-krishna · 2024-12-17T17:45:47Z

xl2times/transforms.py

+        "module_name",
+    ]
+    df.dropna(subset="value", inplace=True)
+    drop_cols = [col for col in df.columns if col != "value" and col not in keep_cols]


keep_cols should be a set so that the not in check is fast. We can use list(keep_cols) below if drop_duplicates requires a list, and do an O(n) conversion to a list once, instead of an O(n) list membership check for each column here.

siddharth-krishna · 2024-12-17T17:46:18Z

xl2times/datatypes.py

@@ -266,6 +267,20 @@ def model_years(self) -> set[int]:
        """
        return self.past_years | set(self.time_periods["m"].values)

+    @cached_property
+    def veda_cgs(self) -> dict[tuple[str, str, str], str]:
+        """veda_cgs is a dictionary mapping commodities to their Veda commodity groups."""


Suggested change

"""veda_cgs is a dictionary mapping commodities to their Veda commodity groups."""

"""A dictionary mapping commodities to their Veda commodity groups."""

Just double checking: we don't expect this value to change when other properties/fields of the dataclass changes? Maybe we should add a TODO to check when this cached property is invalidated and have a way to clear the cache / recompute this. I'm wondering if that will be useful once we allow people to work with TimesModel objects directly in e.g. Jupyter Notebooks?

Sounds like a good idea!

olejandro added 7 commits November 11, 2024 13:17

Move _populate_defaults out of apply_fixups

71b8491

Create fill_defaults_in_transform_tables

b3f5105

Include fill_defaults_in_transform_tables in __main__.py

3eea791

Extend the list of transform tables

eca92eb

Refactor some parts to expose the logic

b4f75e9

Merge branch 'main' into olex/extend-defaults

a5dfea8

Explode other_indexes as well

b0864dc

olejandro marked this pull request as ready for review November 14, 2024 12:31

olejandro requested a review from siddharth-krishna November 14, 2024 12:31

olejandro added 5 commits November 15, 2024 07:08

Merge branch 'main' into olex/extend-defaults

1a20d34

Move veda_cgs calculation to model property

66aece7

Extend _populate_calculated_defaults to tfm tables

aa6dc26

Move things around

ab05a4d

Move things around again

ea6f02f

olejandro added 2 commits November 15, 2024 20:59

Clean-up

1006245

Make minor fixes

b14fbbe

Merge branch 'main' into olex/extend-defaults

64b1240

siddharth-krishna mentioned this pull request Nov 22, 2024

CI: Fix regression test not using main branch code #248

Merged

Merge branch 'main' into olex/extend-defaults

5e9db3e

olejandro added 3 commits November 25, 2024 04:26

Merge branch 'main' into olex/extend-defaults

9829090

Merge branch 'main' into olex/extend-defaults

64e3c25

Eliminate additional rows

59a0eec

Convert to string with a fixed float precision

aea2790

olejandro mentioned this pull request Dec 5, 2024

Adjust the processing order of tfm tables #251

Merged

siddharth-krishna mentioned this pull request Dec 7, 2024

Set output float precision & relative epsilon for DD comparison #252

Merged

Merge branch 'main' into olex/extend-defaults

4ebadce

olejandro added a commit that referenced this pull request Dec 14, 2024

Adjust the processing order of tfm tables (#251)

4323d60

This PR adjusts the processing order of tfm tables to bring it closer to Veda and adds `model.data_modules` to enable #240 . Contribute to #41

olejandro added 2 commits December 14, 2024 09:07

Merge branch 'main' into olex/extend-defaults

7abfa93

Fix naming post merge

ca46410

olejandro added 2 commits December 14, 2024 17:09

Query on limtype

76a30ad

Merge branch 'main' into olex/extend-defaults

321aebd

siddharth-krishna approved these changes Dec 17, 2024

View reviewed changes

olejandro added 2 commits December 17, 2024 13:05

Address review comments

85c82ca

Move around a comment

691359e

olejandro merged commit 2272b9c into main Dec 17, 2024
2 checks passed

olejandro deleted the olex/extend-defaults branch December 17, 2024 18:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Extend application of defaults beyond FI_T tables #240

Extend application of defaults beyond FI_T tables #240

olejandro commented Nov 11, 2024

olejandro commented Nov 16, 2024 •

edited

Loading

siddharth-krishna commented Nov 17, 2024

olejandro commented Nov 24, 2024 •

edited

Loading

olejandro commented Nov 25, 2024 •

edited

Loading

siddharth-krishna commented Nov 29, 2024

olejandro commented Nov 29, 2024

olejandro commented Nov 29, 2024

olejandro commented Dec 7, 2024

olejandro commented Dec 14, 2024

olejandro commented Dec 14, 2024

siddharth-krishna Dec 17, 2024

olejandro Dec 17, 2024

siddharth-krishna Dec 17, 2024

olejandro Dec 17, 2024

	"""veda_cgs is a dictionary mapping commodities to their Veda commodity groups."""
	"""A dictionary mapping commodities to their Veda commodity groups."""

Extend application of defaults beyond FI_T tables #240

Extend application of defaults beyond FI_T tables #240

Conversation

olejandro commented Nov 11, 2024

olejandro commented Nov 16, 2024 • edited Loading

siddharth-krishna commented Nov 17, 2024

olejandro commented Nov 24, 2024 • edited Loading

olejandro commented Nov 25, 2024 • edited Loading

siddharth-krishna commented Nov 29, 2024

olejandro commented Nov 29, 2024

olejandro commented Nov 29, 2024

olejandro commented Dec 7, 2024

olejandro commented Dec 14, 2024

olejandro commented Dec 14, 2024

siddharth-krishna Dec 17, 2024

Choose a reason for hiding this comment

olejandro Dec 17, 2024

Choose a reason for hiding this comment

siddharth-krishna Dec 17, 2024

Choose a reason for hiding this comment

olejandro Dec 17, 2024

Choose a reason for hiding this comment

olejandro commented Nov 16, 2024 •

edited

Loading

olejandro commented Nov 24, 2024 •

edited

Loading

olejandro commented Nov 25, 2024 •

edited

Loading