Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/transforms speedup #186

Merged
merged 32 commits into from
Feb 15, 2024
Merged

Feature/transforms speedup #186

merged 32 commits into from
Feb 15, 2024

Conversation

SamRWest
Copy link
Collaborator

@SamRWest SamRWest commented Feb 15, 2024

Speeds up two slow parts of the generate_commodity_groups transform by replacing nested for loops with vectorised groupby calls.

Also adds unit tests for the added functions and small test data files (from austimes) for verification.
Pytest runs and coverage reporting are now integrated into CI as well.

CSV-only regression test results show decent overall speedup:

Total runtime: 286.65s (main: 310.07s)
Change in runtime (negative == faster): -23.42s (-7.6%)
Change in correct rows (higher == better): +469 (+0.6%)
Change in additional rows: -12 (-0.0%)

But not sure whether the additional row metric increasing or decreasing is good, can you advise?

Update: full GAMS CI run here didn't show any change to correct/additional rows :)

Total runtime: 293.08s (main: 314.38s)
Change in runtime (negative == faster): -21.30s (-6.8%)
Change in correct rows (higher == better): +0 (+0.0%)
Change in additional rows: +0 (+0.0%)

…d of subprocesses (which use way too much RAM)

Added pyarrow to suppress pandas future warning
skip gams install if no license
…ows' into feature/benchmarks_on_windows

# Conflicts:
#	.gitignore
#	README.md
removed old looped function now vectorised outputs are verified as identical
xl2times/transforms.py Outdated Show resolved Hide resolved
xl2times/transforms.py Outdated Show resolved Hide resolved
xl2times/transforms.py Outdated Show resolved Hide resolved
xl2times/transforms.py Outdated Show resolved Hide resolved
xl2times/transforms.py Outdated Show resolved Hide resolved
@olejandro
Copy link
Member

olejandro commented Feb 15, 2024

Thanks @SamRWest, looks great! I left some minor comments. I guess, we'll need to merge the benchmarks PR first?

@olejandro
Copy link
Member

But not sure whether the additional row metric increasing or decreasing is good, can you advise?

It really depends. :-) Some of them could be additional, because casing is not the same as in the benchmarks - doesn't matter for GAMS; some need extra work - e.g. a dimension is inccorrect; some should have not been generated...

@SamRWest SamRWest enabled auto-merge (squash) February 15, 2024 02:32
@SamRWest
Copy link
Collaborator Author

Thanks @SamRWest, looks great! I left some minor comments. I guess, we'll need to merge the benchmarks PR first?

Yep. Looks like they're both ready to merge now, just need a couple of approvals.
But yes, do #184 first please.

@olejandro
Copy link
Member

@siddharth-krishna please go ahead merging if you don't have any further comments.
It is fun collaborating across the time zones!

Copy link
Collaborator

@siddharth-krishna siddharth-krishna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Go ahead with the merge when ready. :shipit:

README.md Show resolved Hide resolved
utils/run_benchmarks.py Show resolved Hide resolved
fixed bug in git code when no remote is called 'origin'
@SamRWest SamRWest merged commit 9aa7001 into main Feb 15, 2024
1 check passed
@SamRWest SamRWest deleted the feature/transforms_speedup branch February 15, 2024 05:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants