Merge branch 'main' into olex/update-benchmarks
Showing 15 changed files with 513 additions and 166 deletions.
@@ -72,6 +72,45 @@ git commit --no-verify

See our GitHub Actions CI `.github/workflows/ci.yml` and the utility script `utils/run_benchmarks.py` for how to run the tool on the DemoS models.

In short, use the commands below to clone the benchmarks data into your local `benchmarks` dir.
Note that this assumes you have access to all these repositories (some are private and
you'll have to request access). If not, comment out the inaccessible benchmarks in `benchmarks.yml` before running.

```bash
mkdir benchmarks
# Get VEDA example models and reference DD files
# XLSX files are in a private repo for licensing reasons; please request access or replace with your own licensed VEDA example files.
git clone git@github.com:olejandro/demos-xlsx.git benchmarks/xlsx/
git clone git@github.com:olejandro/demos-dd.git benchmarks/dd/

# Get Ireland model and reference DD files
git clone git@github.com:esma-cgep/tim.git benchmarks/xlsx/Ireland
git clone git@github.com:esma-cgep/tim-gams.git benchmarks/dd/Ireland
```
Then to run the benchmarks:
```bash
# Run only a single benchmark by name (see benchmarks.yml for name list)
python utils/run_benchmarks.py benchmarks.yml --verbose --run DemoS_001-all | tee out.txt

# Run all benchmarks (without GAMS run, just comparing CSV data)
python utils/run_benchmarks.py benchmarks.yml --verbose | tee out.txt

# Run benchmarks with regression tests vs main branch
git checkout -b feature/your_new_changes
# ... make your code changes here ...
git commit -a -m "your commit message" # code must be committed for the comparison to the `main` branch to run.
python utils/run_benchmarks.py benchmarks.yml --verbose | tee out.txt
```
At this point, if you haven't broken anything, you should see something like:
```
Change in runtime: +2.97s
Change in correct rows: +0
Change in additional rows: +0
No regressions. You're awesome!
```
If you have a large increase in runtime, a decrease in correct rows or fewer rows being produced, then you've broken something and will need to figure out how to fix it.
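
The runtime and correct-rows checks described above can also be applied to a saved `out.txt` mechanically. The following is only a minimal sketch, assuming the summary lines appear exactly in the format of the sample output; `parse_summary`, `find_problems` and the 60-second runtime threshold are hypothetical names and values, not part of `utils/run_benchmarks.py`:

```python
import re

# Assumption: summary lines look like "Change in runtime: +2.97s" or
# "Change in correct rows: +0", as in the sample output above.
SUMMARY_RE = re.compile(
    r"^Change in (runtime|correct rows|additional rows):\s*([+-]?\d+(?:\.\d+)?)s?$"
)


def parse_summary(text: str) -> dict[str, float]:
    """Collect the signed 'Change in ...' values from benchmark output text."""
    changes: dict[str, float] = {}
    for line in text.splitlines():
        match = SUMMARY_RE.match(line.strip())
        if match:
            changes[match.group(1)] = float(match.group(2))
    return changes


def find_problems(
    changes: dict[str, float], max_runtime_increase_s: float = 60.0
) -> list[str]:
    """Flag a large runtime increase or a drop in correct rows.

    The threshold is a hypothetical default; choose one that suits your machine.
    """
    problems = []
    if changes.get("runtime", 0.0) > max_runtime_increase_s:
        problems.append(f"runtime increased by {changes['runtime']:.2f}s")
    if changes.get("correct rows", 0.0) < 0:
        problems.append(f"correct rows changed by {changes['correct rows']:+.0f}")
    return problems


sample = """Change in runtime: +2.97s
Change in correct rows: +0
Change in additional rows: +0
No regressions. You're awesome!"""

changes = parse_summary(sample)
print(changes)
print(find_problems(changes) or "nothing flagged")
```

The change in additional rows is parsed but deliberately not judged here, since how to interpret it depends on the benchmark.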

### Debugging Regressions

If your change is causing regressions on one of the benchmarks, a useful way to debug and find the difference is to run the tool in verbose mode and compare the intermediate tables. For example, if your branch has regressions on Demo 1:

@@ -97,6 +136,7 @@ python -m build
python -m twine upload dist/*
```


## Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a

Binary file not shown.
Binary file not shown.
@@ -0,0 +1,67 @@
from datetime import datetime

import pandas as pd

from xl2times import transforms
from xl2times.transforms import (
    _process_comm_groups_vectorised,
    _count_comm_group_vectorised,
)

pd.set_option(
    "display.max_rows",
    20,
    "display.max_columns",
    20,
    "display.width",
    300,
    "display.max_colwidth",
    75,
    "display.precision",
    3,
)


class TestTransforms:
    def test_generate_commodity_groups(self):
        """
        Tests that the _count_comm_group_vectorised function works as expected.
        Full austimes run:
        Vectorised version took 0.021999 seconds
        looped version took 966.653371 seconds
        43958x speedup
        """
        # data extracted immediately before the original for loops
        comm_groups = pd.read_parquet(
            "tests/data/comm_groups_austimes_test_data.parquet"
        ).drop(columns=["commoditygroup"])

        # filter data so test runs faster
        comm_groups = comm_groups.query("region in ['ACT', 'NSW']")

        comm_groups2 = comm_groups.copy()
        _count_comm_group_vectorised(comm_groups2)
        assert comm_groups2.drop(columns=["commoditygroup"]).equals(comm_groups)
        assert comm_groups2.shape == (comm_groups.shape[0], comm_groups.shape[1] + 1)

    def test_default_pcg_vectorised(self):
        """Tests the default primary commodity group identification logic runs correctly.
        Full austimes run:
        Looped version took 1107.66 seconds
        Vectorised version took 62.85 seconds
        """

        # data extracted immediately before the original for loops
        comm_groups = pd.read_parquet("tests/data/austimes_pcg_test_data.parquet")

        comm_groups = comm_groups[(comm_groups["region"].isin(["ACT", "NT"]))]
        comm_groups2 = _process_comm_groups_vectorised(
            comm_groups.copy(), transforms.csets_ordered_for_pcg
        )
        assert comm_groups2 is not None and not comm_groups2.empty
        assert comm_groups2.shape == (comm_groups.shape[0], comm_groups.shape[1] + 1)
        assert comm_groups2.drop(columns=["DefaultVedaPCG"]).equals(comm_groups)


if __name__ == "__main__":
    TestTransforms().test_default_pcg_vectorised()
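
Note on the tests above: both check the same invariant, namely that the vectorised function leaves the existing columns untouched and adds exactly one new column. The sketch below illustrates that loop-versus-vectorised pattern on toy data. It is not the `xl2times` implementation: the functions `count_group_looped` and `count_group_vectorised`, the `group_size` column and the toy columns are all hypothetical, and the counting rule (rows per `region`/`process` pair) is an assumption chosen only for illustration.

```python
import pandas as pd


def count_group_looped(df: pd.DataFrame) -> pd.DataFrame:
    """Row-by-row baseline: count the rows sharing each row's (region, process) pair."""
    out = df.copy()
    counts = []
    for _, row in df.iterrows():
        mask = (df["region"] == row["region"]) & (df["process"] == row["process"])
        counts.append(int(mask.sum()))
    out["group_size"] = counts
    return out


def count_group_vectorised(df: pd.DataFrame) -> pd.DataFrame:
    """Same result in a single groupby/transform pass, with no Python-level loop."""
    out = df.copy()
    out["group_size"] = df.groupby(["region", "process"])["commodity"].transform("size")
    return out


# Hypothetical toy data; the real tests load parquet files from tests/data.
toy = pd.DataFrame(
    {
        "region": ["ACT", "ACT", "NSW", "NSW", "NSW"],
        "process": ["P1", "P1", "P1", "P2", "P2"],
        "commodity": ["ELC", "GAS", "ELC", "ELC", "COA"],
    }
)

looped = count_group_looped(toy)
vectorised = count_group_vectorised(toy)

# Same invariants as the tests: original columns unchanged, one column added.
assert vectorised.drop(columns=["group_size"]).equals(toy)
assert vectorised.shape == (toy.shape[0], toy.shape[1] + 1)
# The loop and the vectorised version agree row for row.
assert looped["group_size"].tolist() == vectorised["group_size"].tolist()
```

Comparing against a slow looped baseline on a filtered subset, as the tests do with a few regions, keeps the check fast while still guarding the vectorised rewrite.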
Empty file.