Feature/benchmarks on windows #184
Conversation
…d of subprocesses (which use way too much RAM) Added pyarrow to suppress pandas future warning
…nts in them work added traceable exit codes
…we're just missing a GAMS license.
skip gams install if no license
summary = run(parse_args(args))

# pack the results into a namedtuple pretending to be a return value from a subprocess call (as above).
res = namedtuple("stdout", ["stdout", "stderr", "returncode"])(summary, "", 0)
I've added a --debug switch (False by default, so no change unless set) to the run_benchmarks script that calls the functions directly instead of in subprocesses. This is because I wanted to hit breakpoints inside the benchmarks, which is impossible with a subprocess call.
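For readers skimming the thread, here is a minimal sketch of that pattern. The names below (run_one, ProcResult, the xl2times command string) are illustrative stand-ins rather than the script's actual layout; parse_args and run stand for the callables shown in the snippet above.

import subprocess
from collections import namedtuple

# Stand-ins for the parse_args/run callables shown in the snippet above.
def parse_args(args): ...
def run(parsed_args): ...

# Same fields that callers read off a subprocess result.
ProcResult = namedtuple("ProcResult", ["stdout", "stderr", "returncode"])

def run_one(tool_args: list[str], debug: bool = False):
    if debug:
        # In-process call: breakpoints inside the tool are hit directly.
        summary = run(parse_args(tool_args))
        # Pack the result so it looks like a subprocess result to callers.
        return ProcResult(summary, "", 0)
    # Default path: spawn a subprocess as before (shell=True so the
    # virtualenv's packages are visible on Windows, per the PR description).
    return subprocess.run(
        "xl2times " + " ".join(tool_args),
        capture_output=True, text=True, shell=True,
    )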
Great idea! So far I've just been launching the debugger and manually specifying the args to xl2times, which is a pain for the benchmarks that have a list of excel files as arg. This should make it much more seamless!

What happens when we use --debug but not --run? Does it still try to run all the benchmarks in parallel? Would that mess up the debugger? If so, maybe we should enforce --debug implies --run.
Yeah, I found commenting out lines in benchmarks.yml was a nicer workflow than running them individually.

> What happens when we use --debug but not --run? Does it still try to run all the benchmarks in parallel? Would that mess up the debugger? If so, maybe we should enforce --debug implies --run

As it is now, --debug works with and without --run, as it gets passed into both run_all_benchmarks() (without --run) and run_a_benchmark() (with --run).

Am I missing something about how these switches are intended to be used?
As I understand, the way run_benchmarks.py works right now is to run all benchmarks in parallel by default, unless --run is specified, in which case it runs a single benchmark. Without --run, if --debug is specified and the developer enables a breakpoint in the code in, say, transforms.py, that breakpoint will potentially be hit on every benchmark that is being run in parallel. So on which one does the debugger stop? Will this break the debugger? I've not tried putting a breakpoint in code that is being run in parallel before. :)

If my understanding is correct, perhaps it is simpler to enforce that if you use --debug you must also use --run?
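If that constraint were adopted, a minimal sketch of how it might look with argparse (the flag names come from this discussion; the parser layout itself is an assumption, not the script's actual code):

import argparse

def parse_args(arg_list: list[str] | None = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Run the benchmark suite.")
    parser.add_argument("--run", metavar="BENCHMARK",
                        help="run a single named benchmark instead of all of them")
    parser.add_argument("--debug", action="store_true",
                        help="call the tools in-process so breakpoints can be hit")
    args = parser.parse_args(arg_list)
    # Enforce the proposed rule: --debug only makes sense for a single benchmark.
    if args.debug and args.run is None:
        parser.error("--debug requires --run")
    return args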
def main(arg_list: None | list[str] = None):
    args = parse_args(arg_list)
    run(args)
Again, this is so I can call a function directly when --debug is set.
Nice idea. Why not inline run into main and remove run? The only other place it's being used also calls run(parse_args(...)), which is essentially main(...).
Great question.

I did exactly this initially, but kept getting exitcode=1 (even after renumbering all the explicit exit codes). Eventually I worked out that if main() returns a 'truthy' value (i.e. the string now returned by run() in --debug mode), it gets interpreted as an exit code of 1 and the benchmark is seen to have failed, even when it was successful. Hiding this exit code inside run() solved this.

But yeah, I agree with your comment below that using subprocess and exit codes is pretty brittle (especially cross-platform), and it'd be better to run everything as function calls and catch exceptions with detailed error messages on failures.
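For anyone hitting the same gotcha, a minimal standalone illustration (not the project's code): setuptools-style console-script wrappers and __main__ guards often end in sys.exit(main()), and sys.exit treats any non-None, non-integer argument as a failure.

import sys

def main() -> str:
    # Imagine main() returning a summary string, as in --debug mode above.
    return "all benchmarks passed"

if __name__ == "__main__":
    # sys.exit prints a non-integer argument to stderr and exits with status 1,
    # so a run that "succeeds" but returns a string still looks like a failure.
    sys.exit(main())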
Thanks Sam, these are some great improvements to our CI/dev process! I've left a few minor comments/suggestions.

But now, thinking about it again, do we really need subprocess in run_benchmarks? With your changes, we can always call the main functions of dd_to_csv and xl2times directly, and perhaps that eliminates some overhead, even on Linux/macOS. What do you think?
if __name__ == "__main__":
    main()
Do we not need to pass in an arg_list here? (I haven't seen this idiom before.)
Sam, do you need to make this change here too?

- main()
+ main(sys.argv[1:])
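(For context on the idiom: argparse's parse_args falls back to sys.argv[1:] when given None, so the explicit form is equivalent. A small standalone example, not the project's code:)

import argparse
import sys

parser = argparse.ArgumentParser()
parser.add_argument("--run")

# These two calls parse the same thing: parse_args(None) reads sys.argv[1:].
implicit = parser.parse_args()
explicit = parser.parse_args(sys.argv[1:])
assert implicit == explicit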
Great, thanks a lot @SamRWest! So soon I won't have any excuse not to run benchmarks this way on windows. 😃 I've been mostly using batch scripts to run the tool on a benchmark I was working on and then letting CI point out what I was breaking. 😆

No worries!

Yeah agreed. Subprocesses suck. Whenever I've used them I always spend way too long troubleshooting silly things like cross-OS inconsistencies and character-encoding bugs. Changing the subprocess calls to function calls, and maybe even sticking them into unit tests (so you can run with, say…
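The unit-test idea might look roughly like this (a sketch only: the import path, benchmark names and CLI arguments below are all assumptions, not the project's real layout):

import pytest

# Assumed import path for the converter's main(); adjust to the real package layout.
from xl2times.__main__ import main as xl2times_main

BENCHMARKS = ["DemoS_001", "DemoS_002"]  # placeholder benchmark names

@pytest.mark.parametrize("name", BENCHMARKS)
def test_benchmark(name: str) -> None:
    # Argument layout is assumed; the point is that failures surface as normal
    # test failures (exceptions/asserts) rather than opaque subprocess exit codes.
    xl2times_main(["benchmarks/xlsx/" + name, "--output_dir", "out/" + name])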
…ows' into feature/benchmarks_on_windows # Conflicts: # .gitignore # README.md
I think I've addressed all the comments here now @olejandro. If you're happy, it just needs an approval now.

Issues dealt with in this PR are a bit outside of my area of expertise. :) @siddharth-krishna please feel free to merge if you are happy. I am happy. :)
Just a couple of quick small things, feel free to merge once they have been resolved! Thanks again. 🚀
@@ -3,7 +3,7 @@ requires = ["setuptools>=61.0.0", "wheel"]
 build-backend = "setuptools.build_meta"

 [tool.setuptools]
-packages = ["xl2times"]
+packages = ["xl2times", "utils"]
Would this also install a utils package to the python (virtual)env if someone does pip install . in the root directory? Wondering if that might result in name clashes.

I was thinking of utils more as a collection of scripts for benchmarking/developing, and not as a package to be distributed. Do we need to include it in pyproject.toml? Happy to go with the best practice here.
Ah, apologies, this was a fix for a ModuleNotFoundError caused by a from utils.dd_to_csv import I added in run_benchmarks, but the right solution was to import dd_to_csv directly. I'll revert this change in #186.

But yes, this would have included utils in built wheels etc. I didn't realise it wasn't meant to be included.
Ah crap, it auto-merged once you approved @siddharth-krishna, sorry.

No worries! That sounds good, thanks. :)
Addressing your last two comments (I can't reply directly now it's been merged):
It's not strictly necessary, as argparse will do this for us. But I'll do it anyway so it's clearer what's going on.

The python debuggers I've used (pycharm, vscode) generally work fine with multiprocess breakpoints. They'll usually just break at the same point in every process (though sometimes only on the first hit), which is sometimes useful (if you don't know which process has the bug you're hunting, for example). Granted, this can be confusing if you have all 13 benchmarks enabled and you get stopped at the breakpoint 13 times. A potential alternative could be to disable the process-pool benchmark parallelisation in --debug mode. This might be easier to understand, and better suited to the comment-out-stuff-in-benchmarks.yml workflow?
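A sketch of that alternative, assuming a hypothetical run_one(benchmark, debug) helper and benchmark list (the names and structure are illustrative, not the script's actual code):

from concurrent.futures import ProcessPoolExecutor

def run_one(benchmark: str, debug: bool = False):
    """Hypothetical per-benchmark runner (in-process when debug=True)."""
    ...

def run_all(benchmarks: list[str], debug: bool = False, max_workers: int = 4):
    if debug:
        # Serial and in-process: a breakpoint fires once per benchmark, in order,
        # and always in the main process where the debugger is attached.
        return [run_one(b, debug=True) for b in benchmarks]
    # Normal mode: fan out across a small pool of worker processes.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, benchmarks))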
fixed bug in git code when no remote is called 'origin'
Cool, didn't know how argparse worked. Sounds good
Great idea! I had anyway wanted to include a…
This PR contains changes necessary to run the benchmarks locally on a Windows machine.
Main changes:
- Introduced the shell=True param to subprocess calls. Without this, the virtualenv was not accessible to the subprocesses, which would fail with e.g. ModuleNotFound: pandas exceptions.
- Python subprocesses on Windows are heavyweight and so use a lot more RAM than on Linux, so I've added a variable in utils that limits the max_workers param in ProcessPoolExecutor constructors to a small number (it used to be the number of CPUs); see the sketch after this list. Without this limitation, the pool-inside-a-pool structure of the benchmarks was spawning 32*32 python.exe processes and using all 64GB of my RAM in a few seconds.
- Added a --debug switch to run_benchmarks to run each benchmark as a function call, allowing use of debuggers/breakpoints.
- CI now runs CSV-only regression tests if the GAMS license isn't in secrets.
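A rough sketch of the worker cap described above (the constant name and value are placeholders; the real variable lives in utils and may differ):

import os
from concurrent.futures import ProcessPoolExecutor

MAX_WORKERS = 4  # placeholder cap; keeps Windows from spawning dozens of python.exe processes

def make_pool() -> ProcessPoolExecutor:
    # Use the smaller of the cap and the CPU count instead of the CPU count alone.
    workers = min(MAX_WORKERS, os.cpu_count() or 1)
    return ProcessPoolExecutor(max_workers=workers)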