Add cached `_split_line_selector` to avoid redundant parsing in `select_lines` #5237
CodSpeed Performance Report: Merging #5237 will improve performances by ×9.8.
Reworks `select_lines` into a new cached helper function (`_split_line_selector`) that returns the parsed lines and selectors, eliminating repeated parsing of the same file.
Have you tested this with the recipe @mbargull was working on? I'm wondering if the hotspots measured here are the same as those when big rerendering happens.
Here is a timing test on rerendering:
(cf-dev) beckermr@finnegan ctng-compilers-feedstock % time conda-smithy rerender
INFO:conda_smithy.configure_feedstock:Downloading conda-forge-pinning-2024.03.19.15.37.47
INFO:conda_smithy.configure_feedstock:Extracting conda-forge-pinning to /Users/beckermr/.cache/conda-smithy
INFO:conda_smithy.configure_feedstock:__pycache__ rendering is skipped
INFO:conda_smithy.configure_feedstock:README rendering is skipped
WARNING: Setting build platform. This is only useful when pretending to be on another platform, such as for rendering necessary dependencies on a non-native platform. I trust that you know what you're doing.
WARNING: Setting build arch. This is only useful when pretending to be on another arch, such as for rendering necessary dependencies on a non-native arch. I trust that you know what you're doing.
WARNING: No numpy version specified in conda_build_config.yaml. Falling back to default numpy value of 1.23
Adding in variants from internal_defaults
Adding in variants from /Users/beckermr/.cache/conda-smithy/conda_build_config.yaml
Adding in variants from /Users/beckermr/Desktop/conda-forge/ctng-compilers-feedstock/recipe/conda_build_config.yaml
Adding in variants from argument_variants
INFO:conda_smithy.configure_feedstock:Re-rendered with conda-build 24.1.3.dev39, conda-smithy 3.32.0, and conda-forge-pinning 2024.03.19.15.37.47
INFO:conda_smithy.configure_feedstock:You can commit the changes with:
git commit -m "MNT: Re-rendered with conda-build 24.1.3.dev39, conda-smithy 3.32.0, and conda-forge-pinning 2024.03.19.15.37.47"
INFO:conda_smithy.configure_feedstock:These changes need to be pushed to github!
conda-smithy rerender 475.86s user 79.66s system 88% cpu 10:27.78 total
(cf-dev) beckermr@finnegan ctng-compilers-feedstock % time conda-smithy rerender
INFO:conda_smithy.configure_feedstock:__pycache__ rendering is skipped
INFO:conda_smithy.configure_feedstock:README rendering is skipped
WARNING: Setting build platform. This is only useful when pretending to be on another platform, such as for rendering necessary dependencies on a non-native platform. I trust that you know what you're doing.
WARNING: Setting build arch. This is only useful when pretending to be on another arch, such as for rendering necessary dependencies on a non-native arch. I trust that you know what you're doing.
WARNING: No numpy version specified in conda_build_config.yaml. Falling back to default numpy value of 1.23
Adding in variants from internal_defaults
Adding in variants from /Users/beckermr/.cache/conda-smithy/conda_build_config.yaml
Adding in variants from /Users/beckermr/Desktop/conda-forge/ctng-compilers-feedstock/recipe/conda_build_config.yaml
Adding in variants from argument_variants
INFO:conda_smithy.configure_feedstock:Re-rendered with conda-build 24.1.2, conda-smithy 3.32.0, and conda-forge-pinning 2024.03.19.15.37.47
INFO:conda_smithy.configure_feedstock:You can commit the changes with:
git commit -m "MNT: Re-rendered with conda-build 24.1.2, conda-smithy 3.32.0, and conda-forge-pinning 2024.03.19.15.37.47"
INFO:conda_smithy.configure_feedstock:These changes need to be pushed to github!
conda-smithy rerender 642.32s user 95.69s system 89% cpu 13:42.06 total
So the code is about 35% faster with this change on one of our slowest rerendering tasks. That is a lot less than the 10x in the performance tests.
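For reference, the comparison can be reproduced from the two `time` summaries above. This is only a sketch: it assumes the first run (conda-build 24.1.3.dev39) includes the change and the second (conda-build 24.1.2) does not, and `to_seconds` is a hypothetical helper for the `MM:SS.ss` wall-clock format.

```python
def to_seconds(clock):
    """Convert a 'MM:SS.ss' wall-clock string from `time` into seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + float(seconds)


# Figures copied from the two runs above (assumed: dev build = with the change).
with_change = {"user": 475.86, "wall": to_seconds("10:27.78")}
without_change = {"user": 642.32, "wall": to_seconds("13:42.06")}

# Speedup expressed as old time divided by new time.
user_speedup = without_change["user"] / with_change["user"]
wall_speedup = without_change["wall"] / with_change["wall"]

print(f"user-time speedup:  {user_speedup:.2f}x")
print(f"wall-clock speedup: {wall_speedup:.2f}x")
```

Measured by user CPU time the ratio is roughly 1.35x, which matches the "about 35%" figure; by wall-clock time it is closer to 1.31x.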
@kenodegard, thanks for taking a stab on this! @beckermr, thanks for trying this out already! @kenodegard, I'll later try to strip the
@beckermr @mbargull the new benchmarks results are in: #5237 (comment)
This reverts commit c89995d.
Signed-off-by: Marcel Bargull <[email protected]>
Description
A continuation of #5233 with inspiration from #5225.
When rendering a recipe we end up invoking `conda_build.metadata.select_lines` a number of times.
Within `select_lines` we first iterate over every line and split it into the line content and the selector; second, we evaluate the selector to determine whether the line should be kept.
The first process of iterating over every line and splitting it should (and can) be a one-time operation. This is achieved by introducing a new `_split_line_selector` function which caches the parsed result for a given text.
The second process of evaluating the selectors can also be cached but needs to be cached differently. Instead of caching this globally (since selectors can change between rendering passes) we cache the selectors locally, so a given selector will only be evaluated once for a single pass of the text in question (it's common for a given selector to be used multiple times within a single file, e.g., conda-forge's `conda_build_config.yaml`).
Checklist - did you ...
- Add a file to the `news` directory (using the template) for the next release's release notes?
- Add / update outdated documentation?
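The two caching levels from the description can be illustrated with a minimal sketch. This is not the code in this PR: `split_line_selector`, `select_lines_sketch`, and the simplified `_SELECTOR_RE` pattern are hypothetical names and assumptions made for illustration, and the real selector syntax handled by conda-build is richer than what this regex accepts.

```python
import re
from functools import lru_cache

# Simplified pattern for "content  # [selector]" (an assumption, not conda-build's regex).
_SELECTOR_RE = re.compile(r"^(?P<content>.*?)\s*#\s*\[(?P<selector>[^\]]+)\]\s*$")


@lru_cache(maxsize=None)
def split_line_selector(text):
    """Parse text once into (selector or None, line) pairs; cached globally per text."""
    parsed = []
    for line in text.splitlines():
        match = _SELECTOR_RE.match(line)
        if match:
            parsed.append((match.group("selector"), match.group("content")))
        else:
            parsed.append((None, line))
    # Return a tuple so callers cannot mutate the cached value.
    return tuple(parsed)


def select_lines_sketch(text, namespace):
    """Keep lines whose selector is true; evaluate each distinct selector once per pass."""
    evaluated = {}  # local cache: results depend on namespace, so it is not shared
    kept = []
    for selector, line in split_line_selector(text):
        if selector is None:
            kept.append(line)
            continue
        if selector not in evaluated:
            evaluated[selector] = bool(eval(selector, {}, dict(namespace)))
        if evaluated[selector]:
            kept.append(line)
    return "\n".join(kept)


text = "a: 1\nb: 2  # [win]\nc: 3  # [not win]\nd: 4  # [win]"
on_windows = select_lines_sketch(text, {"win": True})
elsewhere = select_lines_sketch(text, {"win": False})  # reuses the cached parse
```

The parse result depends only on the text, so it is safe to cache across rendering passes. The selector results depend on the namespace, so the sketch keeps them in a dictionary that lives for a single call.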