Clean up "performance allocators" and "performance flate2" backends #7000
Round 2: miette's
Part of #5711
c6436c1
to
8a18986
Compare
CodSpeed Performance Report
Merging #7000 will not alter performance
b37f41a
to
a312719
Compare
crates/uv/Cargo.toml
Outdated
default = ["flate2/zlib-ng", "python", "pypi", "git"]
default = ["python", "pypi", "git"]
# Use better memory allocators, etc. — also turns-on self-update.
production = ["self-update", "production-memory-allocator", "production-flate2-backend", "uv-distribution/production"]
I think we should keep `self-update` separate, since re-distributors will likely want to run with `--features production`, but won't want to enable `self-update`. `self-update` is only applicable when you install via our dedicated installers, not via `brew`, etc.
Good point! I just re-separated them.
Are we going to need to include a note to redistributors in the changelog for the `--features production` flag? Does that suggest we should reserve this change for a breaking release?
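To make the split discussed in this thread concrete, here is a minimal sketch of how a build script might choose features per distribution channel. The helper name `uv_build_features` and the channel labels are hypothetical, invented for illustration; only the `production` and `self-update` feature names come from the diff above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: pick cargo features by distribution channel.
# "installer" stands for the dedicated installers, where self-update applies;
# any other channel (e.g. brew or another re-distributor) omits self-update.
uv_build_features() {
  local channel="$1"
  case "$channel" in
    installer) echo "production,self-update" ;;
    *)         echo "production" ;;
  esac
}

# Print the invocations instead of running them, so the sketch is safe to try.
echo "cargo build --release --locked --features $(uv_build_features installer)"
echo "cargo build --release --locked --features $(uv_build_features brew)"
```

Keeping the two features independent means a re-distributor only ever has to name `production`, and the installer pipeline opts in to `self-update` explicitly.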
3f9adf7
to
da7021d
Compare
.github/workflows/build-binaries.yml
Outdated
@@ -121,7 +121,7 @@ jobs:
uses: PyO3/maturin-action@v1
with:
target: aarch64
args: --release --locked --out dist --features self-update
args: --release --locked --out dist --features production
I think this and the reference on line 82 also need `self-update`, unless I'm misreading.
Fixed! Also adjusted the comments.
Makes sense to me! Probably good to get @BurntSushi eyes on it too since it will also affect profiling etc.
Nice!
The decrease in build times here is considerable and a nice win. Nice work. In terms of how this is set up (feature configuration and the extra crates), that all makes sense to me. Charlie predicted my main concern here: the default build is "divorced" from the build we ship to users. We kind of already have this problem with our Cargo profiles: our I think this PR probably exacerbates that, because building local binaries for performance improvements will now also require, I believe, But the build improvements here are considerable. However...
If the improvement here is ~23% on cold builds, do we have a sense of what kind of improvement we get on warm builds? My feeling is that for this class of improvement, the warm builds probably matter a lot more than cold builds. And for the dependencies removed here, I wouldn't expect them to be getting rebuilt all of the time. So I'd be curious if this change improves warm build times to the point of being worth the potential footgun here. Separately, it's probably worth trying to find a different way of avoiding this footgun so that we can more confidently introduce a divergence between "dev builds" and "profiling builds" and "release builds."
I must admit I worked on this PR under the assumption that:
Even if y'all decide you do not want to go down that road, there's something to salvage in that PR imho: the whole "shim crate to let cargo pick the dependencies/feature flags we want depending on the target platform" thing (and deduplicating the allocator setup between uv & uv-dev).
We don't! I guess let's measure that before deciding the fate of this PR first? My experience in speeding up builds is that cargo rebuilds a lot more than it should, a lot more often than it should. And even when it doesn't build something, just having a large dependency graph involves a lot of, well, fetching, hashing, etc. (as you're well aware, uv does that too!). I'll report back with data on incremental builds (including switching between check/clippy/build/running tests, some of which I suspect trashes the target dir and causes cargo to rebuild too much) but am already mentally prepared to salvage this PR into a "mostly cleanups" one, as I may have underestimated just how much of astral's work focuses on performance alone.
I think it comes in waves. I haven't done any perf work in a while since my focus has been on the functionality/correctness of the multi-platform resolver. But I have done a lot of perf work in the past and hope to do more in the future. The cleanups/refactoring makes sense. And looking at the impact on "warm" builds makes sense too. I know for me at least, I do a lot of work in
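As a rough illustration of the warm-build measurement being discussed, the sketch below primes the build cache, touches one source file to simulate an edit, and times the rebuild. The function names are hypothetical and the file path in the example comment is an assumption; nothing here reflects how the numbers in this thread were actually collected.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Run a command and print how many whole seconds it took.
# The command's own output is discarded.
elapsed_seconds() {
  local start=$SECONDS
  "$@" >/dev/null 2>&1
  echo $(( SECONDS - start ))
}

# Hypothetical warm-build measurement: build once, touch one file, rebuild.
# Requires cargo and must be run from the workspace root.
warm_check() {
  local file="$1"
  cargo check >/dev/null 2>&1   # prime the cache so the next run is warm
  touch "$file"                 # simulate an edit to a single file
  elapsed_seconds cargo check   # time only the incremental rebuild
}

# Example (not run automatically; the path is an assumption):
#   warm_check crates/uv/src/lib.rs
```

Whole-second resolution is coarse for incremental builds; `cargo check --timings` gives a per-crate breakdown if finer detail is needed.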
f42d56b
to
973ecf4
Compare
As per comments on astral-sh#7000, since a lot of uv work is focused on performance, it makes sense to keep those enabled by default. However, it's still nice to have everything in one place.
a77334b
to
fb3e837
Compare
Instead of having conditional cargo dependencies in both uv-dev and uv, and different `--features` invocations in CI, this introduces a series of `uv-performance-*` crates that do the right thing. These are enabled by default, but can be disabled when working on correctness alone, locally.
fb3e837
to
e89f275
Compare
This PR is now more about "simplifying the logic to pull in performance allocators/flate2 backends" (it still removes the 'backtrace' feature of miette), and less about removing dependencies by default. The cc @BurntSushi for a second review
LGTM!
(I’m out sick + traveling so if someone wants to take over this PR to get it rebased and merged, I would owe them a drink, at the very least!)
Rebase went through trivially (uv-publish addition), made a new PR due to permissions: #7686
Summary
@charliermarsh has long suspected local builds could be made faster by disabling things like: tikv-jemalloc/mimalloc, zlibng etc.
I'm going through the cargo dep tree looking at things that can be disabled locally.
Methodology:
`production` cargo flag enables all the production stuff (good allocators, fast compression libs, etc.)
I measure fresh `cargo check` runs, like so: `rm -rf /tmp/timings; CARGO_TARGET_DIR=/tmp/timings cargo check -F production-memory-allocator --timings`
Varying the `-F` to enable/disable the features
FAQ
Q: Why only check `check`?
A: The benefits will trickle down to other subcommands (including test/nextest etc.); check/clippy are super common while iterating. We can do larger checks near the end.
Q: Why only check cold builds?
A: Warm builds depend a lot on which part of the code is touched. I'll optimize typical interactions later on.
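The cold-build methodology above could be scripted roughly as follows. This is a sketch, not the script actually used for this PR: the function names and the `RUN` and `TARGET_DIR` variables are hypothetical, and the feature names are the ones that appear in the `Cargo.toml` diff and the quoted command. By default it only prints what it would run; set `RUN=1` to execute.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch target dir, wiped before each run so every build is cold.
TARGET_DIR="${TARGET_DIR:-/tmp/timings}"

# Print the cargo invocation for one feature set (empty string = no -F flag).
cold_check_cmd() {
  local features="$1"
  if [ -n "$features" ]; then
    echo "CARGO_TARGET_DIR=$TARGET_DIR cargo check -F $features --timings"
  else
    echo "CARGO_TARGET_DIR=$TARGET_DIR cargo check --timings"
  fi
}

# Wipe the target dir and run the check, or just print the plan when RUN != 1.
measure_cold() {
  local features="$1"
  local cmd
  cmd="$(cold_check_cmd "$features")"
  if [ "${RUN:-0}" = "1" ]; then
    rm -rf "$TARGET_DIR"
    eval "$cmd"
  else
    echo "would run: rm -rf $TARGET_DIR; $cmd"
  fi
}

# Vary -F across the feature sets to compare.
for features in "" "production-memory-allocator" "production-flate2-backend" "production"; do
  measure_cold "$features"
done
```

Each real run leaves an HTML timing report under the target directory, which is where per-crate build times can be compared between feature sets.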