forked from pubgrub-rs/pubgrub
-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Return incompatibility from conflict & Allow backtracking to before a specific package #36
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
CodSpeed Performance ReportCongrats! CodSpeed is installed 🎉
You will start to see performance impacts in the reports once the benchmarks are run from your default branch.
|
konstin
changed the title
Return Id<DP::P> over &ID<DP::P>
Return incompatibility from conflict & Allow backtracking to before a specific package
Dec 12, 2024
Whenever we either discard a version due to its dependencies or perform conflict resolution, we return the last conflict that led to discarding them. In uv, we use this to re-prioritize and backtrack when a package decision accumulated to many conflicts.
This allows discarding a previously made decision if it turned out to be a bad decision, even if all options with this decisions have not yet been rejected. We allow attempting to backtrack on packages that were not decided yet to avoid the caller from making the duplicative check.
Allow introspecting which packages are to be decided. uv currently uses this for (trace) logging in the resolver.
konstin
force-pushed
the
konsti/backtrack-package
branch
from
December 13, 2024 11:38
036aab4
to
251e93b
Compare
konstin
added a commit
that referenced
this pull request
Dec 16, 2024
This PR is the child of #36 and pubgrub-rs#291, providing an implementation that works for both cargo and uv. Upstream PR: pubgrub-rs#298. Specifically, we use the returned incompatibility in astral-sh/uv#9843, but not `PackageResolutionStatistics`. --- Whenever we either discard a version due to its dependencies or perform conflict resolution, we return the last conflict that led to discarding them. In cargo, we use this information for prioritization, which speeds up resolution (`cargo run -r -- -m pub --with-solana --filter solana-archiver-lib -t 16` goes from 90s to 20s on my machine). Configurations that are noticeably slower for the solana test case: * All incompatibilities unit propagation * Only the last root cause in unit propagation * No incompatibilities from unit propagation * No incompatibilities from `add_version` * Only affect counts (without culprit counts) * Backtracking with the same heuristic as astral-sh/uv#9843 (backtracking once after affected hits 5) In uv, we use this to re-prioritize and backtrack when a package decision accumulated to many conflicts. Since we have our own solver loop, we add the incompatibility to our own tracking instead. Built on pubgrub-rs#291 ## Benchmarks Main: ``` index commit hash: 82086e46740d7a9303216bfac093e7268a95121f index commit time: 2024-11-30T18:18:14Z index size: 32 solana in index: 32 Pub CPU time: 1215.49s == 20.26min Cargo CPU time: skipped Cargo check lock CPU time: skipped Pub check lock CPU time: skipped Wall time: 80.58s == 1.34min ``` With pubgrub-rs#291: ``` index commit hash: 82086e46740d7a9303216bfac093e7268a95121f index commit time: 2024-11-30T18:18:14Z index size: 32 solana in index: 32 Pub CPU time: 467.73s == 7.80min Cargo CPU time: skipped Cargo check lock CPU time: skipped Pub check lock CPU time: skipped Wall time: 34.76s == 0.58min ``` This PR: ``` index commit hash: 82086e46740d7a9303216bfac093e7268a95121f index commit time: 2024-11-30T18:18:14Z index size: 32 solana in index: 32 Pub CPU time: 271.79s == 4.53min Cargo CPU time: skipped Cargo check lock CPU time: skipped Pub check lock CPU time: skipped Wall time: 20.17s == 0.34min ```
konstin
added a commit
to astral-sh/uv
that referenced
this pull request
Dec 16, 2024
Background reading: #8157 Companion PR: astral-sh/pubgrub#36 Requires for test coverage: astral-sh/packse#230 When two packages A and B conflict, we have the option to choose a lower version of A, or a lower version of B. Currently, we determine this by the order we saw a package (assuming equal specificity of the requirement): If we saw A before B, we pin A until all versions of B are exhausted. This can lead to undesirable outcomes, from cases where it's just slow (sentry) to others cases without lower bounds where be backtrack to a very old version of B. This old version may fail to build (terminating the resolution), or it's a version so old that it doesn't depend on A (or the shared conflicting package) anymore - but also is too old for the user's application (fastapi). #8157 collects such cases, and the `wrong-backtracking` packse scenario contains a minimized example. We try to solve this by tracking which packages are "A"s, culprits, and "B"s, affected, and manually interfering with project selection and backtracking. Whenever a version we just chose is rejected, we give the current package a counter for being affected, and the package it conflicted with a counter for being a culprit. If a package accumulates more counts than a threshold, we reprioritize: Undecided after the culprits, after the affected, after packages that only have a single version (URLs, `==<version>`). We then ask pubgrub to backtrack just before the culprit. Due to the changed priorities, we now select package B, the affected, instead of package A, the culprit. To do this efficiently, we ask pubgrub for the incompatibility that caused backtracking, or just the last version to be discarded (due to its dependencies). For backtracking, we use the last incompatibility from unit propagation as a heuristic. When a version is discarded because one of its dependencies conflicts with the partial solution, the incompatibility tells us the package in the partial solution that conflicted. We only backtrack once per package, on the first time it passes the threshold. This prevents backtracking loops in which we make the same decisions over and over again. But we also changed the priority, so that we shouldn't take the same path even after the one time we backtrack (it would defeat the purpose of this change). There are some parameters that can be tweaked: Currently, the threshold is set to 5, which feels not too eager with so me of the conflicts that we want to tolerate but also changes strategies quickly. The relative order of the new priorities can also be changed, as for each (A, B) pair the priority of B is afterwards lower than that for A. Currently, culprits capture conflict for the whole package, but we could limit that to a specific version. We could discard conflict counters after backtracking instead of keeping them eternally as we do now. Note that we're always taking about pairs (A, B), but in practice we track individual packages, not pairs. A case that we wouldn't capture is when B is only introduced to the dependency graph after A, but I think that would require cyclical dependency for A and B to conflict? There may also be cases where looking at the last incompatibility is insufficient. Another example that we can't repair with prioritization is urllib3/boto3/botocore: We actually have to check all the newer versions of boto3 and botocore to identify the version that allows with the older urllib3, no shortcuts allowed. ``` urllib3<1.25.4 boto3 ``` All examples I tested were cases with two packages where we only had to switch the order, so I've abstracted them into a single packse case. This PR changes the resolution for certain paths, and there is the risk for regressions. Fixes #8157 --- All tested examples improved. Input fastapi: ```text starlette<=0.36.0 fastapi<=0.115.2 ``` ``` # BEFORE $ uv pip --no-progress compile -p 3.11 --exclude-newer 2024-10-01 --no-annotate debug/fastapi.txt annotated-types==0.7.0 anyio==4.6.0 fastapi==0.1.17 idna==3.10 pydantic==2.9.2 pydantic-core==2.23.4 sniffio==1.3.1 starlette==0.36.0 typing-extensions==4.12.2 # AFTER $ cargo run --profile fast-build --no-default-features pip compile -p 3.11 --no-progress --exclude-newer 2024-10-01 --no-annotate debug/fastapi.txt annotated-types==0.7.0 anyio==4.6.0 fastapi==0.109.1 idna==3.10 pydantic==2.9.2 pydantic-core==2.23.4 sniffio==1.3.1 starlette==0.35.1 typing-extensions==4.12.2 ``` Input xarray: ```text xarray[accel] ``` ``` # BEFORE $ uv pip --no-progress compile -p 3.11 --exclude-newer 2024-10-01 --no-annotate debug/xarray-accel.txt bottleneck==1.4.0 flox==0.9.13 llvmlite==0.36.0 numba==0.53.1 numbagg==0.8.2 numpy==2.1.1 numpy-groupies==0.11.2 opt-einsum==3.4.0 packaging==24.1 pandas==2.2.3 python-dateutil==2.9.0.post0 pytz==2024.2 scipy==1.14.1 setuptools==75.1.0 six==1.16.0 toolz==0.12.1 tzdata==2024.2 xarray==2024.9.0 # AFTER $ cargo run --profile fast-build --no-default-features pip compile -p 3.11 --no-progress --exclude-newer 2024-10-01 --no-annotate debug/xarray-accel.txt bottleneck==1.4.0 flox==0.9.13 llvmlite==0.43.0 numba==0.60.0 numbagg==0.8.2 numpy==2.0.2 numpy-groupies==0.11.2 opt-einsum==3.4.0 packaging==24.1 pandas==2.2.3 python-dateutil==2.9.0.post0 pytz==2024.2 scipy==1.14.1 six==1.16.0 toolz==0.12.1 tzdata==2024.2 xarray==2024.9.0 ``` Input sentry: The resolution is identical, but arrived at much faster: main tries 69 versions (sentry-kafka-schemas: 63), PR tries 12 versions (sentry-kafka-schemas: 6; 5 times conflicting, then once the right version). ```text python-rapidjson<=1.20,>=1.4 sentry-kafka-schemas<=0.1.113,>=0.1.50 ``` ``` # BEFORE $ uv pip --no-progress compile -p 3.11 --exclude-newer 2024-10-01 --no-annotate debug/sentry.txt fastjsonschema==2.20.0 msgpack==1.1.0 python-rapidjson==1.8 pyyaml==6.0.2 sentry-kafka-schemas==0.1.111 typing-extensions==4.12.2 # AFTER $ cargo run --profile fast-build --no-default-features pip compile -p 3.11 --no-progress --exclude-newer 2024-10-01 --no-annotate debug/sentry.txt fastjsonschema==2.20.0 msgpack==1.1.0 python-rapidjson==1.8 pyyaml==6.0.2 sentry-kafka-schemas==0.1.111 typing-extensions==4.12.2 ``` Input apache-beam ```text # Run on Python 3.10 dill<0.3.9,>=0.2.2 apache-beam<=2.49.0 ``` ``` # BEFORE $ uv pip --no-progress compile -p 3.10 --exclude-newer 2024-10-01 --no-annotate debug/apache-beam.txt × Failed to download and build `apache-beam==2.0.0` ╰─▶ Build backend failed to determine requirements with `build_wheel()` (exit status: 1) # AFTER $ cargo run --profile fast-build --no-default-features pip compile -p 3.10 --no-progress --exclude-newer 2024-10-01 --no-annotate debug/apache-beam.txt apache-beam==2.49.0 certifi==2024.8.30 charset-normalizer==3.3.2 cloudpickle==2.2.1 crcmod==1.7 dill==0.3.1.1 dnspython==2.6.1 docopt==0.6.2 fastavro==1.9.7 fasteners==0.19 grpcio==1.66.2 hdfs==2.7.3 httplib2==0.22.0 idna==3.10 numpy==1.24.4 objsize==0.6.1 orjson==3.10.7 proto-plus==1.24.0 protobuf==4.23.4 pyarrow==11.0.0 pydot==1.4.2 pymongo==4.10.0 pyparsing==3.1.4 python-dateutil==2.9.0.post0 pytz==2024.2 regex==2024.9.11 requests==2.32.3 six==1.16.0 typing-extensions==4.12.2 urllib3==2.2.3 zstandard==0.23.0 ```
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Background reading: astral-sh/uv#8157
I'll link the PR on the uv side with all the details once it's up, it contains a walkthrough.
This adds two features to pubgrub:
Return incompatibility from conflict: Whenever we either discard a version due to its dependencies or perform conflict resolution, we return the last conflict that led to discarding them. In uv, we use this to re-prioritize and backtrack when a package decision accumulated to many conflicts. Using the incompatibilities directly from pubgrub makes this efficient.
Allow backtracking to before a specific package: Once we identified a pair of conflicting packages, we want to switch their priorities. This requires backtracking to before the decision for the earlier of the two packages was made. On the uv side, we change the priorities to make sure we take a different path on the next attempt. We allow attempting to backtrack on packages that were not decided yet to avoid the caller from making the duplicative check: At the time when the see incompatibilities caused by them, they are not necessarily decided.
Best reviewed commit-by-commit.