feat!: let DependencyProvider prioritize dependencies #50

Eh2406 · 2020-10-23T19:56:14Z

This follows up on #40.

The first commit explores pick_package approach as described in #40 (comment).
The second commit explores the make_decision approach.
The third commit uses a BTreeMap to remove the Hash requirement on OfflineDependencyProvider.

Feedback is definitely appreciated.

Edit: somewhat accidentally closes #18

mpizenberg

I personally like the complete control given back to the dependency provider with make_decision. The usage of the BTree is also a good idea to remove the hash constraint. The trick with Borrow to enforce a correct package is also really nice.
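For readers skimming the thread, here is a minimal sketch of why a BTreeMap removes the Hash constraint: keys only need to be Ord. The types PkgName and OfflineDeps are simplified stand-ins invented for this illustration, not the crate's actual definitions.

```rust
use std::collections::BTreeMap;

// Hypothetical package name type: ordered, but deliberately not `Hash`.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct PkgName(pub String);

// Simplified stand-in for an offline provider:
// package -> version -> dependencies of that version.
#[derive(Default)]
pub struct OfflineDeps {
    deps: BTreeMap<PkgName, BTreeMap<u32, Vec<PkgName>>>,
}

impl OfflineDeps {
    // Register one version of a package together with its dependencies.
    pub fn add(&mut self, package: PkgName, version: u32, dependencies: Vec<PkgName>) {
        self.deps
            .entry(package)
            .or_default()
            .insert(version, dependencies);
    }

    // Known versions of `package`, in ascending order (BTreeMap keys are sorted).
    pub fn versions<'a>(&'a self, package: &PkgName) -> impl Iterator<Item = &'a u32> + 'a {
        self.deps.get(package).into_iter().flat_map(|m| m.keys())
    }
}
```

A side effect of this choice is that versions come out sorted without an extra sort step.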

There are a few oddities in the package docs, though. Some names no longer make much sense, like the "reverse" terminology for the dependency provider. A few other things looked weird to me on first read, like the error when the version is in the term (should it be the opposite?).

src/solver.rs

Eh2406 · 2020-10-24T02:44:23Z

I force pushed to fix the commit names. I also addressed @mpizenberg's comments. I have 2 remaining concerns:

The module documentation still refers to list_available_versions. I could use some help describing the new method.
I added a new trait Constraints to hide Term. Is this a good idea? Alternatively, we can make Term public with only the one method available or, as we are filtering to positive terms, we can convert to a Range, which is already public.

mpizenberg · 2020-10-24T22:38:29Z

Alternatively we can make Term public with only the one method available, or as we are filtering to positive terms we can convert to a Range which is already public

Good point! Since potential packages are only picked if there is at least one positive derivation, the intersection is necessarily a positive term. Using a Range in the public argument type would make more sense from the point of view of the dependency provider!

Eh2406 · 2020-10-25T03:43:13Z

Switched to Range. I (at this time of night) could not get the lifetimes to work for &Range, so I used Borrow<Range<V>>.

src/solver.rs

mpizenberg

Looks good to me, with maybe some changes to some doc comments. Waiting on others' reviews.

src/solver.rs

Eh2406 · 2020-10-26T21:40:01Z

with maybe some changes to some doc comments

What did you have in mind? Also:

The module documentation still refers to list_available_versions. I could use some help describing the new method.

aleksator · 2020-10-26T21:42:38Z

Hopefully I'll be able to look at PRs tomorrow or the day after.

aleksator · 2020-10-28T19:21:19Z

src/solver.rs

+    fn make_decision<T: Borrow<P>, U: Borrow<Range<V>>>(
+        &self,
+        packages: impl Iterator<Item = (T, U)>,
+    ) -> Result<(T, Option<V>), Box<dyn Error>>;


I propose to provide a default implementation for this method with the heuristic we were using previously.

This way we still give control to the users who want it, but provide the option to leave the way we pick packages as the library suggests.

Do you want to keep the list_available_versions? Because without that I don't know how to build a default implementation. It was removed as we otherwise don't need it.

I don't think we should keep list_available_versions in the interface. Maybe we could have a free function (or collection of free functions) that could help. Something like pick_first_strategy, pick_lowest_number_of_versions_strategy etc. The names are terrible, but that's just for the sake of the example

I was sceptical, but gave it a try. And it worked out really well! Take a look and let me know what you think.
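To make the free-function idea concrete, here is a minimal sketch of such a helper, assuming a simplified Range (an inclusive interval over u32) instead of the crate's Range<V>. The name choose_with_fewest_versions and the exact signature are assumptions for illustration only. Returning the same T that was passed in is what lets the Borrow trick guarantee the chosen package is one of the candidates.

```rust
use std::borrow::Borrow;

// Simplified stand-in for a version range: an inclusive interval over u32.
#[derive(Clone, Debug, PartialEq)]
pub struct Range {
    pub min: u32,
    pub max: u32,
}

impl Range {
    pub fn contains(&self, v: u32) -> bool {
        self.min <= v && v <= self.max
    }
}

// Hypothetical helper: pick the candidate with the fewest versions inside its
// range, then return the first listed version that fits (or None).
pub fn choose_with_fewest_versions<P, T, U, I, F>(
    list_available_versions: F,
    potential_packages: impl Iterator<Item = (T, U)>,
) -> (T, Option<u32>)
where
    T: Borrow<P>,
    U: Borrow<Range>,
    I: Iterator<Item = u32>,
    F: Fn(&P) -> I,
{
    // Count matching versions lazily, without collecting them.
    let count_valid = |(p, range): &(T, U)| {
        list_available_versions(p.borrow())
            .filter(|v| range.borrow().contains(*v))
            .count()
    };
    let (pkg, range) = potential_packages
        .min_by_key(count_valid)
        .expect("potential_packages must not be empty");
    let version = list_available_versions(pkg.borrow()).find(|v| range.borrow().contains(*v));
    (pkg, version)
}
```

A provider's make_decision could then be a single call to this helper, passing its own way of listing versions as the closure.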

@Eh2406

Do you want to keep the list_available_versions?

@mpizenberg

I don't think we should keep list_available_versions in the interface.

Why not? What if our interface is 3 functions:

make_decision()

list_available_versions()

get_dependencies()

The first one has a default implementation in terms of list_available_versions, so there is no need to change it right away for the algorithm to work, but the ability to customize it is there.

The idea is somewhat similar to what iterators do with the next() function. You define it and get some others for free. Here we would implement 1) in terms of 2).

The only not-so-pretty thing about that is if the user comes up with a way to redefine make_decision without using list_available_versions, but the API still requires it. In that case they could leave it as unimplemented!(). Is it an API smell? Hmmm.
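A rough sketch of the proposal and of the smell it raises, using a deliberately simplified trait (string package names, u32 versions, no ranges). The trait shape and the type names Fixed and AlwaysFirst are assumptions made for this example, not the crate's API.

```rust
pub trait Provider {
    // Required: versions known for `package`, in order of preference.
    fn list_available_versions(&self, package: &str) -> Vec<u32>;

    // Provided: default heuristic built on top of the required method.
    // Picks the candidate with the fewest versions, then its preferred version.
    fn make_decision<'a>(&self, candidates: &[&'a str]) -> Option<(&'a str, Option<u32>)> {
        let pkg = *candidates
            .iter()
            .min_by_key(|p| self.list_available_versions(p).len())?;
        let version = self.list_available_versions(pkg).into_iter().next();
        Some((pkg, version))
    }
}

// Implements only the required method and gets make_decision for free.
pub struct Fixed;

impl Provider for Fixed {
    fn list_available_versions(&self, package: &str) -> Vec<u32> {
        if package == "a" {
            vec![3, 2, 1]
        } else {
            vec![7]
        }
    }
}

// Overrides make_decision and never needs the required method:
// this is the "leave it unimplemented" smell discussed above.
pub struct AlwaysFirst;

impl Provider for AlwaysFirst {
    fn list_available_versions(&self, _package: &str) -> Vec<u32> {
        unimplemented!("not needed by this custom make_decision")
    }

    fn make_decision<'a>(&self, candidates: &[&'a str]) -> Option<(&'a str, Option<u32>)> {
        candidates.first().map(|p| (*p, Some(1)))
    }
}
```

With Fixed, the default picks "b" because it has a single version; AlwaysFirst works too, but only as long as nothing else calls its list_available_versions.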

I don't think "avoid the breaking change" justifies that smell given our user base

My motivation is not avoiding breaking changes, but rather the easiness of API usage.

Before this PR it was crystal clear what the algorithm needed from the user to function (from the user's perspective):

A way to get dependencies for a given package version.

A way to get all versions for the package.

All makes sense why it is needed and how to implement it.

Now:

Unchanged.

Hm, actually you need to implement part of our algorithm to use our library.

How would people do that? What if they don't want to understand how exactly we do the decision making?

The only practical solution is to copy paste the code from OfflineDependencyProvider.
I'm afraid that requiring every user to do that is too much of an ergonomic hit.

Does anyone else share my concerns?

I get your point about ease of use. However, I think people not wanting to use OfflineDependencyProvider and willing to define their own dependency provider are willing to read the implementation of an 11-lines function doing nothing more than a function call to a helper function. Also there is an even simpler 6-lines example implementation in the guide on the section regarding custom implementation of a dependency provider.

If we get more users, and some of them point to make_decision as being difficult to implement, I think I'd rather have two different traits with different ease/control compromises (a bit like the sandbox/element/document/application variants of an elm app) than a single trait with an ambiguous API.
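One possible shape for the two-trait idea, sketched under heavy simplification: an easy trait that only lists versions, a full-control trait, and an adapter that lifts the former into the latter. All names here (SimpleProvider, FullProvider, UseDefaultStrategy, TwoPackages) are hypothetical and nothing like this exists in the PR.

```rust
// Full-control trait: the implementor decides which package and version come next.
pub trait FullProvider {
    fn choose_package_version(&self, candidates: &[String]) -> Option<(String, Option<u32>)>;
}

// Ease-of-use trait: the implementor only lists versions, in order of preference.
pub trait SimpleProvider {
    fn list_available_versions(&self, package: &str) -> Vec<u32>;
}

// Adapter: wraps any SimpleProvider and supplies a default strategy
// (fewest versions first), so it can be used where a FullProvider is expected.
pub struct UseDefaultStrategy<S>(pub S);

impl<S: SimpleProvider> FullProvider for UseDefaultStrategy<S> {
    fn choose_package_version(&self, candidates: &[String]) -> Option<(String, Option<u32>)> {
        let pkg = candidates
            .iter()
            .min_by_key(|p| self.0.list_available_versions(p).len())?;
        let version = self.0.list_available_versions(pkg).into_iter().next();
        Some((pkg.clone(), version))
    }
}

// Example simple provider with two known packages.
pub struct TwoPackages;

impl SimpleProvider for TwoPackages {
    fn list_available_versions(&self, package: &str) -> Vec<u32> {
        if package == "big" {
            vec![1, 2, 3]
        } else {
            vec![9]
        }
    }
}
```

Compared with a single trait with a default method, this keeps each trait free of methods that an implementor might have to leave unimplemented.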

If we get more users, and some of them point to make_decision as being difficult to implement, I think I'd rather have two different traits with different ease/control compromises (a bit like the sandbox/element/document/application variants of an elm app) than a single trait with an ambiguous API.

Actually that's what my girlfriend suggested as well! 😆 I was pretty impressed because she's a sophomore (2nd year at uni) and is doing that as a way to change her profession, meaning she has very little programming experience yet.

In any case, doing that is out of scope of this PR and needs a separate discussion; I'm totally okay merging this version!

I'm glad we had this discussion as this is an interesting design problem I haven't had to think of before, thank you @Eh2406 and @mpizenberg.

you have a wise girlfriend @aleksator ahah, keep her! XD

src/solver.rs

tests/proptest.rs

aleksator · 2020-10-28T19:41:25Z

This does simplify the code, but the performance characteristics are different. In the first case, we only need to browse the terms for any positive one before computing the intersection. In the refactor we always compute the intersection. If benchmarking shows similar performance, it's worth it. Otherwise let's leave it for another round of performance adjustments.

On the other hand, we don't iterate over the not-yet-intersected derivations every time.

I ran the benchmarks a few times before submitting the change.
Before the change it has completed in 1.65ms on my machine, after the change in 1.67ms. I'm not sure if this is spurious or not, since when I benchmarked something else before I was getting much larger variations without any code changes.

Eh2406 · 2020-10-28T20:54:12Z

When I ran the benchmarks from #34, criterion was fairly sure there is a small regression comparing ecaeb39 with acc391e. It is made smaller if we eagerly do the intersections and go ahead and do away with derivations_not_intersected_yet entirely. But criterion thinks it is still a 3-5% regression.
It is a code complexity vs perf trade off, I'd be ok either way.

aleksator · 2020-10-28T22:01:14Z

When I ran the benchmarks from #34, criterion was fairly sure there is a small regression comparing ecaeb39 with acc391e. It is made smaller if we eagerly do the intersections and go ahead and do away with derivations_not_intersected_yet entirely. But criterion thinks it is still a 3-5% regression.
It is a code complexity vs perf trade off, I'd be ok either way.

I've restored "if conditions" and still simplified the code a little with an early return.

src/lib.rs

tests/proptest.rs

mpizenberg · 2020-11-13T22:56:08Z

The last commit makes the code cleaner by using min_by, but I think it doubles the number of calls to list_available_versions, so don't hesitate to revert that one. It's weird that there is no function like min_with that would take a F: Fn(&Self::Item) -> impl Ord or similar.

Eh2406 · 2020-11-13T23:00:14Z

Does min_by_key work here?

mpizenberg · 2020-11-13T23:00:25Z

Just after that comment I realized the function exists XD and is called min_by_key.
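A small sketch of the difference in call counts, using a counter around a stand-in cost function (here just x % 10, an arbitrary choice for the example). It relies on the behaviour of the current standard library implementation, where min_by_key evaluates the key once per element, while a min_by comparator that recomputes the key does so twice per comparison.

```rust
use std::cell::Cell;

// Uses `min_by`, recomputing the cost on both sides of every comparison.
// Returns the chosen item and how many times the cost function ran.
pub fn pick_with_min_by(items: &[u32]) -> (Option<u32>, usize) {
    let calls = Cell::new(0usize);
    let cost = |x: &u32| {
        calls.set(calls.get() + 1);
        *x % 10
    };
    let best = items.iter().copied().min_by(|a, b| cost(a).cmp(&cost(b)));
    (best, calls.get())
}

// Uses `min_by_key`, which takes the cost function directly.
pub fn pick_with_min_by_key(items: &[u32]) -> (Option<u32>, usize) {
    let calls = Cell::new(0usize);
    let cost = |x: &u32| {
        calls.set(calls.get() + 1);
        *x % 10
    };
    let best = items.iter().copied().min_by_key(cost);
    (best, calls.get())
}
```

Both functions return the same item; only the number of cost evaluations differs, which matters when the cost is a call like list_available_versions.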

aleksator · 2020-11-14T20:19:39Z

I've reviewed the PR so far, great work introducing the ability to customize the lib's behavior.

The one thing I wanted to discuss about the API change is a default implementation for make_decision. We can leave that discussion for a potential separate PR if you'd like.

tests/proptest.rs

aleksator · 2020-11-14T20:48:30Z

tests/proptest.rs

+struct OldestVersionsDependencyProvider<P: Package, V: Version>(OfflineDependencyProvider<P, V>);
+
+impl<P: Package, V: Version> DependencyProvider<P, V> for OldestVersionsDependencyProvider<P, V> {
+    fn make_decision<T: std::borrow::Borrow<P>, U: std::borrow::Borrow<Range<V>>>(


What about naming this [pick/choose]_next_package?

Could have been possible. make_decision fits well with PubGrub terminology since this step is called "decision making"

That's true. The question is who do we optimize the easiness to understand for:

library authors

users

I think make_decision optimizes for us, while naming function based on what it does optimizes for users.

What if we use descriptive function name from the user perspective and put a link to "decision making" in the function docs?

Yeah I don't know, tough naming call. If you prefer explicitness from the user point of view, maybe choose_package_version_within is the most descriptive name? I don't mind if you go for it, but can you also update the guide then please (PR #45).

https://github.com/dart-lang/pub/blob/master/lib/src/solver/version_solver.dart#L320 😄

I went with choose_package_version; how does it look now?

I haven't yet read the guide as I'm focusing on PR reviews first. Will update once I get there 🙂

Eh2406 · 2020-11-15T04:39:59Z

Using the benchmark of solving all versions in elm-packages.ron, criterion thinks this PR makes a 19% improvement from 2.2sec -> 1.8sec.
Solving for all versions in the checked in large_case.ron criterion thinks this PR makes a 23% improvement.

aleksator · 2020-11-15T13:19:09Z

@Eh2406 That's a lot! Must be usage of iterators instead of allocating new vectors for list_available_versions that was done before?

@Eh2406 @mpizenberg
No blockers from me anymore, let's merge this if you agree with the naming changes in the latest commit discussed in #50 (comment).
Feel free to resolve and merge.

There is also an interesting discussion on future possible API changes, where @mpizenberg thought of 2 separate traits for customizability/ease of use: #50 (comment)
This is in no way a blocker for this PR, just something we can keep in mind and possibly implement in the future if we desire to do so. I thought it's pretty clever 😄

src/solver.rs

Eh2406 · 2020-11-15T14:32:38Z

Must be usage of iterators instead of allocating new vectors for list_available_versions that was done before?

I think so. Allocating a BTreeSet (in versions) and Vec (in list_available_versions) just to get a count for all available packages for each decision adds up.
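As an illustration of that cost, here is a minimal sketch contrasting the two styles on plain u32 versions and an inclusive min..=max bound. The function names are made up for this example, and the input is assumed to contain no duplicate versions (the set-based variant would deduplicate them).

```rust
use std::collections::BTreeSet;

// Allocating variant: builds a set and then a Vec only to read a length.
pub fn count_matching_allocating(versions: &[u32], min: u32, max: u32) -> usize {
    let all: BTreeSet<u32> = versions.iter().copied().collect();
    let matching: Vec<u32> = all
        .into_iter()
        .filter(|v| (min..=max).contains(v))
        .collect();
    matching.len()
}

// Iterator variant: same count, no intermediate collections.
pub fn count_matching_lazy(versions: &[u32], min: u32, max: u32) -> usize {
    versions
        .iter()
        .filter(|v| (min..=max).contains(*v))
        .count()
}
```

Since this count is computed for every candidate package at every decision, the allocations in the first variant are repeated many times over a full solve.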

@mpizenberg is better at naming than I, so I will leave final approval to him.

mpizenberg

Alright congrats all on finalizing that PR!

mpizenberg · 2020-11-15T20:41:55Z

@Eh2406 There is a conflict and I'm in the middle of some uncommitted things from the old dev branch for my analysis of elm packages. Could you do the merge? Don't hesitate to squash everything before the rebase, since that means only 1 conflicting commit.

Eh2406 · 2020-11-16T02:39:11Z

Did a squash and a rebase. I think nothing was lost in that, but just in case I will leave it for one of you to merge.

aleksator · 2020-11-16T11:30:39Z

There was a conflict because of my merged PR; fixed.

mpizenberg reviewed Oct 24, 2020

Eh2406 force-pushed the priorities branch from d49ddf2 to 5e0d57d Compare October 24, 2020 02:36

Eh2406 force-pushed the priorities branch 2 times, most recently from 8357220 to 8ae3d66 Compare October 24, 2020 16:06

Eh2406 force-pushed the priorities branch from 8ae3d66 to a5fbf06 Compare October 25, 2020 03:39

Eh2406 force-pushed the priorities branch from a5fbf06 to 1513bff Compare October 25, 2020 03:44

Eh2406 commented Oct 26, 2020

mpizenberg approved these changes Oct 26, 2020

Eh2406 force-pushed the priorities branch from 1513bff to ecaeb39 Compare October 26, 2020 21:42

aleksator reviewed Oct 28, 2020

aleksator reviewed Oct 28, 2020

Eh2406 mentioned this pull request Oct 29, 2020

"lockfiles" pubgrub-rs/advanced_dependency_providers#4

Open

Eh2406 force-pushed the priorities branch 2 times, most recently from 84409c1 to 3666886 Compare October 30, 2020 19:31

mpizenberg approved these changes Oct 30, 2020

Eh2406 force-pushed the priorities branch from 3666886 to 9c10fc1 Compare October 31, 2020 01:06

Eh2406 force-pushed the priorities branch 2 times, most recently from e852586 to eb3d094 Compare November 12, 2020 21:48

mpizenberg force-pushed the priorities branch 2 times, most recently from fbf8ff4 to 47c055f Compare November 13, 2020 22:59

mpizenberg approved these changes Nov 13, 2020

aleksator force-pushed the priorities branch 2 times, most recently from 2944c58 to 15a3bcd Compare November 14, 2020 20:25

aleksator requested changes Nov 14, 2020

Eh2406 force-pushed the priorities branch from 15a3bcd to aafb1d7 Compare November 15, 2020 04:12

aleksator approved these changes Nov 15, 2020

mpizenberg requested changes Nov 15, 2020

aleksator changed the title ~~let dependency provider priorities packages~~ feat!: let DependencyProvider prioritize dependencies Nov 15, 2020

mpizenberg approved these changes Nov 15, 2020

Eh2406 force-pushed the priorities branch from a19f546 to 5e34bbd Compare November 16, 2020 02:37

feat!: let DependencyProvider prioritize dependencies

6b6bbd6

aleksator force-pushed the priorities branch from 5e34bbd to 6b6bbd6 Compare November 16, 2020 11:27

aleksator merged commit 3d8fd9d into dev Nov 16, 2020

aleksator deleted the priorities branch November 16, 2020 11:32
