-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat!: let DependencyProvider prioritize dependencies #50
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally like the complete control given back to the dependency provider with make_decision
. The usage of the BTree is also a good idea to remove the hash
constraint. The trick with Borrow
to enforce a correct package is also really nice.
There are few oddities though in package docs. Names that do not make a lot of sense anymore like the "reverse" terminology for the dependency provider. And a few other things that I found weird at first read like the error when the version is in the term (should be the opposite?).
I force pushed to fix the commit names. I also addressed @mpizenberg's comments. I have 2 remaining concerns:
|
8357220
to
8ae3d66
Compare
Good point! Since potential packages are only picked if there is at least one positive derivation, the intersection is necessarily a positive term. Using a |
switched to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, with maybe some changes to some doc comments. Waiting on others reviews
What did you have in mind? Also:
|
Hopefully I'll be able to look at PRs tomorrow or the day after. |
fn make_decision<T: Borrow<P>, U: Borrow<Range<V>>>( | ||
&self, | ||
packages: impl Iterator<Item = (T, U)>, | ||
) -> Result<(T, Option<V>), Box<dyn Error>>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose to provide a default implementation for this method with the heuristic we were using previously.
This way we still give control to the users who want it, but provide the option to leave the way we pick packages as the library suggests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to keep the list_available_versions
? Because without that I don't know how to build a default implementation. It was removed as we otherwise don't need it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should keep list_available_versions
in the interface. Maybe we could have a free function (or collection of free functions) that could help. Something like pick_first_strategy
, pick_lowest_number_of_versions_strategy
etc. The names are terrible, but that's just for the sake of the example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was sceptical, but gave it a try. And it worked out really well! Take a look and let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to keep the list_available_versions?
I don't think we should keep list_available_versions in the interface.
Why not? What if our interface is 3 functions:
make_decision()
list_available_versions()
get_dependencies()
The first one has default implementation in terms on list_available_versions
, no need to change it right away for the algorithm to work but the ability to customize it is there.
The idea is somewhat similar to what iterators to with the next()
function. You define it and get some others for free. Here we would implement 1) in terms of 2).
The only not so pretty thing about that is if the user comes up with a way to redefine make_decision
without using list_available_versions
, but the API still requires it. In that case they could leave it unimplemented!
. Is it an API smell? Hmmm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think "avoid the breaking change" justifies that smell given our user base
My motivation is not avoiding breaking changes, but rather the easiness of API usage.
Before this PR it was crystal clear what the algorithm needed from the user to function (from the user's perspective):
- A way to get dependencies for a given package version.
- A way to get all versions for the package.
All makes sense why it is needed and how to implement it.
Now:
- Unchanged.
- Hm, actually you need to implement part of our algorithm to use our library.
How would people do that? What if they don't want to understand how exactly we do the decision making?
The only practical solution is to copy paste the code from OfflineDependencyProvider
.
I'm afraid that requiring every user to do that is too much of an ergonomic hit.
Does anyone else share my concerns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get your point about ease of use. However, I think people not wanting to use OfflineDependencyProvider
and willing to define their own dependency provider are willing to read the implementation of an 11-lines function doing nothing more than a function call to a helper function. Also there is an even simpler 6-lines example implementation in the guide on the section regarding custom implementation of a dependency provider.
If we get more users, and some of them point make_decision
as being difficult to implement, I think I'd rather have a two different traits with different ease/control compromises (a bit like the sandbox/element/document/application variants of an elm app) than having a single trait with an ambiguous API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we get more users, and some of them point make_decision as being difficult to implement, I think I'd rather have a two different traits with different ease/control compromises (a bit like the sandbox/element/document/application variants of an elm app) than having a single trait with an ambiguous API.
Actually that's what my girlfriend suggested as well! 😆 I was pretty impressed because she's sophomore (2nd year in the uni) and is doing that as a way to change her profession, meaning she has very little experience in programming yet.
In any case, doing that is out of scope of this PR and needs a separate discussion; I'm totally okay merging this version!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm glad we had this discussion as this is an interesting design problem I haven't had to think of before, thank you @Eh2406 and @mpizenberg.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you have a wise girlfriend @aleksator ahah, keep her! XD
On the other hand, we don't iterate over not intersected derivations every time. I run the benchmarks a few times before submitting the change. |
When I ran the benchmarks from #34, criterion was fairly sure there is a small regression comparing ecaeb39 with acc391e. It is made smaller if we eagerly do the intersections and go ahead and do away with |
I've restored "if conditions" and still simplified the code a little with an early return. |
84409c1
to
3666886
Compare
e852586
to
eb3d094
Compare
The last commit makes the code cleaner by using |
fbf8ff4
to
47c055f
Compare
Dose |
Just after that comment I realized the function exists XD and is called |
I've reviewed the PR so far, great work introducing the ability to customize the lib's behavior. The one thing I wanted to discuss about the API change is a default implementation for |
2944c58
to
15a3bcd
Compare
tests/proptest.rs
Outdated
struct OldestVersionsDependencyProvider<P: Package, V: Version>(OfflineDependencyProvider<P, V>); | ||
|
||
impl<P: Package, V: Version> DependencyProvider<P, V> for OldestVersionsDependencyProvider<P, V> { | ||
fn make_decision<T: std::borrow::Borrow<P>, U: std::borrow::Borrow<Range<V>>>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about naming this [pick/choose]_next_package
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could have been possible. make_decision
fits well with PubGrub terminology since this step is called "decision making"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true. The question is who do we optimize the easiness to understand for:
- library authors
- users
I think make_decision
optimizes for us, while naming function based on what it does optimizes for users.
What if we use descriptive function name from the user perspective and put a link to "decision making" in the function docs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I don't know, tough naming call. If you prefer explicitness from the user point of view, maybe choose_package_version_within
is the most descriptive name? I don't mind if you go for it, but can you also update the guide then please (PR #45).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went with choose_package_version
; how does it look now?
I haven't yet read the guide as I'm focusing on PR reviews first. Will update once I get there 🙂
Using the benchmark of solving all versions in elm-packages.ron, criterion thinks this PR makes a 19% improvement from 2.2sec -> 1.8sec. |
@Eh2406 That's a lot! Must be usage of iterators instead of allocating new vectors for @Eh2406 @mpizenberg There is also an interesting discussion on future possible API changes, where @mpizenberg thought of 2 separate traits for customizability/ease of use: #50 (comment) |
I think so. Allocating a BTreeSet (in versions) and Vec (in list_available_versions) just to get a count for all available packages for each decision adds up. @mpizenberg is better at naming than I, so I will leave final approval to him. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright congrats all on finalizing that PR!
@Eh2406 There is a conflict and I'm in the middle of some uncommitted things from the old |
Did a squash and a rebase. I think nothing was lost in that, but just in case I will leave it for one of you to merge. |
5e34bbd
to
6b6bbd6
Compare
There was a conflict because of my merged PR; fixed. |
This follows up on #40.
The first commit explores
pick_package
approach as described in #40 (comment).The second commit explores the
make_decision
approach.The third commit uses a BTreeMap to remove Hash req on
OfflineDependencyProvider
Beadback is definitely appreciated.
Edit: somewhat accidentally closes #18