Additional testcase that shows some unexpected behavior of the resolver #469
One way to make this finally resolve is to replace the file
Then it still takes some time but resolves in the first round (though it still feels too slow), and now the chart looks like this:
I have now run some experiments and came up with the
This by itself does not solve the problem and may not be complete, but I still think it is worth sharing here for discussion; it currently strikes out, for example, some of the
I did this now and it strikes out one more candidate of a substitution package ( |
The next problem is this kind of violation:
As one can see, it looks like in the substitution step only one of the packages is substituted in a permutation, but the uses constraint is not taken into account.
After striking out substitution packages, a lot of use-constraint violations still happen, making the resolution explode:
The elk layered bundle starts with these alternatives:
while xbase.lib has no alternatives at all but uses a reexport of guava:
Looking at this, one can see that any bundle that requires
Currently candidates of a substitution package might be permuted even though they are mandatory for others (e.g. they import a package that can only be fulfilled by this one capability). This now adds a ProblemReduction class that tries to figure out some conflicting options and remove them from the set of items to consider.
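The idea described above can be illustrated with a minimal sketch. It is not the actual ProblemReduction class from this PR: requirements and capabilities are modelled as plain strings, and the names `ProblemReductionSketch`, `mandatoryCapabilities` and `pinMandatoryExports` are hypothetical. It also assumes that a bundle's own export appears as a candidate for its own substitution import, so that keeping only that candidate means "do not substitute".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model; not the Felix resolver API.
class ProblemReductionSketch {

    /** Capabilities that are the only candidate of at least one requirement. */
    static Set<String> mandatoryCapabilities(Map<String, List<String>> candidates) {
        Set<String> mandatory = new HashSet<>();
        for (List<String> options : candidates.values()) {
            if (options.size() == 1) {
                mandatory.add(options.get(0));
            }
        }
        return mandatory;
    }

    /**
     * For every substitutable export that some other requirement depends on
     * exclusively, strike out all alternatives of the exporting bundle's own
     * import, so the export can no longer be substituted away.
     * The map goes from the own export to the bundle's import of that package.
     */
    static void pinMandatoryExports(Map<String, List<String>> candidates,
                                    Map<String, String> substitutionImportOf) {
        Set<String> mandatory = mandatoryCapabilities(candidates);
        for (Map.Entry<String, String> entry : substitutionImportOf.entrySet()) {
            String ownExport = entry.getKey();
            List<String> options = candidates.get(entry.getValue());
            if (mandatory.contains(ownExport) && options != null && options.contains(ownExport)) {
                options.retainAll(Collections.singleton(ownExport));
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> candidates = new HashMap<>();
        candidates.put("P imports A", new ArrayList<>(Arrays.asList("V1.0:A")));
        candidates.put("V1.0 imports A", new ArrayList<>(Arrays.asList("V1.5:A", "V1.0:A")));
        Map<String, String> substitutionImportOf = new HashMap<>();
        substitutionImportOf.put("V1.0:A", "V1.0 imports A");

        pinMandatoryExports(candidates, substitutionImportOf);
        System.out.println(candidates.get("V1.0 imports A"));
    }
}
```

In this toy input, `P` can only be wired to the export of `V1.0`, so the alternative from `V1.5` is removed from the options of `V1.0`'s own import before any permutation is tried.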
After filtering the reexport inconsistency, we now have this use-constraint violation:
Looking at the initial state, the devhelp itself has no alternatives:
So in this case
This will probably be challenging to solve, but we have now finally reached the root error that is reported in this test case (
equinox/bundles/org.eclipse.osgi/felix/src/org/apache/felix/resolver/ResolverImpl.java, lines 1532 to 1540 in 906f6d6
because the resolver cannot really "blame" the system bundle (
Here is a chart of the current state; as one can see, at least the usesPermutations have already dropped from the initial threshold of about 1000 to about 500 now:
Just for testing, I added a preliminary check: if the "system bundle" is a provider for a package, all other providers are struck out as alternatives for it. With that, the mentioned line is never hit and the resolver completes successfully after 13 seconds and 272 iterations with this search graph: as one can see, uses permutations now drop below 100 (import permutations are still growing).
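A minimal sketch of such a preliminary check could look like the following. It is only an illustration under simplifying assumptions, not the code used in the experiment: the `Provider` type, the `systemBundle` flag, the bundle names and the method name are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model; not the Felix resolver API.
class SystemBundlePreferenceSketch {

    /** A provider of a package, flagged if it is the system bundle. */
    static final class Provider {
        final String bundle;
        final boolean systemBundle;

        Provider(String bundle, boolean systemBundle) {
            this.bundle = bundle;
            this.systemBundle = systemBundle;
        }
    }

    /**
     * If the system bundle is among the providers of a package, keep only it;
     * otherwise leave the alternatives untouched.
     */
    static List<Provider> strikeOutNonSystemProviders(List<Provider> providers) {
        List<Provider> fromSystem = new ArrayList<>();
        for (Provider p : providers) {
            if (p.systemBundle) {
                fromSystem.add(p);
            }
        }
        return fromSystem.isEmpty() ? providers : fromSystem;
    }

    public static void main(String[] args) {
        List<Provider> providers = Arrays.asList(
                new Provider("some.other.bundle", false),
                new Provider("system.bundle", true));
        for (Provider p : strikeOutNonSystemProviders(providers)) {
            System.out.println(p.bundle);
        }
    }
}
```

The design choice here is to reduce the candidate list before the search starts, so the resolver never has to permute providers it cannot meaningfully "blame".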
This is not meant to be merged in its current form.
The added test case demonstrates some unexpected resolver behavior that we have already seen in past Eclipse SimRel releases and that I regularly see with custom installations; m2e also suffers from long-running resolve operations when starting tests from the IDE.
To summarize, the two things that hurt the most here boil down to the following:
Undesired substitutions
- A bundle `V1.0` has a substitution package `A`: it imports it with a consumer range (e.g. `[1,2)`) and exports it with the specific version `1.0`.
- A bundle `V1.5` exports the package `A` with version `1.5`.
- A bundle `C` imports `A` with range `[1,2)`.
- A bundle `P` imports `A` with range `[1,1.5)`.
Under some circumstances the package of bundle `V1.0` gets substituted with the one from `V1.5`, resulting in `P` not being resolved because its package import can no longer be satisfied.
What is expected: the resolver does not substitute the package for `V1.0`, because that makes it impossible for `P` to resolve.
FYI @tjwatson: in the bundle test set I can provide to you, this is the `org.eclipse.lsp4j.jsonrpc` bundle that gets its packages substituted.
Undesired binding to a higher package version causing use-constraint violations
- A bundle `A` imports a package with a consumer range (e.g. `[1,2)`).
- A bundle `B` imports a package with a narrow range (e.g. `[1,1.5)`).
- `B` requires `A`.
Under some circumstances `A` is bound to a higher version (`1.8`) and `B` fails to resolve because of a use-constraint violation: `B` is exposed to the same package in different versions. The bundle `A` is actually not used by any other bundle at all.
What is expected: the resolver does not bind `A` to a higher version if a consumer of `A` already restricts it to a lower version, because doing so exposes any consumer of `A` to use-constraint violations.
@tjwatson: in the bundle test set I can provide to you, this is the `org.eclipse.sprotty.server` bundle, whose import `com.google.gson;version="[2.8,3)"` gets bound to a higher gson version than the one used by the bundles that require `org.eclipse.sprotty.server`.
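The first scenario (undesired substitution) can be checked by hand with a small, self-contained sketch. It is not resolver code: versions are reduced to major.minor, ranges are half-open as in `[1,2)`, and all class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model; not the OSGi Version/VersionRange API.
class SubstitutionSketch {

    /** A package version reduced to major.minor for illustration. */
    static final class Version implements Comparable<Version> {
        final int major;
        final int minor;

        Version(int major, int minor) {
            this.major = major;
            this.minor = minor;
        }

        @Override
        public int compareTo(Version other) {
            return major != other.major
                    ? Integer.compare(major, other.major)
                    : Integer.compare(minor, other.minor);
        }
    }

    /** Half-open import range [floor, ceiling). */
    static final class Range {
        final Version floor;
        final Version ceiling;

        Range(Version floor, Version ceiling) {
            this.floor = floor;
            this.ceiling = ceiling;
        }

        boolean includes(Version v) {
            return floor.compareTo(v) <= 0 && v.compareTo(ceiling) < 0;
        }
    }

    /** True if every import range accepts at least one version that is still exported. */
    static boolean allImportersResolvable(List<Range> importers, List<Version> exported) {
        for (Range range : importers) {
            boolean satisfied = false;
            for (Version v : exported) {
                if (range.includes(v)) {
                    satisfied = true;
                    break;
                }
            }
            if (!satisfied) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Version v10 = new Version(1, 0);
        Version v15 = new Version(1, 5);
        Range consumer = new Range(new Version(1, 0), new Version(2, 0)); // C: [1,2)
        Range narrow = new Range(new Version(1, 0), new Version(1, 5));   // P: [1,1.5)
        List<Range> importers = Arrays.asList(consumer, narrow);

        // V1.0 keeps its own export: versions 1.0 and 1.5 are both available.
        System.out.println(allImportersResolvable(importers, Arrays.asList(v10, v15)));
        // V1.0 is substituted by V1.5: only version 1.5 remains, so P cannot resolve.
        System.out.println(allImportersResolvable(importers, Arrays.asList(v15)));
    }
}
```

The second scenario follows the same pattern: binding the consumer range `[1,2)` to `1.8` is valid in isolation, but `1.8` lies outside the narrow range `[1,1.5)` of the requiring bundle.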
What we have done in the past
In SimRel and similar situations, people (mostly @merks) are currently forced to watch and carefully examine the dependencies, asking projects to update them to "fix" such issues. While this takes a lot of effort, it has worked in the past, but it puts a lot of stress on the release process.
For plugin providers and product vendors, this makes it hard to update their products or plugins while still maintaining a certain degree of backward compatibility with older Eclipse releases or with users who are at the bleeding edge. The main problem is that in almost all cases it is impossible to provide a "reproducing testcase" for the following reasons:
Why this is undesired behavior
The OSGi resolver is the entity that should solve complex dependency problems; it should not be the developer or release manager who needs to find out how best to resolve the requirements.
Even worse, in the case I have analyzed in depth here, I can actually make the state resolve by performing some manual steps. These steps are rather trivial operations (once one gets the idea), but the resolver burns a lot of CPU cycles without finding a suitable solution and simply gives up after a timeout; there are even situations where it seems to run forever (more than 30 minutes until I killed the process).
This can be especially frustrating if one looks at the "unsatisfied" requirements and sees that they are actually there, or sees use-constraint violations but can also "see" the solution that must be chosen. A computer should be even better at spotting such things.
The "algorithm" to fix the resolve problem is implemented here:
but it has the drawback that it takes a considerable amount of time, and there is no guarantee that it always fixes the problem, since all it can do is ask the resolver to try a little harder on something that seems to be part of the problem.
@tjwatson I'll contact you directly about how you can get the bundle test set, in the hope that it helps. The test case here resembles the problem quite well: the first round times out after two minutes, then some bundles are selected for refresh; this times out again, but with a different set of bundles, and repeating this leads, after two more refreshes, to a resolved state. It does not even seem to matter much how high the timeout is, so basically a lot of time is spent here without making any progress.