-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lots of contention on org.eclipse.emf.common.util.Pool with many threads adding entries to the pool #50
Comments
Hi @merks, I would like to solve the described problem. Does the proposed approach seems reasonable to you? I am aware that the change will be scary, but not as much as the one in the PR I submit a moment ago by mistake, that one was just the state of the playground I am using for investigation :) |
How are you measuring? I don't know what you mean by different parts of the pool and I don't know what lock stripping means means concretely in this context. A Pool is a Set with contents that must unique. So, for example, org.eclipse.emf.common.util.URI.POOL must manage unique URIs. URI does not override .equals, relying on .equals being the same as ==. I just don't see how you are going to have a pool with parts. And of course you can't look at a framework and determine what parts are being used downstream; definitely there can be be no breaking changes. |
Hi Ed, we are measuring by running the application in a system with plenty resources (96 CPUs, lots of memory, etc.) and observing that the throughput does not behave well by increasing the number of threads. Our JFRs tells as that there is significant contention on the EMF pool, which makes sense to us because our threads create many URIs. To prove ourselves that we can improve performance we will not just look at JFRs, we will mainly measure if the performance improves or not. To lock the pool only partially by splitting the lock on the pool, my first approach is to use the the hashCode modulo number of locks (which is equal to the desired concurrencyLevel). I have a draft PR that I will publish now. I think that if done properly, it should not be a breaking change. Lock stripping, in my understanding is an extra step on lock split, by actually breaking the pool into different partitions and the having one lock on each partition, and then use for example hashCode modulo number of partitions to assign an entry to a partition. |
Coincidentally this morning in a jstack capture I did see 5 threads here:
This is also in an application with many threads and very enthusiastic about creating vast quantities of URIs. |
I can relate to that application :) |
We'll have to see how this progresses, but anything that's not private is API and EMF never breaks API... |
do you see anything in the current PR that breaks the API? If so, I have missed it. |
I think I'm really going to need to invest some significant time to determine if it's possible to achieve these results without removing or changing the types of protected members in classes that can be used as base classes by consumers... In the worst case, one might use copy and paste to create a new Pool class that's used internally and perhaps it make none of that visible to clients. I'm just not sure what's best and what's possible. Are you able to share any numbers that would motivate why invest my (personal) time would be of significant benefit? |
I agree that it is not easy to review. I plan to fix the existing race condition first, then I will do the measurements and present the results. I think it only make sense to invest significant time in the review afterwards. I have published the work I am doing to give a heads-up and get a first feeback (which I have now, thanks). |
Very good. I appreciate your investment of time. |
BTW, perhaps a less invasive (and non-binary compatible) approach would be to avoid trying to make Pool itself do lock splitting but rather to look at the derived Pool classes and providing two alternative implementations for them. E.g., if I look at URIPool, it's possible to factor out a base class and use that base class as the type of the POOL because only a small set of methods are called and these are not the methods of the Pool base class: So there is a small set of methods that could do the lock splitting perhaps simply delegating onto a set of existing URIPool instances. Probably one would want/need a base interface, not a base class. One could reuse the access units to get the hash code needed to delegate to the appropriate "subpool", and probably do some refactoring to avoid creating access units twice when delegating. The point is that all these subclasses are internal to the URI implementation and that one does not have to worry about how to make iteration, size, any number of other operations on a generate Pool be thread safe. Also, by simply reusing the existing class by default, there is less risk that the attempt to improve performance causes unexpected new problems... Just a thought... |
I'll record the base class here (which should probably be an interface in the end):
|
We have a multi-threaded application that shows a lot of contention on org.eclipse.emf.common.util.Pool when entries to the pool are added due to the fact that the pool has been made thread-safe using a single lock. Note: we have at least 64 threads in our application.
We believe that the most reasonable approach is to implement lock striping to solve the problem. The implementation would use a system property to get the desired concurrency level and the would use lock stripping accordingly. To be as backward compatible as possible, if the property is not set, the pool would still use a single lock.
So far I think the implementation is possible, but there is a problem with the methods Pool#getReadLock and Pool#getWriteLock, which obviously cannot work if there are several locks for different parts of the pool. So far I did not found any consumers, so I think the best approach would be to mark them as deprecated and thrown an UnsupportedOperation exception when concurrency level > 1.
The text was updated successfully, but these errors were encountered: