Fix 841 Part-2: C-Tree deepening #849
Conversation
*(branch force-pushed from c28290f to 808b32d)*
@andsel, before investigating further I'd like your opinion on this direction of artificial tree-deepening... The open issue is the list of subscriptions inside one node: when many clients subscribe to the same topic, the same problem exists with a list that becomes too large. That would be trickier to solve, probably by building a tree based on the client-id?
@hylkevds I'll review all the information, but I'm a little bit slow.
First of all, thanks a lot for your great investigation into very wide trees.
I think this is the right path to investigate: it's artificial, but it tests the tree operations while varying one of the dimensions (the arity of the nodes, the branching factor).
Given that the problem is the copy, what if we use the principle (immutable structures) that CTries embody?
It would help when we have a topic with many subscribers and a new child-topic is added to that topic. On the other hand, when returning the list of subscriptions from a node, a copy is also made! This could be improved with the CoW list, since we could return (a read-only wrapper of) the list itself instead of a copy. Quite a few aspects to this issue...
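A minimal sketch of that idea, with a hypothetical `Subscription` stand-in (not the actual moquette code): readers get an unmodifiable view of the CopyOnWrite list instead of a fresh copy.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class NodeWithSharedReads {
    record Subscription(String clientId, int qos) {} // hypothetical stand-in

    // Writers pay the copy cost on add/remove; the backing array never mutates in place.
    private final List<Subscription> subscriptions = new CopyOnWriteArrayList<>();

    void add(Subscription sub) {
        subscriptions.add(sub);
    }

    // Instead of returning new ArrayList<>(subscriptions) (a full copy per read),
    // hand out an unmodifiable view of the list itself.
    List<Subscription> view() {
        return Collections.unmodifiableList(subscriptions);
    }
}
```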
Right, correct: the copy step on the CopyOnWrite list is as heavy as the actual copy.
Oh, that looks really interesting! That should be really simple to drop into the current code.
My first tests have some really interesting results. I've put them in a spreadsheet: the first two sheets are with many clients on few topics. This tests the
The third and fourth sheets test the CTree with a deep set of subscriptions. For this case there is little difference between the two, which is expected, since the CTree implementation is pretty optimised for deep trees. But performance also doesn't degrade! The last two sheets test the CTree with a really shallow and wide tree. Here performance is very similar, but slightly better with PCollections. The difference becomes very clear when not artificially deepening the tree much: without PCollections, performance absolutely plummets when nodes get more than 9999 children. With PCollections, performance still degrades, but much more gently, and only after nodes get more than 99999 children.
@hylkevds could you specify what is in column A? From a first sight of the results:
If I'm not wrong, does the plain list perform equal to or better than the persistent collection?
Column A is the maximum token length. When Topic.java cuts a topic into tokens it normally does this on the `/` separator.
Shorter tokens mean a deeper tree; longer tokens, a wider tree. A maximum token length of 1 makes many nodes with only 1 child, hence the bad performance.
No, lower is better. The numbers are how long it took to add 50_000 subscriptions, in milliseconds.
Yes, this surprises me. Maybe the test hits a worst-case scenario for the tree implementation of the persistent collections...
Indeed, since it essentially copies the entire list of clients into an array. We need to see whether we can avoid this copy operation: the persistent collections are persistent, so they should not need to be copied.
The Persistent Collections are much better when adding, by a factor of 6.
An important detail is that the performance of the PCollections doesn't worsen with increasing thread count, as the array implementation's does.
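For illustration, a minimal sketch of the persistent-collection principle using the PCollections library (`org.pcollections`); the `Subscription` record is a hypothetical stand-in:

```java
import org.pcollections.PVector;
import org.pcollections.TreePVector;

public class PersistentAddDemo {
    record Subscription(String clientId, String topicFilter) {} // hypothetical stand-in

    public static void main(String[] args) {
        PVector<Subscription> v0 = TreePVector.empty();
        // plus() returns a NEW vector; v0 is untouched and needs no defensive copy.
        PVector<Subscription> v1 = v0.plus(new Subscription("client-1", "sensors/+/temp"));
        PVector<Subscription> v2 = v1.plus(new Subscription("client-2", "sensors/#"));
        // Old versions remain valid snapshots: readers holding v1 never see v2's change.
        System.out.println(v1.size() + " vs " + v2.size()); // prints "1 vs 2"
    }
}
```

Because each version shares most of its structure with the previous one, adding an element does not require copying the whole collection, which is exactly the property the array-backed nodes lack.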
I found the issue with the remove speed: That brings the remove speed up to add-speed levels: ~20-30 ms for 20k subscriptions. Now to see if the read can be improved, and then go over the shared subscriptions.
*(branch force-pushed from ca474d3 to 0642876)*
@andsel, a question: a big performance bottleneck is the method selectSubscriptionsWithHigherQoSForEachSession(List subscriptions). It turns a potentially massive list of subscriptions into a map, and then back into a list, in the hope that this reduces the number of messages to be sent out...
This method is also the only reason a copy of the subscription list must be made. If we change the behaviour to send a message per matching topic, we can pass the original subscription lists around without ever making a copy. Even the selection of which client to message, in the case of shared subscriptions, could be done "late", without making a copy of the subscription list or map. What are your thoughts on this?
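For context, a minimal sketch of the map-then-list de-duplication described above (the `Subscription` stand-in and method body are illustrative, not the actual moquette source):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class QoSSelection {
    record Subscription(String clientId, String topicFilter, int qos) {} // hypothetical stand-in

    // List -> Map -> List: for each client session, keep only the highest-QoS match.
    // This is the step that forces materializing (copying) the whole subscription list.
    static List<Subscription> selectHighestQoSPerSession(List<Subscription> matches) {
        Map<String, Subscription> bestPerClient = new HashMap<>();
        for (Subscription sub : matches) {
            // merge() receives (existing, candidate) and keeps the higher-QoS one.
            bestPerClient.merge(sub.clientId(), sub,
                    (existing, candidate) -> existing.qos() >= candidate.qos() ? existing : candidate);
        }
        return new ArrayList<>(bestPerClient.values());
    }
}
```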
Hi @hylkevds,
at the time I opted for the option that moves less traffic, so selecting only the highest-QoS subscription for each client session. I think we could loosen this requirement, but we have to understand how frequently such an overlap actually happens. Or it could be made configurable with a setting. I can imagine a user of the broker that, up to the version before this change, always received (in case of overlapping subscriptions for the same session) only the highest-QoS publish, but after this change receives multiple publishes for the same topic at different QoS levels.
In our use cases we never have overlapping topics, since the OGC SensorThings API doesn't allow wildcards. So I think I'm not the right person to ask :D. But you are absolutely correct in saying that changing behaviour between versions is not good. Before going into all that, though, I first did some tests to see if it is even relevant. So I made a quick prototype and am working on the performance analysis. I hope to finish that soon!
To test if the duplicate check makes a difference, I did some tests. Instead of the CTree returning a
I fixed the remove speed by changing the subscription collections to a Map based on the ClientId. Since a Client can only have 1 normal subscription per topic, putting them in a Map makes the most sense. After that I re-ran the performance tests and updated the GSheet.

The Tests

There are three test setups; for each setup it gathers numbers for creating, reading and removing subscriptions:
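A minimal sketch of the Map-based layout described above, with a hypothetical `Subscription` stand-in: removal becomes a hash lookup instead of a linear scan plus array copy.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class NodeSubscriptions {
    record Subscription(String clientId, int qos) {} // hypothetical stand-in

    // One normal subscription per client per topic, so ClientId is a natural key.
    private final Map<String, Subscription> byClientId = new ConcurrentHashMap<>();

    void add(Subscription sub) {
        // Replaces an older subscription of the same client, matching MQTT semantics.
        byClientId.put(sub.clientId(), sub);
    }

    void remove(String clientId) {
        byClientId.remove(clientId); // O(1) instead of scanning a list
    }
}
```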
As far as I can understand from the charts and the description, I would summarize as follows:
Does your test also consider the case of multiple matches for the same client and overlapping subscriptions?
The "Update" tests no longer de-duplicate messages for clients, so clients with multiple matching subscriptions get multiple messages. I've not yet added an option to turn that back on. In these tests, that would mainly influence the "1 Topic" tests, since that is the only one with multiple subscriptions per topic. I also wonder if it might help to cache the subscriptions for a topic in a List, to speed up subsequent iterations over that subscriptions list, until a subscription is added or removed again. |
I would avoid adding too much complexity if it's not needed. Maybe that would help for an ultimate performance gain.
*(branch force-pushed from e913366 to b136fc5)*
This is a prototype / proof-of-concept, not ready for merging!
Issue #841 is about C-Tree performance. This prototype demonstrates both one of the issues and a possible solution to it.
The C-Tree stores the children of a node in an array. Because the tree is lock-free, it must make a copy of a node to add a child to it, and this involves copying the array of child-nodes. For flat trees, where nodes have very many child-nodes (more than 1k), this is exceptionally slow. This slowness makes the lock-free approach worse: the longer the copy takes, the more likely it is that another thread has modified the node in the meantime, and thus the more likely it is that the copy has to be re-done.
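The retry dynamic can be illustrated with a minimal lock-free sketch (not the actual C-Tree code): the wider the child array, the longer the copy window, and the more likely a concurrent writer forces a redo.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicReference;

class WideNode {
    private final AtomicReference<String[]> children = new AtomicReference<>(new String[0]);

    void addChild(String child) {
        while (true) {
            String[] current = children.get();
            // O(n) copy of the whole child array; with >1k children this window
            // grows, and concurrent writers invalidate it more often.
            String[] next = Arrays.copyOf(current, current.length + 1);
            next[current.length] = child;
            if (children.compareAndSet(current, next)) {
                return; // success: no other thread replaced the array meanwhile
            }
            // Another thread won the race: the copy is wasted, retry from scratch.
        }
    }
}
```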
This PR artificially deepens the tree, to avoid having nodes with too many child-nodes. It does this by chopping long tokens into shorter tokens. The maximum token length is currently hard-coded in a constant, but should be configurable for a real implementation.
Testing with numeric tokens means that a 1-character maximum token length results in at most 10 child-nodes per node (0-9), while a 4-character maximum token length results in up to 10000 child-nodes per node (0-9999).
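A minimal sketch of the chopping step (the constant name and helper are illustrative, not the PR's exact code):

```java
import java.util.ArrayList;
import java.util.List;

class TokenChopper {
    // Hard-coded here as in the prototype; should be configurable in a real implementation.
    static final int MAX_TOKEN_LENGTH = 4;

    // "123456789" -> ["1234", "5678", "9"]: one long token becomes three tree levels,
    // capping the branching factor at 10^4 for numeric tokens.
    static List<String> chop(String token) {
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < token.length(); i += MAX_TOKEN_LENGTH) {
            parts.add(token.substring(i, Math.min(token.length(), i + MAX_TOKEN_LENGTH)));
        }
        return parts;
    }
}
```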
Some initial testing on a single-threaded insert:
| Max length | Time for 500k subs | Subs/second |
|-----------:|-------------------:|------------:|
| 1 | 3777 ms | 132380/s |
| 1 | 3928 ms | 127291/s |
| 1 | 3441 ms | 145307/s |
| 2 | 1592 ms | 314070/s |
| 2 | 1585 ms | 315457/s |
| 2 | 1688 ms | 296209/s |
| 3 | 1788 ms | 279642/s |
| 3 | 1553 ms | 321958/s |
| 3 | 1560 ms | 320513/s |
| 4 | 1047 ms | 477555/s |
| 4 | 888 ms | 563063/s |
| 4 | 964 ms | 518672/s |
| 5 | 6039 ms | 82795/s |
| 5 | 6800 ms | 73529/s |
| 5 | 6849 ms | 73003/s |
| 6 | 195648 ms | 2556/s |
| 6 | 201065 ms | 2487/s |
| 6 | 200722 ms | 2491/s |
From the initial numbers it is clear that a maximum token length of 4 performs best in this test.
So there clearly is an optimum. Where this optimum lies probably depends very much on the topic structure and the number of cores used.