CollectScenes : Multithread hashSet() and computeSet() #4967
Conversation
The one thing I don't like about this is that the hash now depends on how tbb chooses to split the range ... is there something that gives us confidence that this will never change? Do we actually need to do this in an ordered way? The fact that we use a non-deterministic parallel reduce for the compute suggests that an element is probably uniquely identified by the value of ...
The docs here provide enough confidence for me : Is there anything there that concerns you?
We don't need to - the sum-of-hashes approach would work here, and I did consider it. But since we know our ...
Note to self : before merging this, you need to consider the implications of the TaskCollaboration fixes that are in progress. I suspect there is a special case that needs to be handled when two CollectScenes are chained and both have an empty set of rootNames.
Oh, sorry, I wasn't thinking too clearly, and somehow thought deterministic_reduce meant ordered, whereas of course both versions are effectively ordered, and deterministic means ... deterministic. This should be fine. I don't think there would be anything wrong with the adding approach either. (As a brief aside, since order does not in fact matter, I don't think we're actually getting "stronger mixing of the hashes" in any sense that matters. This is a bit tricky to prove; I think it can be proven for an ideal hash function, but obviously our hash function is not ideal ... could there exist a hash function which is strong enough when not adding together, but somehow breaks down when adding? My guess is no, but that would be quite a CS paper to prove. But anyway, we assume in other places that summing hashes is fine.) But I don't think there's any reason to change this code, other than your note about considering the implications of the TaskCollaboration stuff.
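For reference, here is a minimal standalone sketch (not the Gaffer code) of the two combining strategies being discussed : an order-dependent mix made reproducible by `tbb::parallel_deterministic_reduce`, versus an order-independent sum of per-element hashes that is happy with a plain `tbb::parallel_reduce`. `elementHash()` and the mixing constant are placeholders, not anything taken from the PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

#include <tbb/blocked_range.h>
#include <tbb/parallel_reduce.h>

// Stand-in per-element hash; the real node hashes rather more state than this.
static uint64_t elementHash( const std::string &name )
{
	return std::hash<std::string>{}( name );
}

// Order-dependent mixing, made reproducible by parallel_deterministic_reduce :
// the partitioning and join order are fixed for a given range and grain size,
// so the same input always yields the same hash, run after run.
uint64_t deterministicHash( const std::vector<std::string> &names )
{
	auto mix = []( uint64_t a, uint64_t b ) {
		return a * 1099511628211ULL + b; // arbitrary order-dependent combine
	};
	return tbb::parallel_deterministic_reduce(
		tbb::blocked_range<size_t>( 0, names.size() ),
		uint64_t( 0 ),
		[&]( const tbb::blocked_range<size_t> &r, uint64_t h ) {
			for( size_t i = r.begin(); i != r.end(); ++i )
			{
				h = mix( h, elementHash( names[i] ) );
			}
			return h;
		},
		mix
	);
}

// Order-independent alternative : summing per-element hashes gives the same
// result no matter how the range is split, so a plain parallel_reduce is fine.
uint64_t summedHash( const std::vector<std::string> &names )
{
	return tbb::parallel_reduce(
		tbb::blocked_range<size_t>( 0, names.size() ),
		uint64_t( 0 ),
		[&]( const tbb::blocked_range<size_t> &r, uint64_t h ) {
			for( size_t i = r.begin(); i != r.end(); ++i )
			{
				h += elementHash( names[i] );
			}
			return h;
		},
		std::plus<uint64_t>()
	);
}
```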
Force-pushed from 4557213 to 2982f2e
Since I wrote this we merged a superior fix for the TaskCollaboration-hash-aliasing problem, so I think this PR is good to go. I've rebased ready for merging to ...
This gives a 3x speedup in `CollectScenesTest.testSetPerformance`. Perhaps more interestingly, it also gives almost a 2x speedup in `SetQueryTest.testScaling()`, but almost half of that speedup is due to the change in hash cache policy _alone_. In `testScaling()`, the same set is required by every location in the scene, but we are visiting enough locations that the hash cache is under significant pressure. By moving `out.set` to the TaskCollaboration policy, the set hash is stored in the shared central cache, from which it is exceedingly unlikely to be evicted (because per-location hashes are not stored in the global cache).
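For readers unfamiliar with the cache policy mechanism mentioned above, here is a hedged sketch of the general pattern, assuming Gaffer's `ComputeNode::hashCachePolicy()` override point and the `ValuePlug::CachePolicy::TaskCollaboration` value. The class name and the `exampleSet` plug are made up for illustration; this is not the CollectScenes implementation.

```cpp
#include "Gaffer/ComputeNode.h"
#include "Gaffer/NumericPlug.h"
#include "Gaffer/ValuePlug.h"

// Illustrative node with a single output plug standing in for `out.set`.
class ExampleSetNode : public Gaffer::ComputeNode
{

	public :

		ExampleSetNode( const std::string &name = "ExampleSetNode" )
			:	Gaffer::ComputeNode( name )
		{
			// Hypothetical output plug, used only to show the policy override.
			addChild( new Gaffer::IntPlug( "exampleSet", Gaffer::Plug::Out ) );
		}

	protected :

		Gaffer::ValuePlug::CachePolicy hashCachePolicy( const Gaffer::ValuePlug *output ) const override
		{
			if( output == getChild<Gaffer::ValuePlug>( "exampleSet" ) )
			{
				// Collaborative policy : the hash is stored in the shared
				// central cache, where it is unlikely to be evicted even
				// while per-location hashes are churning through the
				// per-thread caches.
				return Gaffer::ValuePlug::CachePolicy::TaskCollaboration;
			}
			return Gaffer::ComputeNode::hashCachePolicy( output );
		}

};
```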
Force-pushed from 2982f2e to 677ca05
LGTM