CollectScenes : Multithread `hashSet()` and `computeSet()` #4967

johnhaddon · 2022-11-14T15:40:21Z

This gives a 3x speedup in `CollectScenesTest.testSetPerformance`.

Perhaps more interestingly, it also gives almost a 2x speedup in `SetQueryTest.testScaling()`, but almost half of that speedup is due to the change in hash cache policy _alone_. In `testScaling()`, the same set is required by every location in the scene, but we are visiting enough locations that the hash cache is under significant pressure. By moving `out.set` to the TaskCollaboration policy, the set hash is stored in the shared central cache, from which it is exceedingly unlikely to be evicted (because per-location hashes are not stored in the global cache).
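The eviction argument above can be illustrated with a toy model. This is only a sketch in Python, not Gaffer's cache implementation: `LRUCache`, `count_set_hash_computes`, the capacities and the number of per-location hashes are all made-up assumptions, chosen so that per-location traffic is heavy enough to push the set hash out of a small cache between uses.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal least-recently-used cache (hypothetical, for illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key, compute):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        value = compute()
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return value


def count_set_hash_computes(num_locations, hashes_per_location,
                            local_capacity, use_shared_cache):
    """Count how often the set hash is recomputed while visiting locations."""
    local = LRUCache(local_capacity)  # receives all per-location hashes
    shared = LRUCache(16)             # never receives per-location hashes
    computes = 0

    def compute_set_hash():
        nonlocal computes
        computes += 1
        return "set-hash"

    for location in range(num_locations):
        # Per-location hashes put pressure on the local cache.
        for i in range(hashes_per_location):
            local.get(("location", location, i), lambda: "location-hash")
        # Every location needs the same set hash.
        cache = shared if use_shared_cache else local
        cache.get("set", compute_set_hash)
    return computes
```

In this model, the set hash is recomputed at every location when it shares a cache with the per-location hashes, and computed once when it lives in a cache that per-location hashes never enter.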

danieldresser-ie · 2022-11-21T07:17:05Z

The one thing I don't like about this is that the hash now depends on how tbb chooses to split the range ... is there something that gives us confidence that this will never change?

Do we actually need to do this in an ordered way? The fact that we use a non-deterministic parallel reduce for the compute suggests that an element is probably uniquely identified by the value of root, and if we hashed the 3 things we're hashing for each element, that would probably give us something that we could just sum, without caring about order? And then we could use a non-deterministic reduce for the hash as well, and the hash wouldn't be dependent on how many chunks it's computed in?
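A minimal sketch of the sum-of-hashes idea, in Python rather than the C++/TBB of the actual code. The thread does not list the three per-element inputs, so each element here is a tuple of three placeholder strings; `element_hash`, `summed_hash` and `chunked_summed_hash` are hypothetical names, and BLAKE2b stands in for whatever hash the real implementation uses.

```python
import hashlib

MASK = (1 << 64) - 1  # keep sums within 64 bits, like a fixed-width hash


def element_hash(a: str, b: str, c: str) -> int:
    """Hash three placeholder per-element inputs into a 64-bit integer."""
    h = hashlib.blake2b(digest_size=8)
    for part in (a, b, c):
        h.update(part.encode("utf-8"))
        h.update(b"\0")  # separator, so ("ab", "c") differs from ("a", "bc")
    return int.from_bytes(h.digest(), "little")


def summed_hash(elements) -> int:
    """Combine element hashes by addition modulo 2**64."""
    total = 0
    for element in elements:
        total = (total + element_hash(*element)) & MASK
    return total


def chunked_summed_hash(elements, chunk_size: int) -> int:
    """Same sum, computed as partial sums per chunk, as a parallel reduce might."""
    partials = [
        summed_hash(elements[i:i + chunk_size])
        for i in range(0, len(elements), chunk_size)
    ]
    return sum(partials) & MASK
```

Because modular addition is commutative and associative, the result is the same for any element order and any chunking, which is the property the question is asking about.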

johnhaddon · 2022-11-21T15:25:32Z

> The one thing I don't like about this is that the hash now depends on how tbb chooses to split the range ... is there something that gives us confidence that this will never change?

The docs here provide enough confidence for me :

https://spec.oneapi.io/versions/1.0-rev-1/elements/oneTBB/source/algorithms/functions/parallel_deterministic_reduce_func.html

Is there anything there that concerns you?

> Do we actually need to do this in an ordered way?

We don't need to - the sum-of-hashes approach would work here, and I did consider it. But since we know our RootRange up front anyway, deterministic_reduce gives us a simpler implementation with stronger mixing of the hashes. Let me turn the question on its head : what benefit would the sum-of-hashes approach provide here? For me, the only benefit of sum-of-hashes is that it works when you don't know the range upfront, like when doing parallelProcessLocations().
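For contrast, here is a toy Python model of an ordered reduce with a fixed splitting rule. It is not TBB and not the code in this PR: `h` and `reduce_fixed_split` are hypothetical names, and the "halve until no larger than `grain`" rule is an assumption standing in for a splitting scheme that depends only on the range and grain size, never on thread count or scheduling.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    """Order-sensitive hash of a sequence of byte strings."""
    m = hashlib.blake2b(digest_size=8)
    for p in parts:
        m.update(len(p).to_bytes(4, "little"))  # length prefix avoids ambiguity
        m.update(p)
    return m.digest()


def reduce_fixed_split(items, grain: int) -> bytes:
    """Halve the range until it is no larger than `grain`, hash each leaf
    in order, then join the two halves left-before-right."""
    if len(items) <= grain:
        return h(*items)
    mid = len(items) // 2
    left = reduce_fixed_split(items[:mid], grain)
    right = reduce_fixed_split(items[mid:], grain)
    return h(left, right)
```

With the splitting rule fixed, the same range always produces the same hash. Changing the rule (here, the grain size) or the element order changes the result, which is why the guarantee on how the range is split matters for this approach and not for the summed one.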

johnhaddon · 2022-11-23T16:40:23Z

Note to self : before merging this, you need to consider the implications of the TaskCollaboration fixes that are in progress. I suspect there is a special case that needs to be handled when two CollectScenes are chained and both have an empty set of rootNames.

danieldresser-ie · 2022-11-23T22:33:56Z

Oh, sorry, I wasn't thinking too clearly, and somehow thought deterministic_reduce meant ordered, whereas of course both versions are effectively ordered, and deterministic means ... deterministic.

This should be fine. I don't think there would be anything wrong with the adding approach either. (As a brief aside, since order does not in fact matter, I don't think we're actually getting "stronger mixing of the hashes" in any sense that matters. This is a bit tricky to prove; I think it can be proven for an ideal hash function, but obviously our hash function is not ideal ... could there exist a hash function which is strong enough when not adding together, but somehow breaks down when adding? My guess is no, but that would be quite a CS paper to prove. But anyway, we assume in other places that summing hashes is fine.)

But I don't think there's any reason to change this code, other than your note about considering implications of TaskCollaboration stuff.

johnhaddon · 2023-11-02T14:12:21Z

> Note to self : before merging this, you need to consider the implications of the TaskCollaboration fixes that are in progress. I suspect there is a special case that needs to be handled when two CollectScenes are chained and both have an empty set of rootNames.

Since I wrote this we merged a superior fix for the TaskCollaboration-hash-aliasing problem, so I think this PR is good to go. I've rebased ready for merging to 1.3_maintenance, but otherwise the PR is unchanged.


danieldresser-ie · 2023-11-03T01:06:01Z

LGTM

johnhaddon requested a review from danieldresser-ie November 14, 2022 15:40

johnhaddon self-assigned this Nov 14, 2022

johnhaddon force-pushed the collectScenesSetThreading branch from 4557213 to 2982f2e Compare November 2, 2023 14:10

johnhaddon changed the base branch from main to 1.3_maintenance November 2, 2023 14:12

johnhaddon force-pushed the collectScenesSetThreading branch from 2982f2e to 677ca05 Compare November 2, 2023 17:22

johnhaddon merged commit f7b163c into GafferHQ:1.3_maintenance Nov 3, 2023
4 checks passed

johnhaddon deleted the collectScenesSetThreading branch November 8, 2023 11:21
