Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CollectScenes : Multithread hashSet() and computeSet() #4967

Merged

Conversation

johnhaddon
Copy link
Member

This gives a 3x speedup in CollectScenesTest.testSetPerformance.

Perhaps more interestingly, it also gives almost a 2x speedup in SetQueryTest.testScaling(), but almost half of that speedup is due to the change in hash cache policy alone. In testScaling(), the same set is required by every location in the scene, but we are visiting enough locations that the hash cache is under significant pressure. By moving out.set to the TaskCollaboration policy, the set hash is stored in the shared central cache, from which it is exceedingly unlikely to be evicted (because per-location hashes are not stored in the global cache).

@johnhaddon johnhaddon self-assigned this Nov 14, 2022
@danieldresser-ie
Copy link
Contributor

The one thing I don't like about this is that the hash now depends on how tbb chooses to split the range ... is there something that gives us confidence that this will never change?

Do we actually need to do this in an ordered way? The fact that we use a non-deterministic parallel reduce for the compute suggests that an element is probably uniquely identified by the value of root, and if we hashed the 3 things we're hashing for each element, that would probably give us something that we could just sum, without caring about order? And then we could use a non-deterministic reduce for the hash as well, and the hash wouldn't be dependent on how many chunks it's computed in?

@johnhaddon
Copy link
Member Author

The one thing I don't like about this is that the hash now depends on how tbb chooses to split the range ... is there something that gives us confidence that this will never change?

The docs here provide enough confidence for me :

https://spec.oneapi.io/versions/1.0-rev-1/elements/oneTBB/source/algorithms/functions/parallel_deterministic_reduce_func.html

Is there anything there that concerns you?

Do we actually need to do this in an ordered way?

We don't need to - the sum-of-hashes approach would work here, and I did consider it. But since we know our RootRange up front anyway, deterministic_reduce gives us a simpler implementation with stronger mixing of the hashes. Let me turn the question on its head : what benefit would the sum-of-hashes approach provide here? For me, the only benefit of sum-of-hashes is that it works when you don't know the range upfront, like when doing parallelProcessLocations().

@johnhaddon
Copy link
Member Author

johnhaddon commented Nov 23, 2022

Note to self : before merging this, you need to consider the implications of the TaskCollaboration fixes that are in progress. I suspect there is a special case that needs to be handled when two CollectScenes are chained and both have an empty set of rootNames.

@danieldresser-ie
Copy link
Contributor

Oh, sorry, I wasn't thinking too clearly, and somehow thought deterministic_reduce meant ordered, whereas of course both versions are effectively ordered, and deterministic means ... deterministic.

This should be fine. I don't think there would be anything wrong with the adding approach either. ( As a brief aside, since order does not in fact matter, I don't think we're actually getting " stronger mixing of the hashes" in any sense that matters. This is a bit tricky to prove, I think it can be proven for an ideal hash function, but obviously our hash function is not ideal ... could there exist a hash function which is strong enough when not adding together, but somehow breaks down when adding? My guess is no, but that would be quite a CS paper to prove. But anyway, we assume in other places that summing hashes is fine. ).

But I don't think there's any reason to change this code, other than your note about considering implications of TaskCollaboration stuff.

@johnhaddon johnhaddon force-pushed the collectScenesSetThreading branch from 4557213 to 2982f2e Compare November 2, 2023 14:10
@johnhaddon
Copy link
Member Author

Note to self : before merging this, you need to consider the implications of the TaskCollaboration fixes that are in progress. I suspect there is a special case that needs to be handled when two CollectScenes are chained and both have an empty set of rootNames.

Since I wrote this we merged a superior fix for the TaskCollaboration-hash-aliasing problem, so I think this PR is good to go. I've rebased read for merging to 1.3_maintenance, but otherwise the PR is unchanged.

@johnhaddon johnhaddon changed the base branch from main to 1.3_maintenance November 2, 2023 14:12
This gives a 3x speedup in `CollectScenesTest.testSetPerformance`.

Perhaps more interestingly, it also gives almost a 2x speedup in `SetQueryTest.testScaling()`, but almost half of that speedup is due to the change in hash cache policy _alone_. In `testScaling()`, the same set is required by every location in the scene, but we are visiting enough locations that the hash cache is under significant pressure. By moving `out.set` to the TaskCollaboration policy, the set hash is stored in the shared central cache, from which it is exceedingly unlikely to be evicted (because per-location hashes are not stored in the global cache).
@johnhaddon johnhaddon force-pushed the collectScenesSetThreading branch from 2982f2e to 677ca05 Compare November 2, 2023 17:22
@danieldresser-ie
Copy link
Contributor

LGTM

@johnhaddon johnhaddon merged commit f7b163c into GafferHQ:1.3_maintenance Nov 3, 2023
4 checks passed
@johnhaddon johnhaddon deleted the collectScenesSetThreading branch November 8, 2023 11:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants