Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ValuePlug::getValue() performance regression #5503

Merged
merged 2 commits into from
Oct 20, 2023

Conversation

johnhaddon
Copy link
Member

This fixes a performance regression introduced in 1.3.1.0 by 055f07d. That commit removed unnecessary overhead for any type-converting-inputs on plugs held by Python-derived nodes, but added equivalent overhead for the far more common case of unconnected inputs on Python-derived nodes. This had a noticeable impact on scene generation times (around 3x in the case I've been debugging).

@johnhaddon johnhaddon self-assigned this Oct 19, 2023
@johnhaddon johnhaddon marked this pull request as ready for review October 19, 2023 11:40
@danieldresser-ie
Copy link
Contributor

LGTM. I was feeling bad about not having caught this - but then you said that you actually had noted the problem, tested, and not noticed the case where it went bad, so I guess I shouldn't blame myself too much ...

This has become performance critical since 055f07d inadvertently started performing unnecessary `runTimeCast<ComputeNode>()` for _every single_ call to `ValuePlug::getValue()`. We'll be fixing that separately, but it's still worth optimising the cast.
Commit 055f07d moved the `computeNode` assignment before the fast path for plugs with static values. This caused significant slowdown when the node was implemented in Python, due to the overhead of the `runTimeCast()`. The excessive Python-specific overhead from this was fixed in the preceding commit, but the additional fix here still yields a 6% improvement in `testStaticNumericValuePerformance()` for the remaining C++-only case. This improvement will become even more significant soon, because I'm sitting on another optimisation which removes the majority of the remaining overhead for this case (the reference counting).

The assignment to `computeNode` within the `if` might be a bit subtle for the reader, but I tried various other ways of factoring it, and they all resulted in at least one more conditional block and some duplicated code. I think this is the clearest way of writing it, particularly with the new assertion that shows the intended result.
@johnhaddon johnhaddon merged commit dc9fdf8 into GafferHQ:1.3_maintenance Oct 20, 2023
@johnhaddon johnhaddon deleted the regressionFixes branch October 25, 2023 08:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants