DuplicateTest : Test that uses Duplicate to expose exception issue #5528

danieldresser-ie

This feels exceptionally weird - this is a pretty simple graph, with an obvious error in a Python expression that should lead to an exception, but in my testing, about half the time, non-deterministically, this exception gets lost.

I was originally worried this could be related to the recent work on acquireCollaborativeResult, because when the exception got lost, the exception I got instead was:

    GafferSceneTest.traverseScene( duplicate["out"] )
AssertionError: "division by zero" does not match "CustomAttributes.expression.__out.p0 : Process::acquireCollaborativeResult : No result found"

However, testing on Gaffer 1.3.5.0, I see the same issue with the exception non-deterministically getting lost, except with this message instead:

line 340, in testUpstreamError
    GafferSceneTest.traverseScene( duplicate["out"] )
AssertionError: "division by zero" does not match "CustomAttributes.expression.__out.p0 : CustomAttributes.expression.__execute : getValueInternal() didn't return expected type (wanted ObjectVector but got nullptr). Is the hash being computed correctly?"

This message is a bit more descriptive - but it still doesn't make a huge amount of sense. If there is a hash bug, I'm not sure where it is - maybe in CustomAttributes? ( I was able to rule out Instancer ). Even if there is a bug in CustomAttributes, it's a bit alarming that it would manifest as a non-deterministic failure like this.

I don't have any solution yet, but putting up a PR with the test seemed like the quickest way to share the code ( and get a result on CI, which will hopefully reproduce what I'm seeing on my local machine ).
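For readers without a Gaffer build to hand, the shape of the two failures quoted above can be reproduced with plain `unittest`. This is only a sketch: it assumes the test wraps the traversal in `assertRaisesRegex` (the test body is not shown in this thread), and `traverse_stub` is a hypothetical stand-in for `GafferSceneTest.traverseScene`, not a Gaffer function.

```python
import unittest


def traverse_stub(lose_exception):
    # Hypothetical stand-in for GafferSceneTest.traverseScene(): raises
    # either the real upstream error or the unrelated one that replaces it.
    if lose_exception:
        raise RuntimeError("No result found")
    raise RuntimeError("division by zero")


def check_upstream_error(lose_exception):
    # Assumed test shape: expect the upstream error message to reach us.
    case = unittest.TestCase()
    try:
        with case.assertRaisesRegex(RuntimeError, "division by zero"):
            traverse_stub(lose_exception)
    except AssertionError as e:
        # An exception of the right type but with the wrong message
        # surfaces as an AssertionError from assertRaisesRegex.
        return str(e)
    return "passed"


print(check_upstream_error(False))
print(check_upstream_error(True))
```

When the expected exception arrives the check passes; when a different message arrives, `assertRaisesRegex` reports the `"..." does not match "..."` form seen in the tracebacks.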

@johnhaddon

> This feels exceptionally weird - this is a pretty simple graph, with an obvious error in a Python expression that should lead to an exception, but in my testing, about half the time, non-deterministically, this exception gets lost.

Good catch - thanks for the clear test case for reproducing it.

> However, testing on Gaffer 1.3.5.0, I see the same issue with the exception non-deterministically getting lost

Phew!

> This message is a bit more descriptive - but it still doesn't make a huge amount of sense. If there is a hash bug, I'm not sure where it is

The message about the hash bug is a red herring - it only makes sense when we've retrieved an unexpected type, not nullptr. The new error message is actually more accurate - we waited on a collaboration, but when it returned, it had no result. The problem is that we're now in a race - either the initiator of the collaboration throws the true exception first, or the collaborator throws the "No result found" exception. I've opened #5529 to fix this.
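The race described above can be modelled in a few lines of plain Python. This is a toy sketch, not Gaffer's implementation: `run_collaboration`, `initiator`, `collaborator` and the `initiator_first` flag are all hypothetical, and the flag forces an ordering that in the real code would depend on thread scheduling.

```python
import threading


def run_collaboration(initiator_first):
    """Return the message of whichever exception is reported first."""
    done = threading.Event()  # collaboration finished, with or without a result
    initiator_reported = threading.Event()
    collaborator_reported = threading.Event()
    lock = threading.Lock()
    reported = []  # messages in the order they reach the caller

    def report(exc):
        with lock:
            reported.append(str(exc))

    def initiator():
        try:
            raise ZeroDivisionError("division by zero")  # the true upstream error
        except ZeroDivisionError as e:
            done.set()  # waiters wake up, but no result was ever stored
            if not initiator_first:
                collaborator_reported.wait()  # simulate losing the race
            report(e)
            initiator_reported.set()

    def collaborator():
        done.wait()  # wait on the collaboration
        if initiator_first:
            initiator_reported.wait()  # simulate the initiator winning the race
        report(RuntimeError("No result found"))
        collaborator_reported.set()

    threads = [threading.Thread(target=f) for f in (initiator, collaborator)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return reported[0]


print(run_collaboration(True))
print(run_collaboration(False))
```

With the initiator reporting first, the caller sees the real error; with the collaborator reporting first, the caller sees only the "No result found" message, which matches the intermittent test failure.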

johnhaddon closed this Nov 7, 2023