ccdMPRPenetration() bug, wrong dir penetration depth #71
Comments
Bug in ccdVec3PointTriDist2 |
Bug there:
If we replace this condition with: issue is solved. |
Just as a reality check, these are the lines of code to which you are referring, yes? https://github.com/danfis/libccd/blob/master/src/vec3.c#L176-L180

The solution will have to be more nuanced than just setting that entire branch of logic to

The true bug is going to be more subtle. It's going to be related to the tolerances built into the successful projection test and what happens in the alternate branch when that threshold is crossed. Ideally, the two branches should, at the limit, produce the same answer as we approach the threshold; but they are apparently not doing so.

Looking at your video, I hypothesize that what we're seeing is that when the "has projected" test fails and we evaluate the distance from the point P to the nearest point on each triangle edge Qi, the closest pair (P, Qi)'s distance is essentially epsilon in length, which means the direction of the vector between them is largely defined by rounding error.

Ultimately, if you could provide some code that reproduces this (as done in the issue you linked to), it would aid considerably in resolving this. (Although it clearly increases the burden on you, it definitely increases the likelihood of the issue being addressed.) |
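To illustrate the hypothesis above, here is a minimal sketch in plain Python (not libccd code). The helper `direction_from_segment` and the sample coordinates are made up for illustration. It shows that when P sits almost on an edge, a perturbation on the order of rounding error is enough to flip the normalized direction from Qi to P.

```python
import math

def direction_from_segment(p, a, b):
    """Return (unit vector from the nearest point on segment ab to p, distance). 2D tuples."""
    ab = (b[0] - a[0], b[1] - a[1])
    ap = (p[0] - a[0], p[1] - a[1])
    # Projection parameter along ab, clamped so the nearest point stays on the segment.
    t = (ap[0] * ab[0] + ap[1] * ab[1]) / (ab[0] ** 2 + ab[1] ** 2)
    t = max(0.0, min(1.0, t))
    q = (a[0] + t * ab[0], a[1] + t * ab[1])
    d = (p[0] - q[0], p[1] - q[1])
    length = math.hypot(d[0], d[1])
    return (d[0] / length, d[1] / length), length

# Hypothetical edge and two points that differ only by a tiny offset across it.
a, b = (0.0, 0.0), (1.0, 0.0)
dir_above, len_above = direction_from_segment((0.5, 1e-9), a, b)
dir_below, len_below = direction_from_segment((0.5, -1e-9), a, b)

# The distances are practically identical, but the directions are opposite.
dot = dir_above[0] * dir_below[0] + dir_above[1] * dir_below[1]
print(len_above, len_below, dot)
```

The distance stays continuous across the edge while the direction does not, which matches the symptom of a stable depth with a jumping penetration vector.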
I understand that this is not the best solution, but I needed at least some solution that would work. I was also surprised that it works, but it does. I checked it on different models (boxes, cylinders, pyramids), moving them from different sides and at different angles; this strange solution of mine really works well. I saw problems with the direction of the penetration vector only when using spheres.
I also think that this is due to calculation errors, but it is definitely not related to float precision, since I tried the option with double too.
Which part of the code do you mean? You won't read thousands of lines of code. I am completely sure that the problem is in the ccdVec3PointTriDist2 function, which is called from the findPenetr function, which is called from the ccdMPRPenetration function. Most likely the problem is exactly what you wrote: "I hypothesize that what we're seeing is that when the "has projected" test fails and we evaluate the distance from the point P to the nearest point on each triangle edge Qi, the closest pair (P, Qi)'s distance is essentially epsilon in length which means the direction of the vector between them is largely defined by rounding error." This is the only logical reason for the error. |
You'd have to check various configurations as well. You could swap in numerous models and still never run into the situation where this change would explode, simply because you didn't happen to configure them in the way that triggers it.
I didn't say it was related to "float precision". I said it was related to tolerances. You'll note that

One of the factors that is important, from a geometry sense, is the scale of the problem. The default epsilon value works best when all of the parameters are approximately unit magnitude. Ultimately, epsilon should really scale with respect to the size of the problem (otherwise, for large triangles, the absolute value of epsilon falls below rounding error, and for small triangles, the absolute value becomes a larger portion of the valid numerical space).

I agree with you, the complete diagnosis of the code is complex. And the solution more so. I hadn't intended to suggest you should implement the solution. I was merely documenting why your apparent solution is not a general solution that could/should be included in a PR. Resolving what appears to be a very real bug is going to be more complex than just setting the test to
I mean a piece of reproducible code that would allow someone who has no access to your code base to see the bug. That would be the single greatest contribution contained in this issue. It makes the bug definitive and empowers someone to address it. In an ideal world, you should be able to reproduce this in just a few lines of code.
The challenge is finding the triangle and point that is causing you problems because it is, ultimately, being generated inside libccd. Still, build enough things in debug build and you should be able to capture those parameters. |
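On the point about epsilon scaling with the size of the problem, a rough sketch of the idea might look like the following. The function name `scaled_epsilon` and the `base_eps` value are hypothetical; they are not libccd's API or its actual `CCD_EPS` value.

```python
def scaled_epsilon(points, base_eps=1e-6):
    """Tolerance proportional to the largest coordinate magnitude in `points`.

    base_eps is an illustrative value chosen for unit-sized problems.
    """
    scale = max(abs(c) for p in points for c in p)
    return base_eps * scale

# The same triangle at three different scales.
unit_tri = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
big_tri = [tuple(1000.0 * c for c in p) for p in unit_tri]
small_tri = [tuple(0.001 * c for c in p) for p in unit_tri]

eps_unit = scaled_epsilon(unit_tri)
eps_big = scaled_epsilon(big_tri)
eps_small = scaled_epsilon(small_tri)
print(eps_small, eps_unit, eps_big)
```

With a scheme like this the tolerance keeps the same proportion to the geometry at every scale. A degenerate input whose coordinates are all zero would yield a zero tolerance, so a real implementation would need a floor.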
Thx a lot, I will try it. |
ccdVec3PointTriDist2 The function incorrectly computes the distance from the zero point (the origin) to the triangle with the following coordinates: A(1.650970, -0.102323, -2.994949); The function returns a distance to the triangle of 0.143770, but the correct distance is 0.102323. A couple of pictures for clarity: I will keep searching for errors in this function myself, but I will be grateful for any help. |
I tried to change the value of CCD_EPS in different directions and in different orders, but this did not produce any result. |
I started checking whether this comparison works correctly: I found out that when the function does not work properly, the condition is not met. Is it possible that the variable "t" is being calculated incorrectly? |
This is a great effort; you've got the right idea. However, I've looked at your example and I think you have an error. Here's the scenario you've created:
Figure 1: "Bad" triangle drawn on the y = -0.102323 plane. The picture is merely illustrative, showing the relative position of the triangle vertices.

For the "erroneous" triangle, you assume the correct distance is 0.102323, which implies the origin projects to a point inside the second triangle. However, that appears not to be the case. Consider the vector from A to B (AB). To be inside the triangle, the vector from A to O (AO) must lie on AB's "left" side. By my calculations, it doesn't. Here's some python code to illustrate that.

```python
import numpy as np

def det(a, b):
    return a[0] * b[1] - a[1] * b[0]

# We're going to turn this into a 2D problem; because the y-value doesn't matter
# for testing the *projection* of the origin on the y = -0.102323 plane.
A = np.array((1.650970, -2.994949))
B = np.array((-1.613953, 2.534958))
p_AB = B - A
p_AO = -A
# for origin to project onto the triangle, this value must be <= 0.
det(p_AO, p_AB)
# evaluates to 0.6485673141370007
```

In fact, if I take this a step further, the projection of the origin lies 0.100994 units outside the triangle. And that means the origin is 0.14377027 units away from the triangle. Unfortunately, this means the "error" condition isn't really an error condition. We'll have to keep digging. |
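As a cross-check of the numbers above, here is a small sketch that computes the 3D distance from the origin to segment AB directly. It uses only the standard library (Python 3.8+ for `math.dist`). The helper `point_segment_distance` is illustrative, and B's y-coordinate is an assumption taken from the statement that the triangle lies on the y = -0.102323 plane.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to segment ab (all 3D tuples)."""
    ab = tuple(bi - ai for ai, bi in zip(a, b))
    ap = tuple(pi - ai for ai, pi in zip(a, p))
    # Parameter of the orthogonal projection of p onto the line through a and b,
    # clamped to [0, 1] so the nearest point stays on the segment.
    t = sum(x * y for x, y in zip(ap, ab)) / sum(x * x for x in ab)
    t = max(0.0, min(1.0, t))
    q = tuple(ai + t * d for ai, d in zip(a, ab))
    return math.dist(p, q)

A = (1.650970, -0.102323, -2.994949)
# Assumption: B shares A's y-coordinate because the triangle lies on the y = -0.102323 plane.
B = (-1.613953, -0.102323, 2.534958)
O = (0.0, 0.0, 0.0)

distance = point_segment_distance(O, A, B)
print(f"{distance:.5f}")
```

The result is approximately 0.14377, which agrees with the value reported by ccdVec3PointTriDist2 rather than with the assumed 0.102323.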
Stop, sorry, I'm a little confused. Let us take a moment to settle one thing. Do I understand correctly that you are claiming that the point
? |
I was worried that might be a bit opaque. Sorry about that. I'm happy to walk through it.
Correct. Everything else in the message was about showing that to be the case. |
The function ccdVec3PointTriDist2 thinks it belongs. I have to double-check this with another simple algorithm. Thanks for the help, I will write about the result later. |
I double-checked, you are absolutely right. So the error is at the beginning of the function, in this range: https://github.com/danfis/libccd/blob/master/src/vec3.c#L142-L180 |
Are you sure about that? It's reporting the correct distance. It seems highly unlikely that it can simultaneously classify it as projecting onto the triangle and compute the correct distance. Admittedly, I haven't run the code to analyze what it's doing. But, I'd be quite surprised if both can happen at the same time. For the record, a few questions:
|
Oh, it seems I'm a little confused. Sorry, I didn’t get enough sleep today. |
You are right again, I was mistaken. Then it turns out that the issue is in these lines: https://github.com/danfis/libccd/blob/master/src/vec3.c#L198-L214 |
With witnesses... probably witness is not equal to zero.. |
Thank you very much, you helped me a lot in my search. Now I have started to suspect the function __ccdVec3PointSegmentDist2. |
I'm glad to help. And I'm impressed with how you're tackling this. It might be worth taking a step back and make sure we're investigating the right thing. As I look at the original post and the video, what we're seeing is a fluctuation in penetration direction. Right now you're focused on penetration depth. It's distinctly possible that even as the direction vector in your video fluctuates discontinuously, you're still getting reasonably continuous depth measurements. So, focusing exclusively on depth may take you down the wrong path. So, we might need to revisit the core issue with the following questions:
|
My program takes a visualized direction vector from "dir":
Line 117 in 7931e76
ccdMPRPenetration takes "dir" vector from "dir":
Line 144 in 7931e76
ccd_real_t ccdVec3PointTriDist2(const ccd_vec3_t *P,
Please look at this: I added "printf" to ccdVec3PointTriDist2 to get the information here in this place: Lines 198 to 201 in 7931e76
I got these numbers in the console: They show that the distance between segment AB and the zero point is 0.143770, while the distance returned by the function ccdVec3PointSegmentDist2 is almost one and a half times greater. How can this be if ccdVec3PointSegmentDist2 works correctly? Therefore, I conclude that ccdVec3PointSegmentDist2 returns the wrong distance. Maybe I have made a mistake again somewhere; if you see where, please point out the mistake in my reasoning. PS: I am very interested to know what distance you get. PPS: |
Strange, I checked with another algorithm, and it turned out that ccdVec3PointSegmentDist2 returns the correct value. |
Another video with the same problem, now box-box for simplicity, with the depth and the penetration normal printed. |
Here's my suggestion for tracking down this bug:
|
I am very grateful to you that you continue to help me. Your help is very much needed.
Already done.
OK. I think this code will provide it:
I'll do it, but it'll probably take two or three days. I added some data visualization. Now the portal is drawn with red lines. The yellow triangle is the same triangle ABC whose distance from the zero point we are computing. Lines 325 to 329 in 7931e76
Zero point - small red sphere, contact position. In this video, the top box barely touches the bottom one, the penetration depth is almost zero. |
The new visualization is wonderful! You can see the bad behavior is right on the very edge of the triangle. That suggests to me that it's related to the tolerance and we end up in a branch where the assumption is incorrect by the amount of the tolerance. I'll think about it. |
I will think about it too. And tomorrow I will print all the values that feed the penetration normal (pdir, witness) throughout the function. |
Wow! That's an amazing visualization of MPR! |
Thank you, I'm glad you liked it. :) |
I have traced the changes in every variable that can affect the calculation of the normal vector. I am assuming the portal is built correctly and the problem is not with the portal. In the screenshots, the values are written in red next to the corresponding lines. Part1 : ========================================================================== Lines 308 to 329 in 7931e76
Part2 : ========================================================================== Lines 137 to 165 in 7931e76
Part3 : ========================================================================== Lines 167 to 180 in 7931e76
Part4 : ========================================================================== Lines 73 to 128 in 7931e76
We established that the check of whether the projection of the zero point belongs to the triangle works correctly. So does that mean the function ccdVec3PointSegmentDist2 calculates the penetration vector incorrectly? Or not? I suppose a good solution would be to make the function treat a point lying on an edge of the triangle as belonging to the triangle. But I can't do it. I also suspect that doing this would cause problems in other places. |
This is sterling work. I'm going to go over the data/values you've got here. I'd like to confirm that you ran these calculations with 32-bit floats, yes? |
Yes, 32-bit float. |
I suspect that all such issues happen when the triangle is right-angled and the point touches its hypotenuse. |
I'd generalize the statement a bit more. Essentially, it's when the MPR evaluation gets very close to an edge. If the edge were axis-aligned, it would be unlikely to cause problems. But as it runs across the axes, you're more likely to encounter rounding error that can mis-classify inside/outside. In other words, it need not be a right triangle.

I have a new hypothesis from watching your video. This is based on my rusty mental model of the MPR algorithm. Essentially, we're looking for the triangle through which we can see the origin. If the origin isn't visible through the portal (i.e., it doesn't project onto the triangle), we "refine" the portal. Once we've picked the portal, we then do some further calculations.

It could be that the code that decides if we have the "right" portal has different tolerances (and critical numerical boundaries) than the subsequent distance calculation. Specifically, the portal classifier can think it's inside a triangle, but when doing the actual math, we're not. Being slightly outside of the portal would give us those normal deviations you're experiencing just at the moment when you're crossing a portal boundary. |
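The tolerance-mismatch hypothesis above can be sketched in a few lines of plain Python. This is an illustration only, not the libccd code path: the `inside` helper, the triangle, and the two tolerance values are assumptions chosen to make the disagreement visible.

```python
def edge_side(a, b, p):
    """2D cross product: positive when p lies to the left of the directed edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def inside(tri, p, tol):
    """True if p is inside the counter-clockwise triangle, allowing `tol` of slack per edge."""
    a, b, c = tri
    return all(edge_side(u, v, p) >= -tol for u, v in ((a, b), (b, c), (c, a)))

tri = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))  # counter-clockwise
p = (0.5, -1e-7)                            # just outside the bottom edge

# Hypothetical "portal classifier" with a loose tolerance.
loose = inside(tri, p, tol=1e-6)
# Hypothetical "distance calculation" with a strict test.
strict = inside(tri, p, tol=0.0)
print(loose, strict)
```

The loose test accepts the point and the strict test rejects it, so a later stage that assumes "inside" would instead take the edge branch, where the direction is no longer the triangle normal.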
I thought about all this and decided that the best way is to always assume that the penetration vector coincides with the normal of the triangle. This is almost always the case, except in cases of smooth shapes such as a sphere, then the triangle becomes a point and problems arise. But for a sphere, we can easily find the normal vector of penetration ourselves, knowing the position of the center of the sphere and the point of penetration. |
|
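A minimal sketch of the workaround described above, assuming the portal triangle's vertices, the sphere centre and the contact point are available to the caller. The function names are hypothetical and this is not part of libccd.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length (raises ZeroDivisionError for a zero vector)."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def triangle_normal(a, b, c):
    """Unit normal of triangle abc from the cross product (b - a) x (c - a)."""
    u = tuple(bi - ai for ai, bi in zip(a, b))
    w = tuple(ci - ai for ai, ci in zip(a, c))
    n = (u[1] * w[2] - u[2] * w[1],
         u[2] * w[0] - u[0] * w[2],
         u[0] * w[1] - u[1] * w[0])
    return normalize(n)

def sphere_contact_normal(center, contact_point):
    """Unit vector from the sphere centre to the contact point."""
    return normalize(tuple(p - c for p, c in zip(contact_point, center)))

tri_n = triangle_normal((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
sph_n = sphere_contact_normal((0.0, 0.0, 0.0), (0.0, 0.0, 2.0))
print(tri_n, sph_n)
```

The sign of the triangle normal depends on the vertex winding, so it may need flipping to point in the penetration direction. When the triangle collapses to a point, as mentioned for spheres, the cross product is zero and `normalize` fails, which is why the sphere case is handled separately.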
Hello.
ccdMPRPenetration() returns the wrong penetration direction (dir).
Please see this video; the bright green stick is the penetration direction (dir):
https://youtu.be/vN_u3Ig1C2c
I think this is due to this problem: ccdVec3PointTriDist2 computes different distance with / without witness points. #55
Really need help.. I hope the project is not abandoned.