Fix for "rare" divide by zero problem #28
Cheers @illdopejake. I've come across a similar error. @LeonAksman and I added the hacky fix, I believe, and yours could be related. This is somewhat unsatisfying since the error seems to be deeper than this; I don't currently have bandwidth to dig into _calculate_likelihood_stage(), but would happily try to help someone if they have a go first.
Hacky fix extended to other lines in cec24a2. Tests still pass, just FYI. I agree the error may be deeper than a numerical stability issue... potentially a good candidate for something to look at in a pySuStaIn hackathon.
Dear SuStaIn heroes: I'm having weird bugs again, maybe related to the above. This time I am using an old dataset that the original SuStaIn worked fine with, but I'm running it through the newest version of SuStaIn. I had to change several lines to get it to work, and I really had no idea what I was doing, but I wanted to bring it to your attention. First, I am getting a bunch more divide by zero warnings, specifically at lines 555, 556 and 559.
Next, I was getting a dimension error around lines 577 and 584. The structure of the code seems to anticipate this with a try/except.
The problem is, the exception was not getting caught. To get around this, I changed the exception handling at line 577.
I then had to do something kind of similar a few lines below that, at line 584.
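The exact edited lines aren't preserved in this thread. Purely as an invented illustration of the "exception was not getting caught" situation (none of these names or shapes come from ZScoreSustain), the pattern is roughly this: the except clause names a different exception type than the one the dimension mismatch actually raises, so widening the clause is the blunt workaround being described.

import numpy as np

# Hypothetical sketch only -- names and shapes are invented, not pySuStaIn's.
p = np.random.rand(10, 5, 3)   # (subjects, stages, subtypes)
f = np.random.rand(3, 1)       # subtype fractions carrying a stray singleton axis
try:
    total = np.sum(p * f, axis=(1, 2))            # raises ValueError: shapes don't broadcast
except (IndexError, ValueError):                  # original-style code caught only IndexError,
    total = np.sum(p * f.ravel(), axis=(1, 2))    # so the ValueError escaped until the clause was widened
print(total.shape)                                # (10,)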
Altogether, these completely uninformed and pathetic hacks seemed to "solve" the issue -- SuStaIn ran with no issue after that. But they are perhaps suggestive of some other little gotchas lingering in the code, and I have no idea why these issues occurred. I'm wondering if the reason I keep running into them is that I'm using data with really large z-scores. Because PET is rather sensitive, it's not uncommon to get z-scores above 30, for example, when normalizing to controls. Could that perhaps explain what's going on? Do y'all ever run tests/simulations with big z-scores like that? Thanks friends! --Jake
Hey @illdopejake, thanks as ever for your detailed feedback. I reckon you've hit the nail on the head with the high z-scores, as I've had similar weirdness to you. I believe that there aren't any tests/simulations covering this.
I'm on the case! My guess would be that it's a precision problem that causes the likelihood to go to zero for z-scores that are very far from the z-scores that have been included in z_vals and z_max, but I'll let you know when I get to the bottom of it.
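A quick standalone check of that guess, using plain numpy rather than the SuStaIn code itself (the z-score and the number of biomarkers here are made up):

import numpy as np

z = 30.0                                  # a PET-like z-score, as mentioned above
per_biomarker = np.exp(-0.5 * z ** 2)     # Gaussian kernel term: roughly 4e-196, still representable
likelihood = per_biomarker ** 2           # the product over just two such biomarkers...
print(likelihood)                         # ...underflows float64 entirely: 0.0
print(likelihood / np.sum(likelihood))    # 0/0 -> nan, plus a numpy RuntimeWarning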
Hi all, I've found the source of this error and a fix -- numpy runs out of precision in float64 and returns 0 when the argument of the exponential becomes very negative, which is what later produces the divide by zero during normalization. The fix is to ensure that the exponentiation can never return exactly zero. I can't comment on why p_perm_k is becoming so negative and what that indicates; maybe the devs will have some idea? I personally usually ran into this error in later stages of the solve (i.e. on cluster 4 of 6, never usually on the first one or two splits). I will post my entire working fix below; again, the key is ensuring the exponentiation cannot underflow to zero before the normalization.
Let me know if this fixes things on your end, as it resolved the issue for me. Thanks!
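The full working function posted above isn't reproduced in this thread. Purely as a sketch of one way to keep the exponentiation from underflowing (invented function and argument names, not the code that was actually posted), the standard trick is to shift log-probabilities by their per-subject maximum before exponentiating:

import numpy as np

# Sketch only -- invented names; this is not the fix that was actually posted.
def normalize_from_log(log_p_perm_k):
    """Normalize per-subject probabilities given in log space without underflowing."""
    # Subtracting each subject's maximum log-probability before exponentiating
    # makes the largest term exp(0) = 1, so the denominator can never be zero.
    shift = np.max(log_p_perm_k, axis=(1, 2), keepdims=True)
    p = np.exp(log_p_perm_k - shift)
    return p / np.sum(p, axis=(1, 2), keepdims=True)

print(normalize_from_log(np.array([[[-2000.0, -2001.0]]])))   # finite, sums to 1 per subject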
Dear SuStaIn friends,
I have encountered an error on a number of occasions when using SuStaIn on different datasets. The error itself surfaces as an exception raised at line 324 of ZScoreSustain.
As it turns out, this is caused by a divide by zero problem during the normalization of p_perm_k. The resulting NaN then propagates forward a bit and doesn't turn up as an error until line 324. It is not itself necessarily caused by any outlying "bad values" (e.g. NaNs) in the original dataset, so it's quite hard (impossible?) to detect before running SuStaIn and getting the error.
This is apparently a known issue, as the following comment exists on line 333 of ZScoreSustain:
#adding 1e-250 fixes divide by zero problem that happens rarely
A few lines later at 335, the "corrected" line occurs:
p_perm_k_norm = p_perm_k_weighted / np.sum(p_perm_k_weighted + 1e-250, axis=(1, 2), keepdims=True)
However, at least in my case, the offending divide by zero occurred earlier. Note that the fix (ln 335) comes after the point where the error is raised in my traceback (ln 324), so it never gets the chance to help. Instead, the divide by zero happens for me at line 238, which is incidentally the same calculation:
p_perm_k_norm = p_perm_k_weighted / np.sum(p_perm_k_weighted, axis=(1,2), keepdims=True)
By once again adding the "corrected" line, the problem is surmounted and I no longer get the error. I'm not sure how rare this issue really is, because this is maybe the third time I've encountered it (on different datasets). Requesting a patch to fix it, pretty please!
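For anyone who wants to reproduce the failure mode in isolation, a minimal numpy snippet (array shapes made up here, not SuStaIn's actual arrays) shows both the NaN and the effect of the 1e-250 term:

import numpy as np

# Minimal, self-contained reproduction -- shapes are invented, not SuStaIn's.
p_perm_k_weighted = np.zeros((1, 4, 2))   # one subject whose weighted probabilities all underflowed to 0
bad = p_perm_k_weighted / np.sum(p_perm_k_weighted, axis=(1, 2), keepdims=True)            # 0/0 -> all NaN
good = p_perm_k_weighted / np.sum(p_perm_k_weighted + 1e-250, axis=(1, 2), keepdims=True)  # stays finite
print(np.isnan(bad).any(), np.isnan(good).any())    # True False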
Thanks as always for this incredible software!! <3 <3 <3