Fix initialization of doubles in VerticalProfileMod #6649
I can't verify that this indeed fixes the bug on mappy. Someone with access to mappy would have to test it a few times because the failure is intermittent. But this code is bugged, so it makes sense to me to merge it and see whether the failure goes away on CDash.
@peterdschwartz you can merge this to next today.
Variables that are r8 were initialized as single-precision, potentially causing inconsistent failures with the sums not adding to 1.0_r8. Also, fixed a syntax error in CH4Mod for spval. Fixes #6650 [BFB]
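To make the precision issue concrete, here is a minimal sketch in Python rather than Fortran. It is not the ELM code: the helper `as_single` and the ten equal weights are hypothetical, chosen only to mimic what happens when a double-precision variable is assigned a single-precision literal and the values are then summed and compared against 1.0.

```python
import math
import struct


def as_single(x: float) -> float:
    """Round a double to the nearest IEEE single, then widen it back to double.

    This mimics assigning a single-precision literal to a double-precision
    (r8-like) variable: the stored value keeps the single-precision rounding.
    """
    return struct.unpack("f", struct.pack("f", x))[0]


# Hypothetical profile weights that should sum to 1 (illustration only).
weights_double = [0.1] * 10
weights_single = [as_single(w) for w in weights_double]

# Deviation of each sum from 1.0, using an accurately rounded summation.
err_double = abs(math.fsum(weights_double) - 1.0)
err_single = abs(math.fsum(weights_single) - 1.0)

print(f"double-precision initialization: |sum - 1| = {err_double:.3e}")
print(f"single-precision initialization: |sum - 1| = {err_single:.3e}")
```

Under these assumptions the single-precision values carry a rounding error of roughly 1.5e-9 per term, so the sum can miss 1.0 by far more than a double-precision tolerance would allow. Whether that is what trips the check on mappy is not established by this sketch.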
Merged to next. Tested [BFB] with intel and gnu compilers on pm-cpu.
@rljacob I noticed that ERS.r05_r05.ICNPRDCTCBC.mappy_gnu.elm-cbudget is also failing occasionally with the same VerticalProfileMod sum error message. Since that test still failed overnight on next with this PR, I re-checked and noticed I had missed a couple of if-branches that still use single-precision (as below). I made a new commit and will re-merge today.
Test error message:
Should the 3.0.1 tag wait for this?
@rljacob I'm not sure. The code is more correct this way, but practically, it seems to only manifest on one machine/compiler and we can't guarantee that this will fix that specifically.
Merge new commit Fixes #6650 [BFB]
Re-merged to next.
Was this supposed to help with ERS.r05_r05.IELM.mappy_gnu.elm-V2_ELM_MOSART_features? It flipped from DIFF yesterday to FAIL today.
Yes, this PR was aiming at fixing random fails for The randomness, only affecting one machine/compiler, and happening to two different tests is what makes me think it's some floating-point issue. I've been trying to figure out the next step this morning, and it's difficult to go more in-depth without being able to reproduce this behavior on pm-cpu. I attempted to run a debug version of the mosart test
Potential next step: There are several checks to avoid dividing-by-zero like
@jgfouca may be able to help speed up the development since he could run a test on your branch on mappy. Note that I reset next so this PR is no longer on it.
@rljacob, @peterdschwartz, I am happy to kick off some runs on mappy. I assume I will need to check out this branch. How is this for a test:
I assume we are looking for FAILs, not DIFFs, so no need for baseline comparison.
@peterdschwartz what do you want to do with this PR? It still fixes things that should be fixed even if it doesn't fix #6650.
Right, I will push one more commit that just adds useful information to the error message and start merging it tomorrow.
@peterdschwartz you can merge this to master.
Merged to master.
Variables that are r8 were initialized as single-precision, potentially causing inconsistent failures with the sums not adding to 1.0_r8. Also, fixed a syntax error in CH4Mod for spval. [BFB]