Fix split-explicit barotropic velocity accumulator #6041
We need to run tests to evaluate if this fix is in fact non-climate changing. Labels will be updated when we know.
@cbegeman thanks for this fix. I agree that the extents on the two loops you identify in #6040 should be the same. Also, it is best to use the smaller extents when possible (Owned rather than All). What I don't understand is why the original problem doesn't make the partition test fail. Perhaps it is wetting and drying that reveals the problem, and with WD off it never caused trouble because the halo was not used before a halo update. I'm testing this now.
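As a rough illustration of the extent mismatch being discussed (a toy sketch, not the MPAS-Ocean source): the accumulator loop runs over all edges while the division loop runs over owned edges only. The names `average_over_subcycles`, `n_edges_owned` and `n_edges_all` are hypothetical stand-ins, and the layout assumes that halo edges are simply the entries stored after the owned ones.

```python
def average_over_subcycles(subcycle_values, n_edges_all, accum_extent, divide_extent):
    """Accumulate per-edge values over subcycles, then divide by the subcycle count.

    accum_extent and divide_extent are the upper limits of the two loops.
    In this toy layout, indices >= n_edges_owned are halo edges.
    """
    n_subcycles = len(subcycle_values)
    accum = [0.0] * n_edges_all
    for values in subcycle_values:
        for i in range(accum_extent):   # "loop 1": accumulator
            accum[i] += values[i]
    for i in range(divide_extent):      # "loop 2": division
        accum[i] /= n_subcycles
    return accum


n_edges_owned, n_edges_all = 3, 5  # hypothetical sizes: 3 owned edges, 2 halo edges
subcycles = [[1.0, 2.0, 3.0, 4.0, 5.0],
             [3.0, 4.0, 5.0, 6.0, 7.0]]

# Accumulate over all edges, divide over owned edges only.
mismatched = average_over_subcycles(subcycles, n_edges_all, n_edges_all, n_edges_owned)
# Accumulate and divide over owned edges only.
matched = average_over_subcycles(subcycles, n_edges_all, n_edges_owned, n_edges_owned)
```

In this sketch the owned entries agree in both cases. The halo entries hold a raw, undivided sum in the mismatched case and are left untouched in the matched case, so in neither case are they a valid average until a halo update supplies the owner's values.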
Actually, this fix doesn't make sense to me. If you are using the halo edges with wetting/drying for The fact that the incorrect value on the halo causes a problem won't be solved by the current fix, because the value on the halo is still incorrect.
@mark-petersen Thanks for taking a look and for those thoughts. What doesn't make sense to me is that I would think that if those values on the halo were not being used, then the solution would be identical whether the extent of the first loop was
@cbegeman, could you do a partition test with W/D on before and after this change? The partition test will tell you if there is any halo mismatch. If the compass partition tests are not right to produce W/D behavior, you can just do your own with 4 vs 8 or whatever with a W/D test. Thanks. I still suspect that changing the second loop to
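For readers unfamiliar with the idea, here is a minimal sketch of what a partition test checks, using a toy periodic 1D stencil rather than MPAS-Ocean or compass. The function `run_partitioned` and its layout (one halo cell on each side of the owned cells) are assumptions made for illustration only.

```python
def run_partitioned(field, n_parts, n_steps, do_halo_update=True):
    """Advance a periodic 1D field with a 3-point stencil on n_parts partitions.

    Local layout per partition: index 0 is the left halo, 1..size are owned
    cells, size + 1 is the right halo. Assumes len(field) is divisible by n_parts.
    """
    n = len(field)
    size = n // n_parts
    parts = []
    for p in range(n_parts):
        start = p * size
        left_halo = field[(start - 1) % n]
        right_halo = field[(start + size) % n]
        parts.append([left_halo] + field[start:start + size] + [right_halo])

    for _ in range(n_steps):
        new_parts = []
        for local in parts:
            new = list(local)
            for i in range(1, size + 1):  # compute on owned cells only
                new[i] = 0.25 * local[i - 1] + 0.5 * local[i] + 0.25 * local[i + 1]
            new_parts.append(new)
        parts = new_parts
        if do_halo_update:
            for p in range(n_parts):
                # Copy the neighbours' owned boundary cells into this partition's halo.
                parts[p][0] = parts[(p - 1) % n_parts][size]
                parts[p][size + 1] = parts[(p + 1) % n_parts][1]

    # Gather owned cells back into global order.
    return [v for local in parts for v in local[1:size + 1]]


field = [float(i * i) for i in range(8)]
reference = run_partitioned(field, 1, 3)
on_two = run_partitioned(field, 2, 3)
on_four = run_partitioned(field, 4, 3)
stale_halo = run_partitioned(field, 2, 3, do_halo_update=False)
```

Comparing `reference`, `on_two` and `on_four` with exact equality is the toy analogue of a bit-for-bit partition test. The `stale_halo` run skips the halo update and diverges from the reference, which is the kind of halo mismatch such a test is meant to expose.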
@mark-petersen The I've created another branch where I change the second loop to All of these tests are without wetting and drying. I don't think it's necessary to bring in that added level of complexity.
@cbegeman I think you changed the wrong loop. You meant to change this one, right? Not line 861. E3SM/components/mpas-ocean/src/mode_forward/mpas_ocn_time_integration_split.F Lines 1484 to 1488 in abe445e
@mark-petersen Thanks for catching that. Yes, I had the loop lines right in the issue. I will push the change now and retest.
2fa6d8c to ff06b0d
@mark-petersen For me, the comments above still hold with the updated branch. It would be good to have you confirm though.
Running the nightly suite with gnu debug on chrysalis, I was able to confirm @cbegeman's results. With this PR, all decomp tests pass, and this is bfb identical between this PR and if both loops are changed to nEdgesAll in lines 1484 and 1512. The bfb comparison shows that there is no harm done to the solution by tightening the halo computation to nEdgesOwned, and there is a small benefit to computing fewer flops.
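A small sketch of why the two consistent choices can agree while differing in cost, under toy assumptions: both loops share one extent, and a halo update overwrites halo entries with the owning partition's values. The helpers `subcycle_average` and `halo_update` and the numbers used are hypothetical and are not taken from mpas_ocn_time_integration_split.F.

```python
def subcycle_average(subcycle_values, n_edges_all, extent):
    """Accumulate and divide over the same extent; also count arithmetic operations."""
    n_subcycles = len(subcycle_values)
    out = [0.0] * n_edges_all
    ops = 0
    for values in subcycle_values:
        for i in range(extent):
            out[i] += values[i]
            ops += 1
    for i in range(extent):
        out[i] /= n_subcycles
        ops += 1
    return out, ops


def halo_update(local, n_edges_owned, owner_values):
    """Overwrite halo entries with the values held by the owning partition."""
    updated = list(local)
    updated[n_edges_owned:] = owner_values
    return updated


n_edges_owned, n_edges_all = 3, 5  # hypothetical sizes
subcycles = [[1.0, 2.0, 3.0, 4.0, 5.0],
             [3.0, 4.0, 5.0, 6.0, 7.0]]
owner_values = [5.0, 6.0]  # assumed averages computed by the neighbouring owner

owned_result, owned_ops = subcycle_average(subcycles, n_edges_all, n_edges_owned)
all_result, all_ops = subcycle_average(subcycles, n_edges_all, n_edges_all)

owned_after = halo_update(owned_result, n_edges_owned, owner_values)
all_after = halo_update(all_result, n_edges_owned, owner_values)
```

In this toy setup `owned_after` and `all_after` are identical once the halo update has run, while the owned-only extent performs fewer additions and divisions. Whether the same holds in the real model depends on where halo values are read relative to halo updates, which is what the tests reported in this thread check.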
@mark-petersen Thank you so much for your review and testing! @jonbob This is now ready for your testing.
Approving by code inspection and based on @cbegeman and @mark-petersen's testing.
@vanroekel should this be in v3?
I certainly think it should be.
Yes, this should be in V3.
Sorry for not responding sooner. @cbegeman and @mark-petersen could you summarize what the expected results here are? The PR is labeled non-BFB, but @mark-petersen's approval comment seems to suggest there is no difference in the loop limit? I think it is something that needs to be included, but would appreciate a bit of clarity to alleviate my confusion. Related, any expectation on the climate impact? How far from nonBFB are the changes? @cbegeman related question: does AB2 not have this issue? (Never mind, I see that the change is there already.) Sorry for all the questions. But from what I understand from this discussion and issue, this is a bug that should be fixed for v3.
@vanroekel No problem. Here's an overview of the results of the standalone testing:
Loop 1 is the accumulator loop: E3SM/components/mpas-ocean/src/mode_forward/mpas_ocn_time_integration_split.F, lines 1484 to 1488 in abe445e
Loop 2 is the division loop: E3SM/components/mpas-ocean/src/mode_forward/mpas_ocn_time_integration_split.F, lines 1512 to 1520 in abe445e
A. Master: Loop 1 is
A is non-BFB with B and C. Here we go with B, the fewest computations. AB2 goes with C, the more conservative approach: https://github.com/E3SM-Project/E3SM/pull/5989/files#r1380567559
@cbegeman -- I ran a 10-year B-case baseline, as well as a comparison test with this PR. They are apparently BFB. Does this surprise you? I was running with the ECwISC30to60E3r2 mesh and had
in user_nl_mpaso
@jonbob, Thanks for running that test. I think it's possible that these changes are compiler-specific. @mark-petersen and I were actually surprised that the changes were non-BFB in any configuration, so it's totally fine that they are BFB.
Fix split-explicit barotropic velocity accumulator
This PR includes a bugfix for the loop limits on the barotropic velocity (normalBarotropicVelocityNew) when its values are accumulated over barotropic subcycles. The loop should be over nEdgesOwned for consistency with the loop which divides the barotropic velocity by the number of subcycles.
Fixes #6040
[BFB] for all current E3SM tests, changes to a non-default option
passes:
merged to next
merged to master
Thanks @jonbob!