-
Notifications
You must be signed in to change notification settings - Fork 317
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop running 0th time step #2084
base: master
Are you sure you want to change the base?
Conversation
Fix unit tests for nstep starting at 1 instead of 0
Thanks a lot for this work, @olyson ! Are there any pieces of this that you feel warrant a careful review - i.e., that you'd like someone else to think about (if so, I would probably ask someone else to do that so that I can stay focused on some ESMF work for now)- or are you comfortable enough with the changes you made that a quick look-over should be sufficient? Can you please go ahead and remove the !KO comments when you get a chance? |
From discussion today: @olyson feels reasonably confident that these changes are correct. @samsrabin would like us to run ctsm_sci - or at least his new test - to make sure that this change doesn't break that. |
We agreed to leave my comments in for now to help @ekluzek perform his review and understand the reasoning behind the changes, then remove them after that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just skimmed through some pieces of this (not a full review, but just some selective checks). Thank you very much for your careful work on this @olyson and especially for the comments describing your reasoning!
I did a quick check that the changes here are consistent with some specific things I raised in #925 . One question I had from skimming back through the comments in #925 is if you remember if you did the searches through the code that I mentioned in #925 (comment):
I think when I did my earlier search I looked for nstep == 0 but didn't look for similar things like nstep /= 0, nstep > 0, nstep >= 1, etc. We'll need to check for uses like that.
Full aux_clm test suite results (output of ./parse_cime.cs.status tests_0803-171434ch/cs.status -s): Test summary The pending test is EXPECTED as is the failed test. I'm not familiar with those two tests. Maybe someone will know why those might pass? Otherwise, I will look into it. |
Thanks @olyson ! It makes sense that the FUNIT test doesn't show BASELINE comparison failures: that is a weird test that just runs the unit tests, so doesn't actually do baseline comparisons. As for the PFS test: I think it might not actually produce any history files (by design)... so again, it makes sense that there are no BASELINE failures for that since there is nothing to compare. |
Yep, I don't see any history files for the PFS test, thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main thing I thought here is that the commented out code and the !KO type comments can be removed now.
There's also comments about things like "I don't think this is needed anymore", or "I think this is still needed". I think it would be worth doing some testing to check if that's true. We can just make the code as it is now as the baseline to compare to and make sure it gives identical answers with some of these things changed. @slevis-lmwg and @olyson thoughts on doing that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh, @slevis-lmwg already took care of the !KO comments. So the only thing for my review is if we should validate some of those questions about "I think this is still needed...", or "I think this isn't needed"...
Keith recommends:
|
src/main/accumulMod.F90
Outdated
@@ -595,7 +595,7 @@ subroutine update_accum_field_timeavg(this, level, nstep, field) | |||
|
|||
do k = begi,endi | |||
effective_nstep = nstep - this%ndays_reset_shifted(k,level) | |||
time_to_reset = (mod(effective_nstep,this%period) == 1 .or. this%period == 1) .and. effective_nstep /= 0 | |||
time_to_reset = mod(effective_nstep,this%period) == 1 .or. this%period == 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@samsrabin this is how I handled the new version of the code, as I mentioned in my question to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I'm pretty sure this change should be reverted.
effective_nstep
refers not to the number of timesteps since run start/restart, but rather to the number of timesteps since the accumulator field was initialized/reset.
However, before you revert the change, could you check whether the unit tests in test_accumul.pf
pass? If so, I think that indicates a gap in the testing coverage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @samsrabin I will revert the change.
I need to request a tutorial on running test_accumul.pf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, it's just run as part of the unit tests. So if those are passing then I need to refine them.
…e_bounds-ssr Handle "instantaneous files" in RXCROPMATURITY analysis script
derecho testing
izumi testing
|
A mosart nvhpc test was introduced in mosart1.1.03 and has been failing ever since in the build phase: DPFCT_ROCK is different in the base.cprnc.out file of
Nothing else is different, so I'm guessing this constant needs a correction at initialization. |
Following up on the DPFCT_ROCK difference in the base.cprnc.out file.
239589c239589
< 0.486517470277726, 0.435177136876831, 0.411865850359667,
---
> 0.463483699076243, 0.435177136876831, 0.411865850359667, Same diff when I compare ...asc.base to ...asc.rest (because the ...asc and ...asc.rest files are the same) |
In the latest #2052 aux_clm where I included this branch so as to perform cumulative testing, the same test passes from what I can tell, and I posted my interpretation in that PR. I'm starting manual restart testing here to try and troubleshoot, and to assess whether I'm misinterpreting this test failure:
I have not reproduced the problem, yet. Summary of my understanding from the test failure this far:
|
Description of changes
Stop running 0th time step.
Specific notes
See discussion in #925
DONE Note that "!KO" comments are still to be removed.
Contributors other than yourself, if any: @billsacks
CTSM Issues Fixed (include github issue #): #925
Are answers expected to change (and if so in what way)? Yes, more than roundoff.
Any User Interface Changes (namelist or namelist defaults changes)? No
Testing performed, if any:
See discussion in #925