add timestamp to rpointer files #2757
base: master
Thanks Jim. Can this go onto b4b-dev, @ekluzek?
Thanks @jedwards4b. @wwieder yes, this totally makes sense as something coming into b4b-dev. Since it is backwards compatible, it doesn't need to be coordinated with other CESM tags or externals. So bringing it into b4b-dev and having it go into CTSM main-dev the next time a b4b-dev tag is made (in two weeks) makes a lot of sense.
I've run into an issue here. The clm_timemgr reads its clock information from the restart file on restart, which makes it hard to use the clock to find the restart file in the first place. Getting the clock from the restart file is also not a requirement, as the driver has already set the clock.
463bf55 to 25efa68
25efa68 to 9f07cf9
I have tested with ERS.ne30pg3_t232.BLT1850.derecho_intel.allactive-defaultio
We discussed this at the CTSM SE meeting this morning and decided it would be in our cesm3_0_beta04 tag, which fits with @jedwards4b's timeline.
Update surface datasets, CN Matrix, CLM60: excess ice on, explicit A/C on, crop calendars, Sturm snow, Leung dust emissions, prigent roughness data

Purpose and description of changes since ctsm5.2.005
----------------------------------------------------

Bring in updates needed for the CESM3.0 science capability/functionality "chill". Most importantly bringing in: CN Matrix to speed up spinup for the BGC model, updated surface datasets, updated Leung 2023 dust emissions, explicit Air Conditioning for the Urban model, updates to crop calendars. For clm6_0 physics these options are now default turned on in addition to Sturm snow, and excess ice.

Changes to CTSM Infrastructure:
===============================

- manage_externals removed and replaced by git-fleximod
- Ability to handle CAM7 in LND_TUNING_MODE

Changes to CTSM Answers:
========================

Changes to defaults for clm6_0 physics:
- Urban explicit A/C turned on
- Snow thermal conductivity is now Sturm_1997
- New IC file for f09 1850
- New crop calendars
- Dust emissions is now Leung_2023
- Excess ice is turned on
- Updates to MEGAN for BVOC's
- Updates to BGC fire method

Changes for all physics versions:
- Parameter files updated
- FATES parameter file updated
- Glacier region 1 is now undefined
- Update in FATES transient Land use
- Pass active glacier (CISM) runoff directly to river model (MOSART)
- Add the option for using matrix for Carbon/Nitrogen BGC spinup

New surface datasets:
=====================

- With new surface datasets the following GLC fields have region "1" set to UNSET: glacier_region_behavior, glacier_region_melt_behavior, glacier_region_ice_runoff_behavior
- Updates to allow creating transient landuse timeseries files going back to 1700.
- Fix an important bug on soil fields that was there since ctsm5.2.0. This results in mksurfdata_esmf now giving identical answers with a change in number of processors, as it should.
- Add in creation of ne0np4.POLARCAP.ne30x4 surface datasets.
- Add version to the surface datasets.
- Remove the --hires_pft option from mksurfdata_esmf as we don't have the datasets for it.
- Remove VIC fields from surface datasets.

New input datasets to mksurfdata_esmf:
======================================

- Updates in PFT/LAI/soil-color raw datasets (now from the TRENDY2024 timeseries that ends in 2023), as well as two fire datasets (AG fire, peatland), and the glacier behavior dataset.
Same as ctsm5.3.001. I made an accidental merge and reverted it.
We are going to do this as a standalone tag to master, so I'll rebase to master.
@jedwards4b this is great, thanks for getting this out there for us. I also saw some nice improvements you added (only reading the rpointer file on masterproc and catching some typos).
Some changes are required, and some others I think would be good to do since they should be easy. They are outlined in the code changes. Right now I'm planning on just doing those changes myself. Feel free to comment on any of it though.
The required change is to move the updates into lnd_comp_esmf.F90 for LILAC.
Thanks again for the PR.
@@ -1038,53 +1041,54 @@ subroutine ModelSetRunClock(gcomp, rc)
call ESMF_LogWrite(subname//'setting alarms for ' // trim(name), ESMF_LOGMSG_INFO)

!----------------
! Restart alarm
@jedwards4b here the order of setting the stop alarm and then restart alarm was switched. As far as I could see, there isn't a strict need to do this. I figured a preferred order might be stop first and then restart, maybe to be consistent elsewhere.
But, I wanted to make sure I wasn't missing anything. So is this a preferential change or one that's absolutely needed? Thanks in advance.
Actually there is a requirement to do this. When you request to write the restart at the end of the run, you need to know when the end of the run is so that you can set the restart alarm. By initializing the stop alarm first, I have the information I need to set the restart alarm.
Excellent, thanks for the explanation, that helps.
I'll add a comment about this then. And make sure the same is done in LILAC.
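The ordering requirement discussed above can be illustrated with a minimal Python sketch. This is not the ESMF/CTSM code: the function `set_alarms`, the helper `_offset`, and the option names `ndays`, `nhours`, and `end` are hypothetical stand-ins, and the sketch uses Python's standard (Gregorian) `datetime` rather than a model calendar. It only shows why the stop time has to be resolved before a restart alarm tied to the end of the run can be set.

```python
from datetime import datetime, timedelta


def _offset(option, n):
    """Translate a (hypothetical) alarm option and count into a time interval."""
    if option == "ndays":
        return timedelta(days=n)
    if option == "nhours":
        return timedelta(hours=n)
    raise ValueError(f"unsupported option: {option}")


def set_alarms(start, stop_option, stop_n, restart_option, restart_n=None):
    """Return (stop_time, restart_time), resolving the stop alarm first."""
    # Stop alarm first: the end of the run must be known before the restart alarm.
    stop_time = start + _offset(stop_option, stop_n)

    # Restart alarm second: "end" means "write the restart when the run stops",
    # which can only be expressed once stop_time exists.
    if restart_option == "end":
        restart_time = stop_time
    else:
        restart_time = start + _offset(restart_option, restart_n)
    return stop_time, restart_time


# Usage: a 5-day run that writes its restart at the end of the run.
stop, restart = set_alarms(datetime(2000, 1, 1), "ndays", 5, "end")
assert restart == stop
```

If the two steps were swapped, the `end` branch would have no stop time to copy, which is the dependency the comment in the code should document.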
! Initialize start date from restart info

start_date = TimeSetymd( rst_start_ymd, rst_start_tod, "start_date" )
! Check start date from restart info
The timemgr_spmdbcast and init_calendar calls above can also be removed, because this now requires timemgr_init to be called first. As such we should check that
timemgr_set == .true.
and abort if not.
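As a rough illustration of the suggested guard, here is a minimal Python sketch (not the Fortran in clm_time_manager). The class `TimeManager` and its methods are hypothetical; only the flag name `timemgr_set` comes from the comment above. The point is that the restart step refuses to run unless the init step has already set the flag.

```python
class TimeManager:
    """Toy stand-in for a time manager with an init-before-restart contract."""

    def __init__(self):
        self.timemgr_set = False  # becomes True only after init() has run
        self.calendar = None

    def init(self, calendar):
        """Set up the calendar and mark the time manager as initialized."""
        self.calendar = calendar
        self.timemgr_set = True

    def restart(self):
        """Abort if init() was not called first; otherwise reuse its setup."""
        if not self.timemgr_set:
            raise RuntimeError("restart called before init: timemgr_set is false")
        return self.calendar
```

With this check in place, the restart path no longer needs its own calendar setup or broadcast, because it can rely on what the init step already did.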
! Initialize clock

call init_clock( start_date, ref_date, curr_date)
At the end also remove
if (masterproc) call timemgr_print()
as it's already done in the earlier timemgr_init step. No reason to repeat it.
!---------------------------------------------------------------------------------
! Restart the ESMF time manager using the synclock for ending date.
!
The purpose of this subroutine is now just to do some checking, set a couple of variables, and do the advance.
Running aux_clm on Derecho, I'm seeing tons of tests passing (199), with only 12 pending, but 23 failing. LILAC fails as I expected, but a bunch of ERI tests, the SSP tests, one ERP, a few ERS, one REP, and a few SMS tests fail at the RUN phase.
@ekluzek I haven't yet merged the cime PR that you will need for these tests, are you using the branch?
I just merged it - try updating to cime6.1.47
Ahh, OK, thanks @jedwards4b! I'll update to that and see how it goes.
@ekluzek How is the testing going?
@jedwards4b I have a bunch of gnu tests with restarts on Derecho that fail early on in the driver. I think I've got the externals to what's needed (but let me know if you think I need to adjust something). For example this test fails: ERI_D_Ld9.f10_f10_mg37.I1850Clm45Bgc.derecho_gnu.clm-default
On Izumi, a few intel and gnu ER tests fail as well, but all of the nag tests fail due to a build issue. I'll look into what's going on there closer in a bit. If you could look at the issue for the Derecho gnu ER test above that would be great. Thanks in advance.
Requires ESCOMP/CESM_share#59, now share1.0.21. @ekluzek would you restart testing after updating these three externals and let me know how it goes? Thanks
@jedwards4b yep, I'm starting up testing now. By the way, shouldn't the share tagname be share1.1.6 rather than 1.0.21?
Good point - I'll fix the share tag name, please use share1.1.6
More tests are working on Izumi now, but several fail with the two issues below...

1.) ERP tests: a bunch of ERP tests fail at the build step with this:

However, even though the error message above warns against doing so, I can go into the test and run:
./case.build
./case.submit
and it seems to work fine.

2.) NEON tests: the NEON tests are now ALL failing because it can no longer find the NEON user-mod. For example, for the test:
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.izumi_nag.clm-NEON-MOAB--clm-PRISM
it gives the following error:
2024-12-09 18:34:41: Could not locate testmod 'NEON/MOAB'
Those are tests that had been working for a very long time.
On Derecho I have a ton of ERI, ERP, SSP, and NEON fails:

Specific problems I looked at...

1.) ERS test for a specific year: ERS_D_Ld5_Mmpi-serial.1x1_mexicocityMEX.I1PtClm60SpRs.derecho_gnu.clm-CLM1PTStartDate
atm.log:
The dated rpointer.cpl files ARE on disk though...
cesm.log:
med.log, drv.log, and lnd.log all look fine and don't report errors. However, the drv.log does seem to have the right year, as follows. So possibly there's a missing broadcast of year?
drv.log:

2.) ERI_D_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default

3.) ERP_D_Ld5.f10_f10_mg37.IHistClm60Sp.derecho_intel.clm-default
Failure looks similar to above

4.) SMS_D.f10_f10_mg37.I2000Clm60BgcCrop.derecho_nvhpc.clm-crop
Fails in build -- looks like it's a CTSM issue that I'll work on.

5.) NEON tests are probably the same as on Izumi

6.) SSP_D_Ld10.f10_f10_mg37.I1850Clm60Bgc.derecho_intel.clm-rtmColdSSP
This is a CTSM specific test type for doing a spinup. It fails on submit with the following:
@ekluzek @billsacks This is due to an inconsistency in the way we name test mods and user mods. For testmod_dirs we require the component name in the path, for example:
I think maybe the easiest solution is to add the component name in the user mods dir (or remove it from the test mods dir) to be consistent in the naming convention. I have tested this by moving the directories in
@ekluzek your sandbox in /glade/work/erik/ctsm_worktrees/quickfix has not been updated to the latest tags.
Thanks, @jedwards4b ! I think what you're saying is that, for a directory in the usermods_dirs space to be picked up via a test name, it would need to fall under a
@jedwards4b ahh, you are right, I didn't pull the latest update that I had pushed. Thanks for the correction. Resending those tests.
Once I changed the user mods path I got a ton of compiler errors from nag using test SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.izumi_nag.clm-NEON-MOAB--clm-PRISM. Changing the compiler to gnu allowed everything to pass.
@jedwards4b thanks for pointing this out. Three comments here...
First, I largely agree with @billsacks here that this is the best solution moving forward. Having "clm" in the name for BOTH testmods and usermods means you can know which component has that mod directory. It also keeps them consistent, which is good as well.
Second, the catch is that this will change behavior for the NEON and PLUMBER2 folks, and I want to run it by them first. I think we can convince them it's OK though.
Third, looking at that cime commit, it was in cime6.1.39, which was just after our latest CTSM tag ctsm5.3.014, which used cime6.1.37. Otherwise we should have seen it sooner, as our previous tags were at cime6.0.246 as well.
bf19cab32f984a5f2256b53af05f0ab63232bd95
So I'll check with our peeps and make sure this is OK, but we'll plan on that solution...
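To make the naming question concrete, here is a small Python sketch of how a testmods suffix could map onto a directory. It is not CIME's implementation. The functions `split_testmods` and `mod_dir` are hypothetical, and the mapping of `clm-NEON-MOAB` to component `clm` plus path `NEON/MOAB` is inferred only from the test name and the "Could not locate testmod 'NEON/MOAB'" error quoted earlier. It assumes the last dot-separated field of the test name is the testmods list and that mod directory names contain no hyphens of their own.

```python
from pathlib import PurePosixPath


def split_testmods(test_name):
    """Split the testmods field of a test name into (component, path) pairs."""
    field = test_name.split(".")[-1]      # e.g. "clm-NEON-MOAB--clm-PRISM"
    mods = []
    for entry in field.split("--"):       # "--" separates multiple mods
        component, _, rest = entry.partition("-")
        mods.append((component, rest.replace("-", "/")))
    return mods


def mod_dir(root, component, path, component_in_path=True):
    """Build the directory for a mod, with or without the component level."""
    base = PurePosixPath(root)
    if component_in_path:
        base = base / component
    return base / path


name = "SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.izumi_nag.clm-NEON-MOAB--clm-PRISM"
for component, path in split_testmods(name):
    # Convention under discussion: the component name is part of the directory.
    print(mod_dir("usermods_dirs", component, path))
```

Under the convention agreed on above, both the testmods and usermods trees would use the `component_in_path=True` layout, so a single lookup rule covers both.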
Description of changes
Adds a timestamp to rpointer files in a backward compatible manner
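A minimal Python sketch of what a backward compatible lookup could look like follows. It is not the code in this PR. The functions `rpointer_name` and `find_rpointer` are hypothetical, and the dated file name layout `rpointer.<comp>.YYYY-MM-DD-SSSSS` is an assumption for illustration. The idea is to prefer the timestamped pointer file for the current model time and fall back to the undated name when only that exists.

```python
from pathlib import Path


def rpointer_name(comp, ymd, tod):
    """Build a dated rpointer file name from a yyyymmdd date and seconds of day.

    The name layout is an assumption made for this sketch.
    """
    year, rem = divmod(ymd, 10000)
    month, day = divmod(rem, 100)
    return f"rpointer.{comp}.{year:04d}-{month:02d}-{day:02d}-{tod:05d}"


def find_rpointer(rundir, comp, ymd, tod):
    """Prefer the dated rpointer file; fall back to the undated legacy name."""
    dated = Path(rundir) / rpointer_name(comp, ymd, tod)
    if dated.exists():
        return dated
    legacy = Path(rundir) / f"rpointer.{comp}"
    if legacy.exists():
        return legacy
    raise FileNotFoundError(f"no rpointer file for {comp} in {rundir}")
```

Note that the lookup needs the current date before any restart file is opened, which is why the clock has to come from the driver rather than from the restart file itself.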
Specific notes
Contributors other than yourself, if any:
CTSM Issues Fixed (include github issue #):
Are answers expected to change (and if so in what way)?
no
Any User Interface Changes (namelist or namelist defaults changes)?
Does this create a need to change or add documentation? Did you do so?
Submodules updated: Needs at least the first one updated...
cime6.1.47
share1.1.5
cmeps1.0.26
Testing performed, if any: will do regular
Things to do: