add timestamp to rpointer files #2757

Open
wants to merge 16 commits into
base: master
from

Conversation

jedwards4b
Contributor

@jedwards4b jedwards4b commented Sep 12, 2024

Description of changes

Adds a timestamp to rpointer files in a backward-compatible manner

Specific notes

Contributors other than yourself, if any:

CTSM Issues Fixed (include github issue #):

Are answers expected to change (and if so in what way)?
no
Any User Interface Changes (namelist or namelist defaults changes)?

Does this create a need to change or add documentation? Did you do so?

Submodules updated: Needs at least the first one updated...
cime6.1.47
share1.1.5
cmeps1.0.26

Testing performed, if any: will do regular

Things to do:

  • Get externals and testing working
  • Add a user writeup about this to ChangeLog
  • Fix LILAC
  • Code review changes

@wwieder
Contributor

wwieder commented Sep 12, 2024

Thanks Jim. Can this go onto b4bdev, @ekluzek ?

@ekluzek ekluzek self-assigned this Sep 12, 2024
@ekluzek ekluzek added enhancement new capability or improved behavior of existing capability usability Improve or clarify user-facing options labels Sep 12, 2024
@ekluzek
Collaborator

ekluzek commented Sep 12, 2024

Thanks @jedwards4b.

@wwieder yes, this totally makes sense as something coming into b4b-dev. Since it is backwards compatible, it doesn't need to be coordinated with other CESM tags or externals. So bringing it into b4b-dev and having it go into CTSM main-dev the next time a b4b-dev tag is made (in two weeks) makes a lot of sense.

@jedwards4b jedwards4b marked this pull request as draft September 19, 2024 20:05
@jedwards4b
Contributor Author

I've run into an issue here. On restart, clm_timemgr reads its clock information from the restart file, which makes it hard to use the clock to find the restart file in the first place. There is also no requirement to get the clock from the restart file, as the driver has already set it.

@wwieder wwieder added this to the cesm3_0_beta04 milestone Sep 26, 2024
@jedwards4b jedwards4b force-pushed the add_timestamp_to_rpointers branch from 463bf55 to 25efa68 Compare September 26, 2024 20:57
@jedwards4b jedwards4b marked this pull request as ready for review September 26, 2024 20:59
@jedwards4b jedwards4b force-pushed the add_timestamp_to_rpointers branch from 25efa68 to 9f07cf9 Compare September 26, 2024 21:15
@jedwards4b
Contributor Author

I have tested with ERS.ne30pg3_t232.BLT1850.derecho_intel.allactive-defaultio
and plan to do a complete set of cesm prealpha tests.

@samsrabin
Collaborator

We discussed this at the CTSM SE meeting this morning and decided it would be in our cesm3_0_beta04 tag, which fits with @jedwards4b's timeline.

Update surface datasets, CN Matrix, CLM60: excess ice on, explicit A/C on, crop calendars, Sturm snow, Leung dust emissions, prigent roughness data

Purpose and description of changes since ctsm5.2.005
----------------------------------------------------

Bring in updates needed for the CESM3.0 science capability/functionality "chill". Most importantly bringing
in: CN Matrix to speed up spinup for the BGC model, updated surface datasets, updated Leung 2023 dust emissions,
explicit Air Conditioning for the Urban model, updates to crop calendars. For clm6_0 physics these options are now
default turned on in addition to Sturm snow, and excess ice.

Changes to CTSM Infrastructure:
===============================

 - manage_externals removed and replaced by git-fleximod
 - Ability to handle CAM7 in LND_TUNING_MODE

Changes to CTSM Answers:
========================

 Changes to defaults for clm6_0 physics:
  - Urban explicit A/C turned on
  - Snow thermal conductivity is now Sturm_1997
  - New IC file for f09 1850
  - New crop calendars
  - Dust emissions is now Leung_2023
  - Excess ice is turned on
  - Updates to MEGAN for BVOC's
  - Updates to BGC fire method

 Changes for all physics versions:

  - Parameter files updated
  - FATES parameter file updated
  - Glacier region 1 is now undefined
  - Update in FATES transient Land use
  - Pass active glacier (CISM) runoff directly to river model (MOSART)
  - Add the option for using matrix for Carbon/Nitrogen BGC spinup

New surface datasets:
=====================

- With new surface datasets the following GLC fields have region "1" set to UNSET:
     glacier_region_behavior, glacier_region_melt_behavior, glacier_region_ice_runoff_behavior
- Updates to allow creating transient landuse timeseries files going back to 1700.
- Fix an important bug on soil fields that was there since ctsm5.2.0. This results in mksurfdata_esmf now giving identical answers with a change in number of processors, as it should.
- Add in creation of ne0np4.POLARCAP.ne30x4 surface datasets.
- Add version to the surface datasets.
- Remove the --hires_pft option from mksurfdata_esmf as we don't have the datasets for it.
- Remove VIC fields from surface datasets.

New input datasets to mksurfdata_esmf:
======================================

- Updates in PFT/LAI/soil-color raw datasets (now from the TRENDY2024 timeseries that ends in 2023), as well as two fire datasets (AG fire, peatland), and the glacier behavior dataset.
Same as ctsm5.3.001

I made an accidental merge and reverted it.
@ekluzek
Collaborator

ekluzek commented Dec 3, 2024

We are going to do this as a standalone tag to master, so I'll rebase to master.

@ekluzek ekluzek changed the base branch from b4b-dev to master December 4, 2024 16:53
Collaborator

@ekluzek ekluzek left a comment

@jedwards4b this is great, thanks for getting this out there for us. I saw you added some nice improvements (only reading the rpointer file on masterproc, and catching some typos), which is great.

There are some changes that are required, and some I think would be good to do as they should be easy. They are outlined in the code changes. Right now I'm planning on just doing those changes. Feel free to comment on any of it though.

The required change is to move the updates into lnd_comp_esmf.F90 for LILAC.

Thanks again for the PR.

src/main/restFileMod.F90 (resolved)
src/utils/clm_time_manager.F90 (resolved)
src/utils/clm_time_manager.F90 (resolved)
src/cpl/nuopc/lnd_comp_nuopc.F90 (resolved)
@@ -1038,53 +1041,54 @@ subroutine ModelSetRunClock(gcomp, rc)
call ESMF_LogWrite(subname//'setting alarms for ' // trim(name), ESMF_LOGMSG_INFO)

!----------------
! Restart alarm
Collaborator

@jedwards4b here the order of setting the stop alarm and then restart alarm was switched. As far as I could see, there isn't a strict need to do this. I figured a preferred order might be stop first and then restart, maybe to be consistent elsewhere.

But, I wanted to make sure I wasn't missing anything. So is this a preferential change or one that's absolutely needed? Thanks in advance.

Contributor Author

Actually there is a requirement to do this. When you request a restart write at the end of the run, you need to know when the end of the run is so that you can set the restart alarm. By initializing the stop alarm first, I have the information I need to set the restart alarm.
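
The ordering dependency described here can be sketched in a few lines. This is only an illustration in Python, not the ESMF/NUOPC code in lnd_comp_nuopc.F90; the function name `compute_alarm_times` and the option strings `"end"` and `"ndays"` are hypothetical stand-ins chosen for the example.

```python
from datetime import datetime, timedelta


def compute_alarm_times(start, stop_n_days, restart_option, restart_n_days=None):
    """Return (stop_time, restart_time) for a run starting at `start`.

    Hypothetical helper: the names and option strings are assumptions,
    not the actual driver interface.
    """
    # Step 1: the stop alarm. Its ring time is an input to step 2.
    stop_time = start + timedelta(days=stop_n_days)

    # Step 2: the restart alarm. For an end-of-run restart it simply
    # reuses the stop time, which is why the stop alarm must come first.
    if restart_option == "end":
        restart_time = stop_time
    elif restart_option == "ndays":
        if restart_n_days is None:
            raise ValueError("restart_n_days is required for 'ndays'")
        restart_time = start + timedelta(days=restart_n_days)
    else:
        raise ValueError("unsupported restart_option: {!r}".format(restart_option))
    return stop_time, restart_time
```

With the order reversed, the end-of-run branch would have nothing to copy the restart time from.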

Collaborator

Excellent, thanks for the explanation, that helps.

I'll add a comment about this then. And make sure the same is done in LILAC.

! Initialize start date from restart info

start_date = TimeSetymd( rst_start_ymd, rst_start_tod, "start_date" )
! Check start date from restart info
Collaborator

The timemgr_spmdbcast and init_calendar calls above can also be removed, because this now requires timemgr_init to be called first. As such we should check that

timemgr_set == .true.

and abort if not.

! Initialize clock

call init_clock( start_date, ref_date, curr_date)

Collaborator

At the end also remove

if (masterproc) call timemgr_print()

As it's already done in the timemgr_init step previously. No reason to repeat.


!---------------------------------------------------------------------------------
! Restart the ESMF time manager using the synclock for ending date.
!

Collaborator

The purpose of this subroutine is now just to do some checking, set a couple of variables, and do the advance.

@ekluzek
Collaborator

ekluzek commented Dec 4, 2024

Running aux_clm on Derecho, I'm seeing most tests passing (199), with only 12 pending, but 23 failing. LILAC fails as I expected, but a bunch of ERI tests, the SSP tests, one ERP, a few ERS, one REP, and a few SMS tests also fail at the RUN phase.

@jedwards4b
Contributor Author

@ekluzek I haven't yet merged the cime PR that you will need for these tests, are you using the branch?

@jedwards4b
Contributor Author

I just merged it - try updating to cime6.1.47

@ekluzek
Collaborator

ekluzek commented Dec 4, 2024

Ahh, OK, thanks @jedwards4b! I'll update to that and see how it goes.

@ekluzek ekluzek added the next this should get some attention in the next week or two. Normally each Thursday SE meeting. label Dec 5, 2024
@samsrabin samsrabin removed the next this should get some attention in the next week or two. Normally each Thursday SE meeting. label Dec 5, 2024
@jedwards4b
Contributor Author

@ekluzek How is the testing going?

@ekluzek
Collaborator

ekluzek commented Dec 6, 2024

@jedwards4b I have a bunch of gnu tests with restarts on Derecho that fail early on in the driver. I think I've got the externals to what's needed (but let me know if you think I need to adjust something). For example this test fails:

ERI_D_Ld9.f10_f10_mg37.I1850Clm45Bgc.derecho_gnu.clm-default

cesm.log:
 cat /glade/derecho/scratch/erik/ERI_D_Ld9.f10_f10_mg37.I1850Clm45Bgc.derecho_gnu.clm-default.20241205_143241_jvy5z9/run/cesm.log.7074030.desched1.241205-143905
dec2323.hsn.de.hpc.ucar.edu 0:  (t_initf) Read in prof_inparm namelist from: drv_in
dec2323.hsn.de.hpc.ucar.edu 0:  (t_initf) Using profile_disable=          F
dec2323.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_timer=                      4
dec2323.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_depth_limit=               12
dec2323.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_detail_limit=               2
dec2323.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_barrier=          F
dec2323.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_outpe_num=                  1
dec2323.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_outpe_stride=               0
dec2323.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_single_file=      F
dec2323.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_global_stats=     T
dec2323.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_ovhd_measurement= F
dec2323.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_add_detail=       F
dec2323.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_papi_enable=      F
dec2323.hsn.de.hpc.ucar.edu 0:  ESMF_Finalize: Error closing trace stream
dec2323.hsn.de.hpc.ucar.edu 0: MPICH ERROR [Rank 0] [job id c8fac6fd-8a30-4e86-8c34-bdc808fa56f2] [Thu Dec  5 14:39:19 2024] [dec2323] - Abort(1) (rank 0 in comm 496): application called MPI_Abort(comm=0x84000002, 1) - process 0
dec2323.hsn.de.hpc.ucar.edu 0: 
dec2323.hsn.de.hpc.ucar.edu 0: 
dec2323.hsn.de.hpc.ucar.edu 0: Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
dec2323.hsn.de.hpc.ucar.edu 0: 
dec2323.hsn.de.hpc.ucar.edu 0: Backtrace for this error:
dec2323.hsn.de.hpc.ucar.edu 0: #0  0x150096deed4f in ???
dec2323.hsn.de.hpc.ucar.edu 0: 	at /usr/src/debug/glibc-2.31-150300.41.1.x86_64/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
dec2323.hsn.de.hpc.ucar.edu 0: #1  0x15009990b0ee in ???
dec2323.hsn.de.hpc.ucar.edu 0: #2  0x15009971949e in ???
dec2323.hsn.de.hpc.ucar.edu 0: #3  0x150097d46927 in ???
dec2323.hsn.de.hpc.ucar.edu 0: #4  0x1500a18cde5d in _ZN5ESMCI3VMK5abortEv
dec2323.hsn.de.hpc.ucar.edu 0: 	at /glade/derecho/scratch/jedwards/tmp/spack-stage/spack-stage-esmf-8.6.0-bsogfa4e7dreitxbwm4gbppisw5q4x2t/spack-src/src/Infrastructure/VM/src/ESMCI_VMKernel.C:863
dec2323.hsn.de.hpc.ucar.edu 0: #5  0x1500a18c478f in _ZN5ESMCI2VM5abortEPi
dec2323.hsn.de.hpc.ucar.edu 0: 	at /glade/derecho/scratch/jedwards/tmp/spack-stage/spack-stage-esmf-8.6.0-bsogfa4e7dreitxbwm4gbppisw5q4x2t/spack-src/src/Infrastructure/VM/src/ESMCI_VM.C:3634
dec2323.hsn.de.hpc.ucar.edu 0: #6  0x1500a18f211e in c_esmc_vmabort_
dec2323.hsn.de.hpc.ucar.edu 0: 	at /glade/derecho/scratch/jedwards/tmp/spack-stage/spack-stage-esmf-8.6.0-bsogfa4e7dreitxbwm4gbppisw5q4x2t/spack-src/src/Infrastructure/VM/interface/ESMCI_VM_F.C:1252
dec2323.hsn.de.hpc.ucar.edu 0: #7  0x1500a2766c8b in __esmf_vmmod_MOD_esmf_vmabort
dec2323.hsn.de.hpc.ucar.edu 0: 	at /glade/derecho/scratch/jedwards/tmp/spack-stage/spack-stage-esmf-8.6.0-bsogfa4e7dreitxbwm4gbppisw5q4x2t/spack-src/src/Infrastructure/VM/interface/ESMF_VM.F90:9521
dec2323.hsn.de.hpc.ucar.edu 0: #8  0x1500a24856a9 in __esmf_initmod_MOD_esmf_finalize
dec2323.hsn.de.hpc.ucar.edu 0: 	at /glade/derecho/scratch/jedwards/tmp/spack-stage/spack-stage-esmf-8.6.0-bsogfa4e7dreitxbwm4gbppisw5q4x2t/spack-src/src/Superstructure/ESMFMod/src/ESMF_Init.F90:1682

On Izumi, a few intel and gnu ER tests fail as well, but all of the nag tests fail due to a build issue. I'll look more closely into what's going on there in a bit.

If you could look at the issue for the Derecho gnu ER test above that would be great. Thanks in advance.

@jedwards4b
Contributor Author

jedwards4b commented Dec 9, 2024

Requires:
ESCOMP/CESM_share#59 (now share1.0.21)
ESMCI/cime#4713 (now cime6.1.49)
ESCOMP/CMEPS#518 (now cmeps1.0.31)

@ekluzek would you restart testing after updating these three externals and let me know how it goes? Thanks

@ekluzek
Collaborator

ekluzek commented Dec 9, 2024

@jedwards4b yep, I'm starting up testing now.

By the way, shouldn't the share tag name be share1.1.6 rather than 1.0.21?

@jedwards4b
Contributor Author

Good point - I'll fix the share tag name, please use share1.1.6

@ekluzek
Collaborator

ekluzek commented Dec 10, 2024

More tests are working on Izumi now, but several fail with the two issues below...

1.) ERP tests:

A bunch of ERP tests fail at the build step with this:

Command: ./case.build --sharedlib-only
Output: WARNING: Found difference in test REST_OPTION: case: ndays original value $STOP_OPTION
 Successfully created new case ERP_D_Ld5_P48x1.f10_f10_mg37.I1850Clm50Bgc.izumi_nag.clm-ciso.GC.ctsm5214rpointeracl_nag from clone case ERP_D_Ld5_P48x1.f10_f10_mg37.I1850Clm50Bgc.izumi_nag.clm-ciso.GC.ctsm5214rpointeracl_nag 
Setting resource.RLIMIT_STACK to -1 from (-1, -1)
Setting resource.RLIMIT_STACK to -1 from (-1, -1)
Building test for ERP in directory /scratch/cluster/erik/tests_ctsm5214rpointeracl/ERP_D_Ld5_P48x1.f10_f10_mg37.I1850Clm50Bgc.izumi_nag.clm-ciso.GC.ctsm5214rpointeracl_nag
WARNING: Test case setup failed. Case2 has been removed, but the main case may be in an inconsistent state. If you want to rerun this test, you should create a new test rather than trying to rerun this one.
Traceback (most recent call last):
  File "./case.build", line 267, in <module>
    _main_func(__doc__)
  File "./case.build", line 226, in _main_func
    test = find_system_test(testname, case)(case)
  File "/fs/cgd/data0/erik/ctsm_worktree/quickfix/cime/CIME/SystemTests/erp.py", line 29, in __init__
    **kwargs
  File "/fs/cgd/data0/erik/ctsm_worktree/quickfix/cime/CIME/SystemTests/restart_tests.py", line 30, in __init__
    **kwargs
  File "/fs/cgd/data0/erik/ctsm_worktree/quickfix/cime/CIME/SystemTests/system_tests_compare_two.py", line 146, in __init__
    self._setup_cases_if_not_yet_done()
  File "/fs/cgd/data0/erik/ctsm_worktree/quickfix/cime/CIME/SystemTests/system_tests_compare_two.py", line 450, in _setup_cases_if_not_yet_done
    self._setup_cases()
  File "/fs/cgd/data0/erik/ctsm_worktree/quickfix/cime/CIME/SystemTests/system_tests_compare_two.py", line 540, in _setup_cases
    self._case_one_setup()
  File "/fs/cgd/data0/erik/ctsm_worktree/quickfix/cime/CIME/SystemTests/restart_tests.py", line 36, in _case_one_setup
    self._set_restart_interval()
  File "/fs/cgd/data0/erik/ctsm_worktree/quickfix/cime/CIME/SystemTests/system_tests_common.py", line 198, in _set_restart_interval
    startdatetime = datetime.fromisoformat(startdate) + timedelta(
AttributeError: type object 'datetime.datetime' has no attribute 'fromisoformat'

 ---------------------------------------------------

However, I can go into the test and run:

./case.build
./case.submit

and it seems to work fine, even though the error message above warns against doing that.
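
The AttributeError in the traceback above is consistent with a Python older than 3.7, since `datetime.fromisoformat` was added in Python 3.7. A minimal sketch of a version-tolerant parse follows. It assumes the start date is a plain `YYYY-MM-DD` string; the helper name `parse_startdate` is hypothetical and this is not necessarily how CIME handles it.

```python
from datetime import datetime


def parse_startdate(startdate):
    """Parse a 'YYYY-MM-DD' string into a datetime.

    Hypothetical helper: uses datetime.fromisoformat where available
    (Python 3.7+) and falls back to strptime on older interpreters.
    """
    if hasattr(datetime, "fromisoformat"):
        return datetime.fromisoformat(startdate)
    # Assumption: the date has no time-of-day component.
    return datetime.strptime(startdate, "%Y-%m-%d")
```

Both branches return the same value for a date-only string, so the fallback only matters on interpreters that lack `fromisoformat`.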

2.) NEON tests:

Also, the NEON tests are now ALL failing because they can no longer find the NEON user-mod. For example, the test SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.izumi_nag.clm-NEON-MOAB--clm-PRISM gives the following error:

2024-12-09 18:34:41: Could not locate testmod 'NEON/MOAB'

Those are tests that had been working for a very long time.

@ekluzek
Collaborator

ekluzek commented Dec 10, 2024

On Derecho I have a ton of ERI, ERP, SSP, and NEON failures:

ERI_C2_Ld9.f10_f10_mg37.I2000Clm60BgcCrop.derecho_gnu.clm-default	(NLCOMP RUN)		
ERI_D.ne30pg3_t232.I1850Clm60BgcCropG.derecho_intel.clm-clm60cam7LndTuningModeLDust	(NLCOMP RUN)		
ERI_D_Ld9.f10_f10_mg37.I1850Clm45Bgc.derecho_gnu.clm-default	(NLCOMP RUN)		
ERI_D_Ld9.f10_f10_mg37.I1850Clm60Bgc.derecho_gnu.clm-default	(NLCOMP RUN)		
ERI_D_Ld9.f10_f10_mg37.I1850Clm60Bgc.derecho_gnu.clm-default--clm-matrixcnOn	(NLCOMP RUN)		
ERI_D_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-default	(NLCOMP RUN)		
ERI_D_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default	(NLCOMP RUN)		
ERI_D_Ld9.ne30_g17.I2000Clm50BgcCru.derecho_intel.clm-vrtlay	(NLCOMP RUN)		
ERI_D_Ld9.ne30_g17.I2000Clm50BgcCru.derecho_intel.clm-vrtlay--clm-matrixcnOn	(NLCOMP RUN)		
ERI_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-default	(NLCOMP RUN)		
ERI_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-drydepnomegan	(NLCOMP RUN)		
ERI_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default	(NLCOMP RUN)		
ERI_Ld9.f45_g37.I2000Clm50BgcCru.derecho_intel.clm-nofire	(NLCOMP RUN)		
ERI_Ld9.f45_g37.I2000Clm50BgcCru.derecho_intel.clm-nofire--clm-matrixcnOn	(NLCOMP RUN)		
ERP_D.f10_f10_mg37.IHistClm60Bgc.derecho_gnu.clm-decStart	(NLCOMP RUN)		
ERP_D.f10_f10_mg37.IHistClm60Bgc.derecho_gnu.clm-decStart--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_D.f10_f10_mg37.IHistClm60Bgc.derecho_intel.clm-decStart	(NLCOMP RUN)		
ERP_D_Ld10.f10_f10_mg37.I1850Clm60BgcCrop.derecho_intel.clm-ADspinup	(NLCOMP RUN)		
ERP_D_Ld10_P64x2.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-ciso_decStart	(NLCOMP RUN)		
ERP_D_Ld10_P64x2.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-ciso_decStart--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_D_Ld10_P64x2.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_D_Ld10_P64x2.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-default--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_D_Ld3_P64x2.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-default	(NLCOMP RUN)		
ERP_D_Ld3_P64x2.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_D_Ld3_PS.f09_g17.I2000Clm50Sp.derecho_intel.clm-prescribed	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I1850Clm50Bgc.derecho_gnu.clm-drydepnomegan	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I1850Clm50BgcCropG.derecho_gnu.clm-glcMEC_changeFlags	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-ciso_flexCN_FUN	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-ciso_flexCN_FUN--clm-matrixcnOn	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-fire_emis	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-anoxia	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-anoxia--clm-matrixcnOn	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I2000Clm50Sp.derecho_gnu.clm-reduceOutput	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I2000Clm50Sp.derecho_intel.clm-reduceOutput	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.I2000Clm60Sp.derecho_intel.clm-decStart	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.IHistClm45Sp.derecho_intel.clm-decStart	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.IHistClm50BgcCrop.derecho_intel.clm-allActive	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.IHistClm50BgcCrop.derecho_intel.clm-allActive--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.IHistClm50SpCru.derecho_gnu.clm-drydepnomegan--clm-nofireemis	(NLCOMP RUN)		
ERP_D_Ld5.f10_f10_mg37.IHistClm60Sp.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_D_Ld5.ne30pg3_t232.IHistClm60Sp.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_D_Ld9.ne30pg3_t232.I1850Clm60BgcCropG.derecho_intel.clm-clm60cam6LndTuningMode	(NLCOMP RUN)		
ERP_D_Ld9.ne30pg3_t232.I1850Clm60BgcCropG.derecho_intel.clm-clm60cam7LndTuningModeLDust	(NLCOMP RUN)		
ERP_D_Ld9.ne30pg3_t232.IHistClm60BgcCropG.derecho_intel.clm-clm60cam7LndTuningModeLDust	(NLCOMP RUN)		
ERP_D_P128x1_Ld26.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-crop--clm-midDecStart--clm-RxCropCalsAdaptGGCMI	(NLCOMP RUN)		
ERP_D_P64x2_Ld10.f10_f10_mg37.I2000Clm60Bgc.derecho_intel.clm-Hillslope	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.derecho_gnu.clm-default	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.derecho_intel.clm-default--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I1850Clm60BgcCrop.derecho_gnu.clm-mimics	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm45BgcCrop.derecho_gnu.clm-no_subgrid_fluxes	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-default	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_gnu.clm-snowveg_norad	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-cn_conly	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-flexCN_FUN	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-flexCN_FUN--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-luna	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-noFUN_flexCN	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-noFUN_flexCN--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm60BgcCrop.derecho_intel.clm-coldStart	(NLCOMP RUN)		
ERP_D_P64x2_Ld3.f10_f10_mg37.I2000Clm60BgcCrop.derecho_intel.clm-coldStart--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_D_P64x2_Ld30.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_D_P64x2_Ld5.f10_f10_mg37.I2000Clm50BgcCropRtm.derecho_intel.clm-irrig_spunup	(NLCOMP RUN)		
ERP_D_P64x2_Ld5.f10_f10_mg37.I2000Clm60BgcCrop.derecho_intel.clm-irrig_spunup	(NLCOMP RUN)		
ERP_Ld5.f10_f10_mg37.I1850Clm50Bgc.derecho_gnu.clm-default	(NLCOMP RUN)		
ERP_Ld5.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_Ld9.f45_f45_mg37.I2000Clm50FatesRs.derecho_intel.clm-FatesColdAllVars	(NLCOMP RUN)		
ERP_Ld9.f45_g37.I2000Clm60Bgc.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_Ly3_P64x2.f10_f10_mg37.IHistClm50BgcCrop.derecho_intel.clm-cropMonthOutput	(NLCOMP RUN)		
ERP_Ly3_P64x2.f10_f10_mg37.IHistClm60BgcCrop.derecho_intel.clm-cropMonthOutput--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_P128x2_Ld30.f45_f45_mg37.I2000Clm60FatesSpCruRsGs.derecho_intel.clm-FatesColdSatPhen	(NLCOMP RUN)		
ERP_P256x2_D_Ld5.f19_g17_gris4.I1850Clm50BgcCropG.derecho_intel.clm-glcMEC_increase	(NLCOMP RUN)		
ERP_P256x2_Ld30.f45_f45_mg37.I2000Clm60FatesRs.derecho_intel.clm-mimicsFatesCold	(NLCOMP RUN)		EXPECTED (RUN)
ERP_P64x2_D.f10_f10_mg37.I2000Clm50SpRtmFl.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_P64x2_D_Ld10.f10_f10_mg37.IHistClm50SpG.derecho_intel.clm-glcMEC_decrease--clm-nofireemis	(NLCOMP RUN)		
ERP_P64x2_D_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.derecho_gnu.clm-extra_outputs	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm45BgcCrop.derecho_intel.clm-crop	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm45BgcCru.derecho_intel.clm-ciso	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm45BgcCru.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.clm-ciso	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I1850Clm50Bgc.derecho_intel.clm-ciso--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I2000Clm45Sp.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I2000Clm50Sp.derecho_gnu.clm-default	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I2000Ctsm50NwpBgcCropGswp.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.I2000Ctsm50NwpSpGswp.derecho_intel.clm-default	(NLCOMP RUN)		
ERP_P64x2_D_Ld5.f10_f10_mg37.IHistClm45BgcCru.derecho_intel.clm-decStart	(NLCOMP RUN)		
ERP_P64x2_Ld1096.f10_f10_mg37.I2000Clm50BgcCrop.derecho_intel.clm-clm50cropIrrigMonth_interp	(NLCOMP RUN)		
ERP_P64x2_Ld1096.f10_f10_mg37.I2000Clm50BgcCrop.derecho_intel.clm-irrig_o3falk_reduceOutput	(NLCOMP RUN)		
ERP_P64x2_Ld366.f10_f10_mg37.I2000Clm50BgcCrop.derecho_intel.clm-irrig_alternate_monthly	(NLCOMP RUN)		
ERP_P64x2_Ld396.f10_f10_mg37.IHistClm60Bgc.derecho_gnu.clm-monthly	(NLCOMP RUN)		
ERP_P64x2_Ld396.f10_f10_mg37.IHistClm60Bgc.derecho_intel.clm-monthly	(NLCOMP RUN)		
ERP_P64x2_Ld396.f10_f10_mg37.IHistClm60Bgc.derecho_intel.clm-monthly--clm-matrixcnOn_ignore_warnings	(NLCOMP RUN)		
ERP_P64x2_Ld762.f10_f10_mg37.I2000Clm60BgcCrop.derecho_intel.clm-monthly	(NLCOMP RUN)		
ERS_D_Ld20.f45_f45_mg37.I2000Clm50FatesRs.derecho_intel.clm-FatesColdTwoStream	(NLCOMP COMPARE_base_rest)		EXPECTED (COMPARE_base_rest)
ERS_D_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.derecho_gnu.clm-default	(NLCOMP RUN)		
ERS_D_Ld5_Mmpi-serial.1x1_mexicocityMEX.I1PtClm60SpRs.derecho_gnu.clm-CLM1PTStartDate	(NLCOMP RUN)		
ERS_D_Ld6.f10_f10_mg37.I1850Clm45BgcCrop.derecho_gnu.clm-clm50CMIP6frc	(NLCOMP RUN)		
ERS_D_Mmpi-serial_Ld5.1x1_brazil.I2000Clm50FatesRs.derecho_gnu.clm-FatesCold	(NLCOMP RUN)		
ERS_D_Mmpi-serial_Ld5.5x5_amazon.I2000Clm50FatesRs.derecho_gnu.clm-FatesCold	(NLCOMP RUN)		
ERS_L761.1x1_smallvilleIA.IHistClm50BgcCropQianRs.derecho_gnu.clm-smallville_dynurban_monthly	(XML)		
ERS_Ld3_D.f10_f10_mg37.I1850Clm50BgcCrop.derecho_gnu.clm-rad_hrly_light_res_half	(NLCOMP RUN)		
ERS_Ld765.1x1_smallvilleIA.IHistClm50BgcCropQianRs.derecho_gnu.clm-smallville_dynlakes_monthly	(NLCOMP RUN)		
ERS_P128x1_Ld762.f10_f10_mg37.I2000Clm60Fates.derecho_intel.clm-FatesColdNoComp	(NLCOMP RUN)		
LILACSMOKE_D_Ld2.f10_f10_mg37.I2000Ctsm50NwpSpAsRs.derecho_intel.clm-lilac	(NLCOMP MODEL_BUILD)		
REP_P64x2_Ld396.f10_f10_mg37.IHistClm60Bgc.derecho_intel.clm-monthly--clm-matrixcnOn_ignore_warnings	(NLCOMP COMPARE_base_rep2 BASELINE)		
SMS_D.f10_f10_mg37.I2000Clm60BgcCrop.derecho_nvhpc.clm-crop	(SHAREDLIB_BUILD NLCOMP)		EXPECTED (RUN)
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.derecho_gnu.clm-NEON-MOAB--clm-PRISM	(CREATE_NEWCASE)		EXPECTED (SHAREDLIB_BUILD RUN)
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.derecho_gnu.clm-default--clm-NEON-HARV	(CREATE_NEWCASE)		EXPECTED (SHAREDLIB_BUILD)
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.derecho_gnu.clm-default--clm-NEON-HARV--clm-matrixcnOn	(CREATE_NEWCASE)		
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Fates.derecho_gnu.clm-FatesPRISM--clm-NEON-FATES-YELL	(CREATE_NEWCASE)		EXPECTED (SHAREDLIB_BUILD RUN)
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Fates.derecho_intel.clm-FatesFireLightningPopDens--clm-NEON-FATES-NIWO	(CREATE_NEWCASE)		EXPECTED (SHAREDLIB_BUILD)
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60SpRs.derecho_intel.clm-default--clm-NEON-TOOL	(CREATE_NEWCASE)		
SSP_D_Ld10.f10_f10_mg37.I1850Clm60Bgc.derecho_intel.clm-rtmColdSSP	(NLCOMP RUN)		
SSP_D_Ld4.f10_f10_mg37.I1850Clm50BgcCrop.derecho_intel.clm-ciso_rtmColdSSP	(NLCOMP RUN)		
SSP_Ld10.f10_f10_mg37.I1850Clm50Bgc.derecho_gnu.clm-rtmColdSSP	(NLCOMP RUN)

Specific problems I looked at...

1.) ERS test for a specific year:

ERS_D_Ld5_Mmpi-serial.1x1_mexicocityMEX.I1PtClm60SpRs.derecho_gnu.clm-CLM1PTStartDate

atm.log:

(shr_strdata_readstrm) reading file lb: /glade/campaign/cesm/cesmdata/inputdata/atm/datm7/topo_forcing/topodata_0.9x1.25_USGS_070110_stream_c151201.nc       1
(shr_strdata_readstrm) reading file ub: /glade/campaign/cesm/cesmdata/inputdata/atm/datm7/topo_forcing/topodata_0.9x1.25_USGS_070110_stream_c151201.nc       1
 (datm_datamode_clmncep_advance): tbotmax =    290.55999755859375     
 (datm_datamode_clmncep_advance): anidrmax =    1.0000000000000000E+030
 atm : model date     19931205           0
 ERROR: shr_get_rpointer_nameERROR no rpointer file found in rpointer.cpl                                                                                                                                                                                                                                                     or in rpointer.cpl

The dated rpointer.cpl files ARE on disk though...

(ctsm_pylib) tests_ctsm5314rpointeracl/ERS_D_Ld5_Mmpi-serial.1x1_mexicocityMEX.I1PtClm60SpRs.derecho_gnu.clm-CLM1PTStartDate.GC.ctsm5314rpointeracl_gnu> cat run/rpointer.
rpointer.atm                   rpointer.cpl.1993-12-05-00000  rpointer.cpl.1993-12-07-00000  rpointer.lnd.1993-12-02-00000  rpointer.lnd.1993-12-05-00000  rpointer.lnd.1993-12-07-00000
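
The listing above shows dated names of the form `rpointer.<comp>.YYYY-MM-DD-SSSSS` next to an undated `rpointer.atm`. A backward-compatible lookup can be sketched as "try the dated name, then fall back to the undated one". This is an illustration only: the function `find_rpointer` is hypothetical and is not the `shr_get_rpointer_name` routine from the share code.

```python
import os


def find_rpointer(rundir, comp, timestamp):
    """Return the path of the rpointer file for `comp`.

    Hypothetical helper: prefers the timestamped file and falls back
    to the legacy undated file so older run directories still work.
    """
    dated = os.path.join(rundir, "rpointer.{}.{}".format(comp, timestamp))
    undated = os.path.join(rundir, "rpointer.{}".format(comp))
    for candidate in (dated, undated):
        if os.path.isfile(candidate):
            return candidate
    raise FileNotFoundError(
        "no rpointer file found in {} or in {}".format(dated, undated)
    )
```

Note that a lookup like this only succeeds if the timestamp passed in matches a file on disk, so an empty or wrong timestamp would fail even when dated files exist.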

cesm.log:

 (t_initf)       profile_outpe_num=                  1
 (t_initf)       profile_outpe_stride=               0
 (t_initf)       profile_single_file=      F
 (t_initf)       profile_global_stats=     T
 (t_initf)       profile_ovhd_measurement= F
 (t_initf)       profile_add_detail=       F
 (t_initf)       profile_papi_enable=      F
 ERROR: shr_get_rpointer_nameERROR no rpointer file found in rpointer.cpl                                                                                                                                                                                                                                                     or in rpointer.cpl
#0  0x125b1cf in __shr_abort_mod_MOD_shr_abort_backtrace
	at /glade/work/erik/ctsm_worktrees/quickfix/share/src/shr_abort_mod.F90:104
#1  0x125b292 in __shr_abort_mod_MOD_shr_abort_abort
	at /glade/work/erik/ctsm_worktrees/quickfix/share/src/shr_abort_mod.F90:61
#2  0x1255a3a in __nuopc_shr_methods_MOD_shr_get_rpointer_name
	at /glade/work/erik/ctsm_worktrees/quickfix/share/src/nuopc_shr_methods.F90:849
#3  0x55f78f in __med_phases_restart_mod_MOD_med_phases_restart_read
	at /glade/work/erik/ctsm_worktrees/quickfix/components/cmeps/cime_config/../mediator/med_phases_restart_mod.F90:545
#4  0x46ab0b in datainitialize
	at /glade/work/erik/ctsm_worktrees/quickfix/components/cmeps/cime_config/../mediator/med.F90:2191

med.log, drv.log, and lnd.log all look fine and don't report errors.

However, the drv.log does seem to have the right year, as follows. So possibly there's a missing broadcast of year?

drv.log:

  read rpointer file = rpointer.cpl.1993-12-05-00000
(/glade/work/erik/ctsm_worktrees/quickfix/components/cmeps/cime_config/../cesm/driver/esm_time_mod.F90:esm_time_clockInit) reading driver restart from file = ERS_D_Ld5_Mmpi-serial.1x1_mexicocityMEX.I1PtClm60SpRs.derecho_gnu.clm-CLM1PTStartDate.GC.ctsm5314rpointeracl_gnu.cpl.r.1993-12-05-00000.nc
 (/glade/work/erik/ctsm_worktrees/quickfix/components/cmeps/cime_config/../cesm/driver/esm_time_mod.F90:esm_time_clockInit): driver start_ymd:   19931202
 (/glade/work/erik/ctsm_worktrees/quickfix/components/cmeps/cime_config/../cesm/driver/esm_time_mod.F90:esm_time_clockInit): driver start_tod:          0
 (/glade/work/erik/ctsm_worktrees/quickfix/components/cmeps/cime_config/../cesm/driver/esm_time_mod.F90:esm_time_clockInit): driver curr_ymd:   19931205
 (/glade/work/erik/ctsm_worktrees/quickfix/components/cmeps/cime_config/../cesm/driver/esm_time_mod.F90:esm_time_clockInit): driver curr_tod:          0
 (/glade/work/erik/ctsm_worktrees/quickfix/components/cmeps/cime_config/../cesm/driver/esm_time_mod.F90:esm_time_clockInit): driver time interval is :       3600

2.) ERI_D_Ld9.f10_f10_mg37.I2000Clm50BgcCru.derecho_intel.clm-default

dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf) Read in prof_inparm namelist from: drv_in
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf) Using profile_disable=          F
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_timer=                      4
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_depth_limit=               12
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_detail_limit=               2
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_barrier=          F
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_outpe_num=                  1
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_outpe_stride=               0
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_single_file=      F
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_global_stats=     T
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_ovhd_measurement= F
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_add_detail=       F
dec2324.hsn.de.hpc.ucar.edu 0:  (t_initf)       profile_papi_enable=      F
dec2324.hsn.de.hpc.ucar.edu 0:  ESMF_Finalize: Error closing trace stream
dec2324.hsn.de.hpc.ucar.edu 0: MPICH ERROR [Rank 0] [job id 889e4ccc-0d6a-437e-87c3-408c149f1bf9] [Mon Dec  9 19:36:23 2024] [dec2324] - Abort(1) (rank 0 in comm 496): application called MPI_Abort(comm=0x84000002, 1) - process 0
dec2324.hsn.de.hpc.ucar.edu 0: 
dec2324.hsn.de.hpc.ucar.edu 0: forrtl: severe (174): SIGSEGV, segmentation fault occurred
dec2324.hsn.de.hpc.ucar.edu 0: Image              PC                Routine            Line        Source             
dec2324.hsn.de.hpc.ucar.edu 0: libpthread-2.31.s  0000150B6470E8C0  Unknown               Unknown  Unknown
dec2324.hsn.de.hpc.ucar.edu 0: libmpi_intel.so.1  0000150B626CDE7E  Unknown               Unknown  Unknown
dec2324.hsn.de.hpc.ucar.edu 0: libmpi_intel.so.1  0000150B624DC22F  Unknown               Unknown  Unknown
dec2324.hsn.de.hpc.ucar.edu 0: libmpi_intel.so.1  0000150B60B096A8  MPI_Abort             Unknown  Unknown
dec2324.hsn.de.hpc.ucar.edu 0: libesmf.so         0000150B6D823E82  abort                     863  ESMCI_VMKernel.C
dec2324.hsn.de.hpc.ucar.edu 0: libesmf.so         0000150B6D81DD03  abort                    3634  ESMCI_VM.C
dec2324.hsn.de.hpc.ucar.edu 0: libesmf.so         0000150B6D848431  c_esmc_vmabort_          1252  ESMCI_VM_F.C
dec2324.hsn.de.hpc.ucar.edu 0: libesmf.so         0000150B6EDC3D87  esmf_vmmod_mp_esm        9521  ESMF_VM.F90
dec2324.hsn.de.hpc.ucar.edu 0: libesmf.so         0000150B6E8C6A58  esmf_initmod_mp_e        1684  ESMF_Init.F90
dec2324.hsn.de.hpc.ucar.edu 0: cesm.exe           0000000000449347  MAIN__                    132  esmApp.F90
dec2324.hsn.de.hpc.ucar.edu 0: cesm.exe           0000000000421A3D  Unknown               Unknown  Unknown
dec2324.hsn.de.hpc.ucar.edu 0: libc-2.31.so       0000150B6000029D  __libc_start_main     Unknown  Unknown
dec2324.hsn.de.hpc.ucar.edu 0: cesm.exe           000000000042196A  Unknown               Unknown  Unknown

3.) ERP_D_Ld5.f10_f10_mg37.IHistClm60Sp.derecho_intel.clm-default

Failure looks similar to above

4.) SMS_D.f10_f10_mg37.I2000Clm60BgcCrop.derecho_nvhpc.clm-crop

Fails in build -- looks like it's a CTSM issue that I'll work on.

5.) NEON tests are probably the same as on Izumi

6.) SSP_D_Ld10.f10_f10_mg37.I1850Clm60Bgc.derecho_intel.clm-rtmColdSSP

This is a CTSM specific test type for doing a spinup.

It fails on submit with the following:

Submitting job script qsub -q main -l walltime=00:20:00 -A P93300606 -l job_priority=regular -v ARGS_FOR_SCRIPT='--skip-preview-namelist' /glade/derecho/scratch/erik/tests_ctsm5314rpointeracl/SSP_D_Ld10.f10_f10_mg37.I1850Clm60Bgc.derecho_intel.clm-rtmColdSSP.GC.ctsm5314rpointeracl_int/.case.test
Submitted job id is 7129803.desched1
Submitted job case.test with id 7129803.desched1
submit_jobs case.test
Submit job case.test

 ---------------------------------------------------
2024-12-09 20:37:25: ERROR: Cannot modify case, read_only. Case must be opened with read_only=False and can only be modified within a context manager
 ---------------------------------------------------

@jedwards4b
Contributor Author

@ekluzek @billsacks This is due to an inconsistency in the way we name test mods and user mods. For testmods_dirs we require the component name in the path, for example CTSM/cime_config/testdefs/testmods_dirs/clm/PRISM, but for user mods directories we do not: CTSM/cime_config/usermods_dirs/NEON/MOAB.
Jason Boutte refactored this function in commit bf19cab32f984a5f2256b53af05f0ab63232bd95,
which was merged in tag cime6.0.246.

I think the easiest solution may be to add the component name to the user mods directory path (or remove it from the test mods path) so that the naming convention is consistent. I tested this by moving the directories in /home/jedwards/CTSM/cime_config/usermods_dirs/ to /home/jedwards/CTSM/cime_config/usermods_dirs/clm, and confirmed that this solves the problem. What do you think of this solution?
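
To make the naming issue concrete, here is a minimal sketch of a lookup that requires the component name in the path. It is an illustration only and not the CIME function referred to above: `resolve_mods` is a hypothetical name, and the rule that hyphens in a test modifier map to path separators is an assumption inferred from `clm-NEON-MOAB` and `NEON/MOAB`.

```python
from pathlib import Path


def resolve_mods(modifier, roots):
    """Hypothetical sketch: map 'clm-NEON-MOAB' to <root>/clm/NEON/MOAB.

    Returns the first matching directory under any root, or None.
    """
    component, _, rest = modifier.partition("-")
    relative = Path(component, *rest.split("-"))
    for root in roots:
        candidate = Path(root) / relative
        if candidate.is_dir():
            return candidate
    return None
```

With such a rule, `clm-PRISM` is found under `testmods_dirs/clm/PRISM`, while `clm-NEON-MOAB` is found only once the user mods live under `usermods_dirs/clm/NEON/MOAB`.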

@jedwards4b
Contributor Author

@ekluzek your sandbox in /glade/work/erik/ctsm_worktrees/quickfix has not been updated to the latest tags.

@billsacks
Member

> This is due to an inconsistency in the way we name test mods and user mods. For testmods_dirs we require the component name in the path, for example: CTSM/cime_config/testdefs/testmods_dirs/clm/PRISM but for user mods directories we do not: CTSM/cime_config/usermods_dirs/NEON/MOAB. Jason Boutte refactored this function in commit bf19cab32f984a5f2256b53af05f0ab63232bd95 which was merged in tag cime6.0.246.
>
> I think maybe the easiest solution is to add the component name in the user mods dir (or remove it from the test mods dir) to be consistent in the naming convention. I have tested this by moving the directories in /home/jedwards/CTSM/cime_config/usermods_dirs/ to /home/jedwards/CTSM/cime_config/usermods_dirs/clm. And have confirmed that this solves the problem. What do you think of this solution?

Thanks, @jedwards4b ! I think what you're saying is that, for a directory in the usermods_dirs space to be picked up via a test name, it would need to fall under a clm subdirectory. Just wanting to make sure that the issue is limited to picking up usermods via test names and isn't more general. If so, I support your simple suggestion for a fix if it's okay with @ekluzek .

@ekluzek
Collaborator

ekluzek commented Dec 10, 2024

@jedwards4b ahh, you are right, I didn't pull the latest update that I had pushed. Thanks for the correction. Resending those tests.

@jedwards4b
Contributor Author

Once I changed the user mods path I got a ton of compiler errors from nag using test SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60Bgc.izumi_nag.clm-NEON-MOAB--clm-PRISM. Changing the compiler to gnu allowed everything to pass.

@ekluzek
Collaborator

ekluzek commented Dec 10, 2024

> @ekluzek @billsacks This is due to an inconsistency in the way we name test mods and user mods. For testmods_dirs we require the component name in the path, for example: CTSM/cime_config/testdefs/testmods_dirs/clm/PRISM but for user mods directories we do not: CTSM/cime_config/usermods_dirs/NEON/MOAB. Jason Boutte refactored this function in commit bf19cab32f984a5f2256b53af05f0ab63232bd95 which was merged in tag cime6.0.246.
>
> I think maybe the easiest solution is to add the component name in the user mods dir (or remove it from the test mods dir) to be consistent in the naming convention. I have tested this by moving the directories in /home/jedwards/CTSM/cime_config/usermods_dirs/ to /home/jedwards/CTSM/cime_config/usermods_dirs/clm. And have confirmed that this solves the problem. What do you think of this solution?

@jedwards4b thanks for pointing this out. Three comments here...

First, I largely agree with @billsacks here that this is the best solution moving forward. Having "clm" in the name for BOTH testmods and usermods means you can tell which component owns that mod directory. It also keeps them consistent, which is good as well.

Second, the catch is that this will change behavior for the NEON and PLUMBER2 folks, and I want to run it by them first. I think we can convince them it's OK though.

Third, looking at that cime commit it was in cime6.1.39, which was just after our latest CTSM tag ctsm5.3.014 which used cime6.1.37. Otherwise, we should have seen it sooner, as our previous tags were at cime6.0.246 as well.

bf19cab32f984a5f2256b53af05f0ab63232bd95

So I'll check with our peeps and make sure this is OK, but we'll plan on that solution...

Labels
enhancement new capability or improved behavior of existing capability usability Improve or clarify user-facing options
Projects
Status: In progress - master/b4b-dev
Status: In Progress