Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct ocean conservation check settings #6643

Merged
merged 7 commits into from
Sep 27, 2024

Conversation

mark-petersen
Copy link
Contributor

@mark-petersen mark-petersen commented Sep 24, 2024

Currently, the ocean conservation check analysis member overwrites the first entry in the file with a zero after restarts for some variables. This PR corrects this behavior so that the first day's entry in a monthly conservation check file is identical between continuous runs and a run with a restart break.

Fixes #6642

[NML] for some mpaso resolutions
[BFB] for all tested files

Copy link

github-actions bot commented Sep 24, 2024

PR Preview Action v1.4.8
🚀 Deployed preview to https://E3SM-Project.github.io/E3SM/pr-preview/pr-6643/
on branch gh-pages at 2024-09-26 18:05 UTC

<config_AM_conservationCheck_enable ocn_grid="FRISwISC02to60E3r1">.true.</config_AM_conservationCheck_enable>
<config_AM_conservationCheck_enable ocn_grid="FRISwISC01to60E3r1">.true.</config_AM_conservationCheck_enable>
<config_AM_conservationCheck_enable ocn_grid="RRSwISC6to18E3r5">.true.</config_AM_conservationCheck_enable>
<config_AM_conservationCheck_enable>.true.</config_AM_conservationCheck_enable>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mark-petersen, we may want this turned off for QU240, etc. For light-weight testing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's explicitly set this to .false. for:

  • oQU240
  • oQU480
  • oQU240wLI
  • oQU120

These meshes are used for testing:
https://github.com/E3SM-Project/E3SM/blob/master/cime_config/tests.py
and we don't want to needlessly increase test burden.

<config_AM_conservationCheck_enable>false</config_AM_conservationCheck_enable>
<config_AM_conservationCheck_enable>true</config_AM_conservationCheck_enable>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a separate PR, I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @xylar -- I agree, especially since mpas-seaice is also used for F-cases where its performance is already an issue

@proteanplanet
Copy link
Contributor

proteanplanet commented Sep 24, 2024

Conservation checks in both the sea ice and ocean models should not be set as a default. It was actually a mistake that they were left on in the ocean for V3 production runs - they should have been switched off. They add significant computational expense - up to 30% for the sea ice model alone. Please remove the lines in components/mpas-seaice/bld/namelist_files/namelist_defaults_mpassi.xml from this PR. Also please consider removing this as default behavior for the ocean. Conservation checks should be used for development purposes, but once we know models are conserving after checks run for PRs, the analysis members should be kept off.

@xylar
Copy link
Contributor

xylar commented Sep 24, 2024

@mark-petersen, in light of @proteanplanet's comment above, which makes sense to me, can you actually turn the conservation analysis member off by default for all meshes in this PR? I think we don't want to get distracted form the main objective here, which was to fix lack of restart capability in the ocean conservation analysis member.

@rljacob
Copy link
Member

rljacob commented Sep 24, 2024

Could we add a restart test that does turn on the conservation analysis?

@xylar
Copy link
Contributor

xylar commented Sep 24, 2024

@rljacob, that's an excellent idea but will CIME notice whether the output is BFB here? I seem to recall that ERS tests don't include output from analysis, but maybe I'm misremembering.

@jonbob
Copy link
Contributor

jonbob commented Sep 24, 2024

@rljacob -- we can do that, though I think mpas analysis member output is ignored in the tests

@xylar
Copy link
Contributor

xylar commented Sep 24, 2024

@mark-petersen, are you comfortable adding a test with the conservation check on or would you like me to do it?

@xylar
Copy link
Contributor

xylar commented Sep 24, 2024

@jonbob, jinx!

@mark-petersen
Copy link
Contributor Author

I made conservation checks default off for both sea ice and ocean, per comment by @proteanplanet above. I'm checking the restart capability now.

@xylar xylar changed the title Correct ocean and sea ice conservation check settings Correct ocean conservation check settings Sep 25, 2024
@xylar xylar removed the mpas-seaice label Sep 25, 2024
Comment on lines -1247 to -1265
<config_AM_conservationCheck_compute_on_startup ocn_grid="SOwISC12to60E2r4">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_compute_on_startup ocn_grid="ECwISC30to60E2r1">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_compute_on_startup ocn_grid="IcoswISC30E3r5">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_compute_on_startup ocn_grid="IcosXISC30E3r7">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_compute_on_startup ocn_grid="FRISwISC08to60E3r1">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_compute_on_startup ocn_grid="FRISwISC04to60E3r1">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_compute_on_startup ocn_grid="FRISwISC02to60E3r1">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_compute_on_startup ocn_grid="FRISwISC01to60E3r1">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_compute_on_startup ocn_grid="RRSwISC6to18E3r5">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_write_on_startup>.false.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="SOwISC12to60E2r4">.true.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="ECwISC30to60E2r1">.true.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="IcoswISC30E3r5">.true.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="IcosXISC30E3r7">.true.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="FRISwISC08to60E3r1">.true.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="FRISwISC04to60E3r1">.true.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="FRISwISC02to60E3r1">.true.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="FRISwISC01to60E3r1">.true.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="RRSwISC6to18E3r5">.true.</config_AM_conservationCheck_write_on_startup>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we want all these changes based on my comment #6642 (comment)

@xylar
Copy link
Contributor

xylar commented Sep 25, 2024

@mark-petersen, I've got a version of your branch with my suggested modifications here:
https://github.com/xylar/E3SM/commits/mark-petersen/mpas/correct-conservation-check/
It includes a new stealth test. I have included MPAS-Ocean output files in that test by using sed to modify the env_archive.xml file. Not sure that's kosher but I couldn't find a better way.

The test passes and results can be seen here:

/lcrc/group/e3sm/ac.xasay-davis/scratch/chrys/ERS_D.T62_oQU240.GMPAS-IAF.chrysalis_intel.mpaso-conservation_check.20240925_060737_57gegd

@mark-petersen mark-petersen force-pushed the mark-petersen/mpas/correct-conservation-check branch from 981303a to ff5aa6f Compare September 25, 2024 13:47
@mark-petersen
Copy link
Contributor Author

Updated using @xylar corrections based on previous comment and comment on issue page. Thanks! Retesting now...

Comment on lines 1 to 2
# include mpas-ocean outputs in testing
sed -i 's#compclass="ocn" exclude_testing="true"#compclass="ocn" exclude_testing="false"#g' env_archive.xml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonbob and @rljacob, let me know what you think if this approach. Happy to hear of other suggestions for how to not exclude MPAS-Ocean files from this particular test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since its an xml file, can xmlchange do that? If not, sed is fine.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't figure out how. I think xmlchange can only be used to modify specific fields that have an id and value, and this is not one of those.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I slacked Jim to ask if he knows...

@mark-petersen
Copy link
Contributor Author

Updated the description.

@jonbob jonbob added the NML label Sep 25, 2024
@mark-petersen
Copy link
Contributor Author

mark-petersen commented Sep 25, 2024

I can now confirm the expected behavior from this PR.

I ran QU240 with config_am_conservationcheck_enable = .true. for all cases, and everything standard from the repo for four cases here /lcrc/group/e3sm/ac.mpetersen/scratch/chrys/

  1. master, 2 months 240924i.ConsCheckMasterFull.T62_oQU240.GMPAS-IAF.chr/
  2. master 1mo+restart+1 mo 240924j.ConsCheckMasterRestart.T62_oQU240.GMPAS-IAF.chr/
  3. pr 2 months 240924k.ConsCheckPrFull.T62_oQU240.GMPAS-IAF.chr/
  4. pr 1mo+restart+1 mo 240924l.ConsCheckPrRestart.T62_oQU240.GMPAS-IAF.chr/

writing restarts every month. For reference, I used

./create_newcase -case $CASE_ROOT/$E3SM_CASE   -compiler gnu   -mach chrysalis   -project e3sm \
   -compset GMPAS-IAF -res T62_oQU240 --walltime 00:30:00

In case 2, the first entry differs from case 1 by being either zero or differing:

$ ncdump -v accumulatedRainFlux  *mpaso.hist.am.conservationCheck*01-02-01.nc |tail
 accumulatedRainFlux = 12140753948.2218, 12114183191.7286, 12117918123.2804,  (case 1)
 accumulatedRainFlux = 12129038639.0062, 12114183191.7286, 12117918123.2804, (case 2)
...

$ ncdump -v accumulatedRiverRunoffTemperatureFlux *mpaso.hist.am.conservationCheck*01-02-01.nc | tail
 accumulatedRiverRunoffTemperatureFlux = 18589454.7081383, 18605091.7352952, (case 1)
 accumulatedRiverRunoffTemperatureFlux = 0, 18605091.7352952, (case 2)

Running from this PR, case 3 and case 4 have identical output, identical to case 1 above.

Copy link
Contributor

@xylar xylar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm approving based on @mark-petersen's testing and my own.

@jonbob
Copy link
Contributor

jonbob commented Sep 25, 2024

I've been testing with the new testdef and don't believe it's working correctly -- first, it doesn't seem to change "exclude_testing" for mpaso in env_archive.xml. However, even if I change that by hand, it still does not run cprnc on the conservation am file -- it might take some work to get testing to recognize that file

@xylar
Copy link
Contributor

xylar commented Sep 25, 2024

@jonbob, that's odd. I think it changed the file for me. In:

/lcrc/group/e3sm/ac.xylar/scratch/chrys/ERS_D.T62_oQU240.GMPAS-IAF.chrysalis_intel.mpaso-conservation_check.20240925_060737_57gegd

I'm seeing:

    <comp_archive_spec compname="mpaso" compclass="ocn" exclude_testing="false">
      <rest_file_extension>rst</rest_file_extension>
...

as expected.

As for the latter point. I wasn't sure how to find out what cprnc is checking. I guess we'll need @jgfouca's help to figure out another way.

@jonbob
Copy link
Contributor

jonbob commented Sep 25, 2024

@xylar -- that's very odd... Anyway, Jim didn't see a way to do it with xmlchange either.

@xylar
Copy link
Contributor

xylar commented Sep 25, 2024

@jonbob, I'll try it again. In the meantime, can you clarify what you are doing to see what files cprnc is looking at?

@xylar
Copy link
Contributor

xylar commented Sep 25, 2024

I reset to the current state of this branch and I'm also seeing that the sed changes aren't happening -- so odd!

@xylar
Copy link
Contributor

xylar commented Sep 25, 2024

My fault. I accidentally had e3sm-unified loaded when I launched the test. It must have a better sed that supports -i but the default sed on Chrysalis doesn't (!!!).

I'll have to do this more manually.

@jonbob
Copy link
Contributor

jonbob commented Sep 25, 2024

@xylar -- there are files that get made for tests -- TestStatus and TestStatus.log. If I grep TestStatus.log for cprnc, I only see that being run on the cpl.hi files

@xylar
Copy link
Contributor

xylar commented Sep 25, 2024

Thanks @jonbob. I'll try to debug some more tomorrow.

@xylar
Copy link
Contributor

xylar commented Sep 25, 2024

On second thought, I'm beginning to feel like getting an ERS test to work with MPAS-Ocean history output is too big a project to be in the scope of this PR.

@jonbob and @rljacob, I think the only plausible fallback is to take out the stealth test. What about you?

@jonbob
Copy link
Contributor

jonbob commented Sep 25, 2024

@xylar -- I was going to recommend the same thing. I think it's more important to fix the functionality, and we can add a test once we get it working

@rljacob
Copy link
Member

rljacob commented Sep 25, 2024

Don't try to do the sed magic and force the comparison but leave on the test that enables the conservation analysis members.

@xylar
Copy link
Contributor

xylar commented Sep 26, 2024

@rljacob, okay, good suggestion. I have noted in the README for the test that it does not currently include MPAS-Ocean history files. I have given a manual change that folks could make to cime_config/config_archive.xml if they want to include these files.

However, it shoudl be noted that MPAS-Ocean history files are not currently
included in E3SM testing so non-BFB results will not be detected unless one
manually changes to 'compname="mpaso" exclude_testing="false"' in the file
cime_config/config_archive.xml.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one quick comment -- even changing exclude_testing won't check any of the mpaso history files. Plus there's a minor misspelling: "shoudl"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonbob, okay, I understood that it would but I must be mistaken. I have removed this text and fixed the typo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xylar -- it should, but in practice none of the history files matches the file check in env_archive.xml. I did play around with it, and it works if we modify that file with this:

    <comp_archive_spec compclass="ocn" compname="mpaso">
      <rest_file_extension>rst</rest_file_extension>
      <rest_file_extension>rst.am.timeSeriesStatsMonthly</rest_file_extension>
      <hist_file_extension>hist.am.conservationCheck\..*\.nc$</hist_file_extension>
      <rest_history_varname>unset</rest_history_varname>

where I added the conservationCheck line -- or actually replaced a line that had no effect. Now it actually does use the conservationCheck am for the test: this from TestStatus.log:

    ERS_Ld5_D.T62_oQU240.GMPAS-IAF.chrysalis_intel.mpaso-conservation_check.20240925_131015_n3zwao.mpaso.hist.am.conservationCheck.0001-01-01.nc.base matched ERS_Ld5_D.T62_oQU240.GMPAS-IAF.chrysalis_intel.mpaso-conservation_check.20240925_131015_n3zwao.mpaso.hist.am.conservationCheck.0001-01-01.nc.rest
....
tail -n20 /lcrc/group/e3sm/ac.jwolfe/scratch/chrys/ERS_Ld5_D.T62_oQU240.GMPAS-IAF.chrysalis_intel.mpaso-conservation_check.20240925_131015_n3zwao/run/ERS_Ld5_D.T62_oQU240.GMPAS-IAF.chrysalis_intel.mpaso-conservation_check.20240925_131015_n3zwao.mpaso.hist.am.conservationCheck.0001-01-01.nc.base.cprnc.out

                   1   0.000000000000000E+00   0.000000000000000E+00
                   1   0.000000000000000E+00   0.000000000000000E+00
                   1  (     1) (     1)
          avg abs field values:    0.000000000000000E+00
                                   0.000000000000000E+00
************************************************************************************************************************************

SUMMARY of cprnc:
 A total number of    290 fields were compared
          of which      0 had non-zero differences
               and      0 had differences in fill patterns
               and      0 had different dimension sizes
               and      0 had different data types
 A total number of      5 fields could not be analyzed
 A total number of      0 time-varying fields on file 1 were not found on file 2.
 A total number of      0 time-constant fields on file 1 were not found on file 2.
 A total number of      0 time-varying fields on file 2 were not found on file 1.
 A total number of      0 time-constant fields on file 2 were not found on file 1.
  diff_test: the two files seem to be IDENTICAL

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, clearly a bigger can of worms that remains beyond the scope of this PR. Thanks for being more thorough than I was and for keeping things honest.

@jonbob
Copy link
Contributor

jonbob commented Sep 26, 2024

@xylar -- I tested this in the shell_commands and it does give us what we want:

sed -i 's/\(compname="mpaso" exclude_testing="true"\)/compname="mpaso"/' env_archive.xml
sed -i '/\(compname="mpaso"\)/,/<hist_file_extension>/ s/<hist_file_extension>hist/<hist_file_extension>hist.am.conservationCheck\\..*\\.nc$/' env_archive.xml

I'm not convinced we need to make the test completely functional for this PR, but I was curious about how difficult it would be...

Copy link
Member

@darincomeau darincomeau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approve the changes to the conservation check based on other's testing and visual inspection.
I'll defer to others on the added testing.

@rljacob rljacob added this to the v3.0.1 milestone Sep 26, 2024
jonbob added a commit that referenced this pull request Sep 26, 2024
…t (PR #6643)

Correct ocean conservation check settings

Currently, the ocean conservation check analysis member overwrites the
first entry in the file with a zero after restarts for some variables.
This PR corrects this behavior so that the first day's entry in a
monthly conservation check file is identical between continuous runs and
a run with a restart break. Adds a new mpaso testdef and corresponding
stealth test.

Fixes #6642

[NML] for some mpaso resolutions
[BFB]
@jonbob
Copy link
Contributor

jonbob commented Sep 26, 2024

Passes:

  • ERP_Ld3.ne30pg2_r05_IcoswISC30E3r5.WCYCL1850.chrysalis_intel.allactive-pioroot1 (with expected NML DIFF)
  • ERS_Ld5.T62_oQU120.CMPASO-NYF.chrysalis_intel

merged to next

@jonbob jonbob merged commit b12eb68 into master Sep 27, 2024
9 checks passed
@jonbob jonbob deleted the mark-petersen/mpas/correct-conservation-check branch September 27, 2024 15:18
@jonbob
Copy link
Contributor

jonbob commented Sep 27, 2024

merged to master and expected NML DIFFs blessed

xylar added a commit to xylar/compass that referenced this pull request Oct 26, 2024
This merge updates the E3SM-Project submodule from [727ad81](https://github.com/E3SM-Project/E3SM/tree/727ad81) to [1442143](https://github.com/E3SM-Project/E3SM/tree/1442143).

This update includes the following MPAS-Ocean and MPAS-Frameworks PRs (check mark indicates bit-for-bit with previous PR in the list):
- [ ]  (ocn) E3SM-Project/E3SM#6509
- [ ]  (ocn) E3SM-Project/E3SM#6508
- [ ]  (fwk) E3SM-Project/E3SM#6575
- [ ]  (ocn) E3SM-Project/E3SM#6590
- [ ]  (fwk) E3SM-Project/E3SM#6643
- [ ]  (ocn) E3SM-Project/E3SM#6656
- [ ]  (ocn) E3SM-Project/E3SM#6672
- [ ]  (ocn) E3SM-Project/E3SM#6659
- [ ]  (ocn) E3SM-Project/E3SM#6497
- [ ]  (ocn) E3SM-Project/E3SM#6485
- [ ]  (ocn) E3SM-Project/E3SM#6566
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ocean conservation check accumulated sums set to zero at restart.
6 participants