Correct ocean conservation check settings #6643
<config_AM_conservationCheck_enable ocn_grid="FRISwISC02to60E3r1">.true.</config_AM_conservationCheck_enable>
<config_AM_conservationCheck_enable ocn_grid="FRISwISC01to60E3r1">.true.</config_AM_conservationCheck_enable>
<config_AM_conservationCheck_enable ocn_grid="RRSwISC6to18E3r5">.true.</config_AM_conservationCheck_enable>
<config_AM_conservationCheck_enable>.true.</config_AM_conservationCheck_enable>
@mark-petersen, we may want this turned off for QU240, etc. For light-weight testing.
Let's explicitly set this to .false. for:
oQU240
oQU480
oQU240wLI
oQU120
These meshes are used for testing:
https://github.com/E3SM-Project/E3SM/blob/master/cime_config/tests.py
and we don't want to needlessly increase test burden.
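To illustrate the override pattern proposed in this comment, here is a minimal Python sketch of how a default with grid-specific entries could be resolved. It assumes that an entry whose `ocn_grid` attribute equals the requested mesh wins over the attribute-free entry; the real E3SM namelist builder may apply different matching rules, and `resolve_default` is a hypothetical helper, not part of E3SM.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the <namelist_defaults> wrapper and the choice of
# entries are for illustration only, not copied from the E3SM defaults file.
DEFAULTS_XML = """
<namelist_defaults>
  <config_AM_conservationCheck_enable>.true.</config_AM_conservationCheck_enable>
  <config_AM_conservationCheck_enable ocn_grid="oQU240">.false.</config_AM_conservationCheck_enable>
  <config_AM_conservationCheck_enable ocn_grid="oQU480">.false.</config_AM_conservationCheck_enable>
</namelist_defaults>
"""


def resolve_default(xml_text, name, ocn_grid):
    """Pick the grid-specific value if one exists, else the attribute-free one."""
    root = ET.fromstring(xml_text)
    fallback = None
    for entry in root.iter(name):
        grid = entry.get("ocn_grid")
        if grid == ocn_grid:
            return entry.text.strip()   # exact mesh match takes precedence
        if grid is None:
            fallback = entry.text.strip()  # remember the generic default
    return fallback


name = "config_AM_conservationCheck_enable"
print(resolve_default(DEFAULTS_XML, name, "oQU240"))          # .false.
print(resolve_default(DEFAULTS_XML, name, "IcoswISC30E3r5"))  # .true.
```

Under that assumption, listing the small test meshes explicitly keeps them cheap regardless of what the generic default is set to.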
<config_AM_conservationCheck_enable>false</config_AM_conservationCheck_enable>
<config_AM_conservationCheck_enable>true</config_AM_conservationCheck_enable>
This should be a separate PR, I think.
Thanks @xylar -- I agree, especially since mpas-seaice is also used for F-cases where its performance is already an issue
Conservation checks in both the sea ice and ocean models should not be set as a default. It was actually a mistake that they were left on in the ocean for V3 production runs - they should have been switched off. They add significant computational expense - up to 30% for the sea ice model alone. Please remove the lines in components/mpas-seaice/bld/namelist_files/namelist_defaults_mpassi.xml from this PR. Also please consider removing this as default behavior for the ocean. Conservation checks should be used for development purposes, but once we know models are conserving after checks run for PRs, the analysis members should be kept off.
@mark-petersen, in light of @proteanplanet's comment above, which makes sense to me, can you actually turn the conservation analysis member off by default for all meshes in this PR? I think we don't want to get distracted from the main objective here, which was to fix lack of restart capability in the ocean conservation analysis member.
Could we add a restart test that does turn on the conservation analysis?
@rljacob, that's an excellent idea but will CIME notice whether the output is BFB here? I seem to recall that ERS tests don't include output from analysis, but maybe I'm misremembering.
@rljacob -- we can do that, though I think mpas analysis member output is ignored in the tests
@mark-petersen, are you comfortable adding a test with the conservation check on or would you like me to do it?
@jonbob, jinx!
I made conservation checks default off for both sea ice and ocean, per comment by @proteanplanet above. I'm checking the restart capability now.
<config_AM_conservationCheck_compute_on_startup ocn_grid="SOwISC12to60E2r4">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_compute_on_startup ocn_grid="ECwISC30to60E2r1">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_compute_on_startup ocn_grid="IcoswISC30E3r5">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_compute_on_startup ocn_grid="IcosXISC30E3r7">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_compute_on_startup ocn_grid="FRISwISC08to60E3r1">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_compute_on_startup ocn_grid="FRISwISC04to60E3r1">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_compute_on_startup ocn_grid="FRISwISC02to60E3r1">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_compute_on_startup ocn_grid="FRISwISC01to60E3r1">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_compute_on_startup ocn_grid="RRSwISC6to18E3r5">.true.</config_AM_conservationCheck_compute_on_startup>
<config_AM_conservationCheck_write_on_startup>.false.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="SOwISC12to60E2r4">.true.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="ECwISC30to60E2r1">.true.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="IcoswISC30E3r5">.true.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="IcosXISC30E3r7">.true.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="FRISwISC08to60E3r1">.true.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="FRISwISC04to60E3r1">.true.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="FRISwISC02to60E3r1">.true.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="FRISwISC01to60E3r1">.true.</config_AM_conservationCheck_write_on_startup>
<config_AM_conservationCheck_write_on_startup ocn_grid="RRSwISC6to18E3r5">.true.</config_AM_conservationCheck_write_on_startup>
I agree that we want all these changes based on my comment #6642 (comment)
@mark-petersen, I've got a version of your branch with my suggested modifications here: The test passes and results can be seen here:
981303a to ff5aa6f
Updated using @xylar's corrections based on previous comment and comment on issue page. Thanks! Retesting now...
# include mpas-ocean outputs in testing
sed -i 's#compclass="ocn" exclude_testing="true"#compclass="ocn" exclude_testing="false"#g' env_archive.xml
Since it's an XML file, can xmlchange do that? If not, sed is fine.
I couldn't figure out how. I think xmlchange can only be used to modify specific fields that have an id and value, and this is not one of those.
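For reference, the same attribute flip can be done with an XML parser instead of `sed`. This is a hedged sketch using Python's standard `xml.etree.ElementTree`; it assumes `env_archive.xml` contains `comp_archive_spec` elements carrying `compclass` and `exclude_testing` attributes, and both `include_in_testing` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def include_in_testing(xml_text, compclass="ocn"):
    """Return xml_text with exclude_testing="false" for one component class."""
    root = ET.fromstring(xml_text)
    for spec in root.iter("comp_archive_spec"):
        # Only touch entries that already carry the attribute.
        if spec.get("compclass") == compclass and "exclude_testing" in spec.attrib:
            spec.set("exclude_testing", "false")
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily trimmed stand-in for env_archive.xml.
SAMPLE = """<components>
  <comp_archive_spec compclass="ocn" compname="mpaso" exclude_testing="true"/>
  <comp_archive_spec compclass="ice" compname="mpassi" exclude_testing="true"/>
</components>"""

updated = include_in_testing(SAMPLE)
print(updated)
```

One trade-off to keep in mind: `ElementTree` discards comments when it parses a file, so a plain text substitution like the `sed` line is the less invasive choice for a file that people also edit by hand.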
I slacked Jim to ask if he knows...
Updated the description.
I can now confirm the expected behavior from this PR. I ran QU240 with
writing restarts every month. For reference, I used
In case 2, the first entry differs from case 1 by being either zero or differing:
Running from this PR, case 3 and case 4 have identical output, identical to case 1 above.
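The check described here amounts to comparing each conservation-check record from a continuous run against the same record from a run with a restart break. A minimal sketch of that comparison, assuming the values have already been read into plain Python lists; the numbers are made up for illustration and are not taken from the runs above, and `differing_records` is a hypothetical helper.

```python
def differing_records(continuous, restarted):
    """Return the indices where the two runs are not bit-for-bit identical."""
    if len(continuous) != len(restarted):
        raise ValueError("runs must have the same number of records")
    return [i for i, (a, b) in enumerate(zip(continuous, restarted)) if a != b]


# Illustrative values only.
continuous = [1.25, 1.5, 1.75]
buggy_restart = [0.0, 1.5, 1.75]    # first entry overwritten after the restart
fixed_restart = [1.25, 1.5, 1.75]

print(differing_records(continuous, buggy_restart))  # [0]
print(differing_records(continuous, fixed_restart))  # []
```

Exact equality is used on purpose, since the goal stated for this PR is bit-for-bit agreement rather than agreement within a tolerance.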
I'm approving based on @mark-petersen's testing and my own.
I've been testing with the new testdef and don't believe it's working correctly -- first, it doesn't seem to change "exclude_testing" for mpaso in env_archive.xml. However, even if I change that by hand, it still does not run cprnc on the conservation am file -- it might take some work to get testing to recognize that file
@jonbob, that's odd. I think it changed the file for me. In:
I'm seeing:
as expected. As for the latter point, I wasn't sure how to find out what cprnc is checking. I guess we'll need @jgfouca's help to figure out another way.
@xylar -- that's very odd... Anyway, Jim didn't see a way to do it with xmlchange either.
@jonbob, I'll try it again. In the meantime, can you clarify what you are doing to see what files cprnc is looking at?
I reset to the current state of this branch and I'm also seeing that the
My fault. I accidentally had e3sm-unified loaded when I launched the test. It must have a better I'll have to do this more manually.
@xylar -- there are files that get made for tests -- TestStatus and TestStatus.log. If I grep TestStatus.log for cprnc, I only see that being run on the cpl.hi files
Thanks @jonbob. I'll try to debug some more tomorrow.
@xylar -- I was going to recommend the same thing. I think it's more important to fix the functionality, and we can add a test once we get it working
Don't try to do the sed magic and force the comparison but leave on the test that enables the conservation analysis members.
@rljacob, okay, good suggestion. I have noted in the README for the test that it does not currently include MPAS-Ocean history files. I have given a manual change that folks could make to
However, it shoudl be noted that MPAS-Ocean history files are not currently
included in E3SM testing so non-BFB results will not be detected unless one
manually changes to 'compname="mpaso" exclude_testing="false"' in the file
cime_config/config_archive.xml.
one quick comment -- even changing exclude_testing won't check any of the mpaso history files. Plus there's a minor misspelling: "shoudl"
@jonbob, okay, I understood that it would but I must be mistaken. I have removed this text and fixed the typo.
@xylar -- it should, but in practice none of the history files matches the file check in env_archive.xml. I did play around with it, and it works if we modify that file with this:
<comp_archive_spec compclass="ocn" compname="mpaso">
<rest_file_extension>rst</rest_file_extension>
<rest_file_extension>rst.am.timeSeriesStatsMonthly</rest_file_extension>
<hist_file_extension>hist.am.conservationCheck\..*\.nc$</hist_file_extension>
<rest_history_varname>unset</rest_history_varname>
where I added the conservationCheck line -- or actually replaced a line that had no effect. Now it actually does use the conservationCheck am for the test: this from TestStatus.log:
ERS_Ld5_D.T62_oQU240.GMPAS-IAF.chrysalis_intel.mpaso-conservation_check.20240925_131015_n3zwao.mpaso.hist.am.conservationCheck.0001-01-01.nc.base matched ERS_Ld5_D.T62_oQU240.GMPAS-IAF.chrysalis_intel.mpaso-conservation_check.20240925_131015_n3zwao.mpaso.hist.am.conservationCheck.0001-01-01.nc.rest
....
tail -n20 /lcrc/group/e3sm/ac.jwolfe/scratch/chrys/ERS_Ld5_D.T62_oQU240.GMPAS-IAF.chrysalis_intel.mpaso-conservation_check.20240925_131015_n3zwao/run/ERS_Ld5_D.T62_oQU240.GMPAS-IAF.chrysalis_intel.mpaso-conservation_check.20240925_131015_n3zwao.mpaso.hist.am.conservationCheck.0001-01-01.nc.base.cprnc.out
1 0.000000000000000E+00 0.000000000000000E+00
1 0.000000000000000E+00 0.000000000000000E+00
1 ( 1) ( 1)
avg abs field values: 0.000000000000000E+00
0.000000000000000E+00
************************************************************************************************************************************
SUMMARY of cprnc:
A total number of 290 fields were compared
of which 0 had non-zero differences
and 0 had differences in fill patterns
and 0 had different dimension sizes
and 0 had different data types
A total number of 5 fields could not be analyzed
A total number of 0 time-varying fields on file 1 were not found on file 2.
A total number of 0 time-constant fields on file 1 were not found on file 2.
A total number of 0 time-varying fields on file 2 were not found on file 1.
A total number of 0 time-constant fields on file 2 were not found on file 1.
diff_test: the two files seem to be IDENTICAL
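To see why the added `hist_file_extension` pattern picks up the conservation-check file, the expression can be tried against file names directly. The sketch below assumes the pattern is applied as a regular-expression search on the file name, which is an assumption about CIME's matching rather than something confirmed in this thread. The first two names are taken from the log above with the case prefix dropped; the third is hypothetical.

```python
import re

# Pattern copied from the <hist_file_extension> line quoted in the comment.
PATTERN = re.compile(r"hist.am.conservationCheck\..*\.nc$")


def is_history_match(filename):
    """True if the file name would be selected under the search assumption."""
    return PATTERN.search(filename) is not None


names = [
    "mpaso.hist.am.conservationCheck.0001-01-01.nc",
    "mpaso.hist.am.conservationCheck.0001-01-01.nc.base",
    "mpaso.hist.am.timeSeriesStatsMonthly.0001-01-01.nc",  # hypothetical name
]
for name in names:
    print(name, is_history_match(name))
```

Under that assumption only the first name matches: the trailing `$` rejects the `.nc.base` copy, and the third name does not contain the `conservationCheck` stream at all.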
Okay, clearly a bigger can of worms that remains beyond the scope of this PR. Thanks for being more thorough than I was and for keeping things honest.
@xylar -- I tested this in the shell_commands and it does give us what we want:
I'm not convinced we need to make the test completely functional for this PR, but I was curious about how difficult it would be...
Approve the changes to the conservation check based on others' testing and visual inspection.
I'll defer to others on the added testing.
…t (PR #6643)
Correct ocean conservation check settings
Currently, the ocean conservation check analysis member overwrites the first entry in the file with a zero after restarts for some variables. This PR corrects this behavior so that the first day's entry in a monthly conservation check file is identical between continuous runs and a run with a restart break.
Adds a new mpaso testdef and corresponding stealth test.
Fixes #6642
[NML] for some mpaso resolutions
[BFB]
Passes:
merged to next
merged to master and expected NML DIFFs blessed
This merge updates the E3SM-Project submodule from [727ad81](https://github.com/E3SM-Project/E3SM/tree/727ad81) to [1442143](https://github.com/E3SM-Project/E3SM/tree/1442143).

This update includes the following MPAS-Ocean and MPAS-Frameworks PRs (check mark indicates bit-for-bit with previous PR in the list):
- [ ] (ocn) E3SM-Project/E3SM#6509
- [ ] (ocn) E3SM-Project/E3SM#6508
- [ ] (fwk) E3SM-Project/E3SM#6575
- [ ] (ocn) E3SM-Project/E3SM#6590
- [ ] (fwk) E3SM-Project/E3SM#6643
- [ ] (ocn) E3SM-Project/E3SM#6656
- [ ] (ocn) E3SM-Project/E3SM#6672
- [ ] (ocn) E3SM-Project/E3SM#6659
- [ ] (ocn) E3SM-Project/E3SM#6497
- [ ] (ocn) E3SM-Project/E3SM#6485
- [ ] (ocn) E3SM-Project/E3SM#6566
Currently, the ocean conservation check analysis member overwrites the first entry in the file with a zero after restarts for some variables. This PR corrects this behavior so that the first day's entry in a monthly conservation check file is identical between continuous runs and a run with a restart break.
Fixes #6642
[NML] for some mpaso resolutions
[BFB] for all tested files