
Enable controlling MOM6 ocean restart files with a more flexible approach #1976

Closed
BinLiu-NOAA opened this issue Nov 3, 2023 · 39 comments · Fixed by #2085

Comments

@BinLiu-NOAA
Contributor

Description

Currently, when running coupled ufs-weather-model configurations with an ocean component (e.g., MOM6), users can only control the restart file output interval through settings such as restart_n = 24 and restart_option = nhours, meaning every 24 hours. In addition, MOM6 also writes out a restart file at the end of the forecast.

However, for ocean data assimilation (DA) purposes, one needs to output the MOM6 restart files at forecast hour 6 (or hours 3, 6, 9) to provide the first guess for DA.

Meanwhile, it is too expensive (in IO bandwidth/time and disk space) to write out restart files every 6 hours throughout the forecast by setting restart_n=6. It would therefore be beneficial to have more flexible control over when MOM6 ocean restart files are written.

Solution

Extend the current method so that the MOM6 ocean restart file output times can be controlled more flexibly.

Alternatives

N/A.

Related to

@BinLiu-NOAA BinLiu-NOAA added the enhancement New feature or request label Nov 3, 2023
@DeniseWorthen
Collaborator

DeniseWorthen commented Nov 3, 2023

@BinLiu-NOAA I had some ideas about this after we talked. You wrote "restart files at forecast hour 6 (or hours 3, 6, 9)"; could you clarify this? Basically, you need the restart files 3 hours, 6 hours, and 9 hours after initialization, right? So if you start at hour 18, you need hours 21, 24, and 03 (next day). Is that right?

After that, every 6 hours is needed.

@BinLiu-NOAA
Contributor Author

@DeniseWorthen, ideally we would like to output the MOM6 ocean restart files at forecast hour 6 (for ocean DA at this point; forecast hours 3, 6, 9 would be better and will be required in the future), and then at forecast hours 24, 48, 72, 96, 120 (for warm-starting the forecast). Thanks!

@DeniseWorthen
Collaborator

@BinLiu-NOAA I've been able to start prototyping this in CMEPS (alarm initialization is similar in all the non-fv3 component caps). For the case you've listed as 24, 48, 72, 96, 120, would you also need the capability to have interval restarts after hour=120?

@BinLiu-NOAA
Contributor Author

> @BinLiu-NOAA I've been able to start prototyping this in CMEPS (alarm initialization is similar in all the non-fv3 component caps). For the case you've listed as 24, 48, 72, 96, 120, would you also need the capability to have interval restarts after hour=120?

Thanks, @DeniseWorthen! Forecast hour 6 (or hours 3, 6, 9, or even 3, 4, 5, 6, 7, 8, 9) is mainly for data assimilation purposes. The forecast hours 24, 48, 72, 96, 120 listed for restart output are just examples; they basically mean every 24 hours (or 48 hours, or whatever frequency the user/application wants). I believe MOM6 currently also automatically writes out the restart files at the end of the forecast.

@DeniseWorthen
Collaborator

DeniseWorthen commented Dec 21, 2023

@BinLiu-NOAA Please try my feature branch in MOM6 (https://github.com/DeniseWorthen/MOM6/tree/feature/restartfh).

To use it, add a config variable to your OCN attributes listing the forecast hours at which you want additional restarts. For example, the setting of 3,9,15 below will write restarts after 3, 9, and 15 hours, in addition to the restarts written at set intervals.

# OCN #
OCN_model:                      mom6
  (snip)
  restart_fh = 3,9,15
::

For the RT case starting on 2021-3-22-06 the above setting will write non-interval restarts at

20231221 054819.106 INFO             PET150 MOM_cap:(ModelSetRunClock) Restart_Fh at 2021  3 22  9  0  0   0
20231221 054819.106 INFO             PET150 MOM_cap:(ModelSetRunClock) Restart_Fh at 2021  3 22 15  0  0   0
20231221 054819.106 INFO             PET150 MOM_cap:(ModelSetRunClock) Restart_Fh at 2021  3 22 21  0  0   0

These restarts are in addition to those written out on the interval defined with restart_n and restart_option.

To test, I confirmed that the non-interval restarts are identical to those written out by setting restart_n=3.
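
For readers who want to see how the two mechanisms combine, here is a minimal Python sketch (it is not the Fortran in the MOM6 cap). The helper name `restart_hours` and the 24-hour run with a 12-hour interval are hypothetical, and the sketch assumes that an extra hour coinciding with an interval restart results in a single restart.

```python
def restart_hours(run_hours, interval_hours, extra_hours):
    """Return the sorted forecast hours at which a restart would be written.

    Combines interval-based restarts (every interval_hours) with the extra
    forecast hours listed in a restart_fh-style setting. Hypothetical helper.
    """
    # Interval restarts: every interval_hours, up to and including run_hours.
    interval = set(range(interval_hours, run_hours + 1, interval_hours))
    # Extra restarts: keep only hours that fall inside the run.
    extras = {fh for fh in extra_hours if 0 < fh <= run_hours}
    # Assumption: a duplicate hour produces one restart, hence the set union.
    return sorted(interval | extras)


print(restart_hours(24, 12, [3, 9, 15]))  # [3, 9, 12, 15, 24]
```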

@BinLiu-NOAA
Contributor Author

Thanks @DeniseWorthen! @JohnSteffen-NOAA, @binli2337, and @YongzuoLi-NOAA, let's test this new function for HAFS MOM6 coupling and MOM6-3DVAR to confirm it works properly. Thanks!

@BinLiu-NOAA
Contributor Author

Also, @jiandewang for your information about this ongoing work. Thanks!

@jiandewang
Collaborator

> Also, @jiandewang for your information about this ongoing work. Thanks!

Yes, I saw that. Many thanks to @DeniseWorthen for working on this highly demanded feature. It will also be useful for DA when using S2SW. Let me also give it a try with S2SW.

@YongzuoLi-NOAA

YongzuoLi-NOAA commented Dec 27, 2023 via email

@jiandewang
Collaborator

I tried it (with all sorts of different restart file output hours) using one of the S2S cases as a template and confirmed that it worked perfectly.

@DeniseWorthen
Collaborator

DeniseWorthen commented Jan 3, 2024

I'm going to note here (just so I can reference the information at some point) that the only real difficulty in implementing this was understanding that the config variables could only be accessed as character strings. Google then provided me w/ the method to convert the comma-delimited character string to an integer array.

The restriction to character strings is, I believe, the result of using NUOPC_CompAttributeIngest in UFSDriver.F90 to set the component attributes. The NUOPC ref notes that:

Important: Attributes ingested by this method are stored as type character strings, and must be accessed accordingly. Conversion from string into a different data type, e.g. integer or real, is the user's responsibility.
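
The string-to-integer conversion described above can be illustrated in a few lines. This is a Python sketch for illustration only; the actual cap is Fortran, and the function name `parse_restart_fh` is hypothetical.

```python
def parse_restart_fh(value):
    """Convert a comma-delimited attribute string such as "3,9,15" to ints.

    The attribute arrives as a character string, so each comma-separated
    item is converted individually. Empty items are skipped.
    """
    return [int(item) for item in value.split(",") if item.strip()]


print(parse_restart_fh("3,9,15"))  # [3, 9, 15]
```

A non-numeric item raises `ValueError` here; how the real cap reports a malformed attribute is not covered by this sketch.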

@DeniseWorthen
Collaborator

@BinLiu-NOAA Will you be expecting to have this feature committed to the MOM6 repo for your implementation?

@BinLiu-NOAA
Contributor Author

> @BinLiu-NOAA Will you be expecting to have this feature committed to the MOM6 repo for your implementation?

@DeniseWorthen, this feature (flexible MOM6 restart output hours) is not absolutely needed for HAFSv2 upgrade (code freeze end of January, 2024). However, we plan to use this feature in 2024 HAFS real-time parallel experiments (needed in April-June time frame).

With that, if this feature is ready and the change can be committed back to the MOM6 branch used by ufs-weather-model, then it makes sense to me to create PRs and bring this capability into the MOM6 cap and ufs-weather-model (earlier is better, of course).

@JohnSteffen-NOAA and @YongzuoLi-NOAA, wondering if you have had a chance to test this capability.

After that, @jiandewang and @DeniseWorthen, feel free to go ahead to plan the PR and commit process. Thanks!

@DeniseWorthen
Collaborator

@BinLiu-NOAA Thanks for the info. We need to coordinate w/ @jiandewang since I think he has at least one big MOM6 PR that we're waiting to be able to update with before we can push back any changes to GFDL. Jiande, would adding this to the existing changes (the cesm-style names) before pushing back make sense?

@jiandewang
Collaborator

> @BinLiu-NOAA Thanks for the info. We need to coordinate w/ @jiandewang since I think he has at least one big MOM6 PR that we're waiting to be able to update with before we can push back any changes to GFDL. Jiande, would adding this to the existing changes (the cesm-style names) before pushing back make sense?

@DeniseWorthen GFDL has not yet been able to figure out why b4b is not retained on wcoss2. Let's see if they have any update in today's MOM6 meeting. For your flexible restart writing code, it makes sense to add it to the current dev/emc, and I will push it back to main at some stage (hard to tell when, because of the big PR issue on wcoss2). Note I just created a mini PR (NOAA-EMC/MOM6#124) which needs to go into dev/emc.

@jiandewang
Collaborator

jiandewang commented Jan 8, 2024

@DeniseWorthen let me ask the NCAR side to try your branch and see if they have any comments (in the final bi-weekly meeting of 2023, each group shared its ideas and comments on how to make MOM6 PRs go smoothly in 2024. It was mentioned that NCAR and EMC will pre-test before initiating a PR).

@DeniseWorthen
Collaborator

@jiandewang I think if we're going to go ahead and push this, I'd like to make a modification before you send it off to ncar. Could you do a quick test after I make the mod to verify it still compiles and runs?

@jiandewang
Collaborator

> @jiandewang I think if we're going to go ahead and push this, I'd like to make a modification before you send it off to ncar. Could you do a quick test after I make the mod to verify it still compiles and runs?

Sure, I will do that (HERA is down today but I can try on another machine).

@DeniseWorthen
Collaborator

DeniseWorthen commented Jan 8, 2024

@jiandewang Thanks. I pushed the change, which just ensures that if one of the ESMF alarm calls returns an error, it will be caught correctly. I've checked that it compiles, and it should also work.

@jiandewang
Collaborator

@DeniseWorthen HERA was totally full yesterday so I ran on GAEA. It works fine. Let me ask NCAR for a test.

@JohnSteffen-NOAA

@BinLiu-NOAA @YongzuoLi-NOAA @DeniseWorthen

I was able to test the user-specified MOM6 restart capability within the HAFS framework and it works as expected.

The test used the HAFSv2 baseline branch and substituted the feature/restartfh branch of Denise's MOM6 fork before building and running HAFS.

The ufs.configure.mom6.tmp file in the /parm/forecast/regional/ directory was modified to include the ocean attribute "restart_fh = 3,6,9,24".

The cronjob_hafsv2a_baseline.sh was modified to run 27-hour forecasts of the 13L Laura test case for two cycles, 2020082506 and 2020082512.

Output on Orion can be found here:
/work/noaa/hwrf/scrub/jsteff/hafsv2a_baseline_restartfh/2020082512/13L/forecast/RESTART

20200825.150000.MOM.res.nc 20200825.210000.MOM.res.nc
20200825.180000.MOM.res.nc 20200826.120000.MOM.res.nc
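
As a cross-check of the listing above, the following Python sketch derives the expected file names from the cycle and the restart_fh values. It assumes the `YYYYMMDD.HHMMSS.MOM.res.nc` naming pattern seen in the listing; the function name `restart_filenames` is hypothetical and not part of HAFS or MOM6.

```python
from datetime import datetime, timedelta


def restart_filenames(cycle, forecast_hours):
    """Build expected MOM6 restart file names for a cycle like "2020082512".

    Each forecast hour is added to the cycle start time, and the result is
    formatted using the naming pattern observed in the RESTART directory.
    """
    start = datetime.strptime(cycle, "%Y%m%d%H")
    return [
        (start + timedelta(hours=fh)).strftime("%Y%m%d.%H%M%S") + ".MOM.res.nc"
        for fh in forecast_hours
    ]


for name in restart_filenames("2020082512", [3, 6, 9, 24]):
    print(name)
```

For the 2020082512 cycle with restart_fh = 3,6,9,24 this yields the four file names listed above.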

@YongzuoLi-NOAA

YongzuoLi-NOAA commented Jan 11, 2024 via email

@BinLiu-NOAA
Contributor Author

Thanks, @JohnSteffen-NOAA! Great to know it works properly within the HAFS application/workflow as well.

With that, @DeniseWorthen and @jiandewang, please feel free to help to plan/coordinate the merge of this development back so that ufs-weather-model develop branch can use this feature. Much appreciated!

@DeniseWorthen
Collaborator

A question came up in conversation w/ NCAR about MOM6 history files. How (if at all) are you using MOM6 history files when this feature is active? That is, are the history files correctly averaged when restarting at an arbitrary hour?

@jiandewang
Collaborator

copy and paste NCAR's comments here for record:

  • In the event that the MOM6 timesteps don't align with the specified restart intervals, like if the user sets restart_fh to "3,9,15" but the coupling timestep is, let's say, 4 hours, what would be the outcome?
  • My understanding is that the FMS diagnostics module is unable to accurately handle history averaging when the model restarts from a time other than 0:00. How do you address that?

Our tests are passing. I also confirmed that the restart files don't get written if the requested hours are not aligned with the coupling timestep. So perhaps a warning/error could be added to avoid user confusion.

@jiandewang
Collaborator

@DeniseWorthen will you be able to add the warning message Alper mentioned here? Will this work?
if ( mod(restart_fh, coupling_timestep) /= 0 ) .........
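
The proposed check can be sketched as follows, in Python rather than the cap's Fortran. The function name `misaligned_forecast_hours` is hypothetical, and the sketch assumes restart_fh is given in hours while the coupling timestep is given in seconds, so the units are reconciled before taking the remainder.

```python
def misaligned_forecast_hours(forecast_hours, coupling_timestep_s):
    """Return the forecast hours that do not fall on a coupling boundary.

    A forecast hour is aligned when its offset in seconds is an exact
    multiple of the coupling timestep (also in seconds).
    """
    return [
        fh for fh in forecast_hours
        if (fh * 3600) % coupling_timestep_s != 0
    ]


# The case raised in the NCAR comment: restart_fh = 3,9,15, coupling every 4 hours.
print(misaligned_forecast_hours([3, 9, 15], 4 * 3600))  # [3, 9, 15]
# With hourly coupling, every whole forecast hour is aligned.
print(misaligned_forecast_hours([3, 9, 15], 3600))      # []
```

A warning could then be issued for each hour in the returned list.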

@DeniseWorthen
Collaborator

Yes, I can obtain the coupling_timestep from the clock.

@jiandewang
Collaborator

> Yes, I can obtain the coupling_timestep from the clock.

Use force push to avoid extra git commit history (that's what the MOM group prefers).

@BinLiu-NOAA
Contributor Author

@DeniseWorthen and @jiandewang, for HAFS FV3ATM-MOM6 coupling, our coupling time step is currently 6 mins, and we always make sure the history and restart output times are divisible by 6 mins. Meanwhile, for HAFS MOM6 history output, we also choose instantaneous fields instead of time-averaged fields. Hope this information is useful. Thanks!

@DeniseWorthen
Collaborator

@jiandewang I need to re-think my response to Alper's comment about cases where this would not correctly add the extra restarts. We actually control restarts via restart_n and restart_option. You can already set those to values which are not multiples of the coupling frequency; for example, write a MOM6 restart every 90 mins (restart_n=90, restart_option=nminutes) when you couple on the hour. So really you could argue that we need a warning message in that case too.

In this case, I've hard-coded the restart_fh to be in hours, so maybe a message makes sense. But you can also imagine a case (with either restart_n or restart_fh) where you set them such that MOM6 hasn't completed its baroclinic/barotropic timesteps. We've always ensured everything aligns (including span_coupling false), but it is really up to the user to know how to set up their case.
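
The same alignment question for the restart_n / restart_option pair can be sketched in Python. Only the two restart_option values that appear in this thread (nhours, nminutes) are handled; the function names are hypothetical and this is not how the cap or mediator implements alarms.

```python
# Seconds per unit for the restart_option values mentioned in this thread.
SECONDS_PER_UNIT = {"nminutes": 60, "nhours": 3600}


def interval_seconds(restart_n, restart_option):
    """Convert a restart_n / restart_option pair to an interval in seconds."""
    return restart_n * SECONDS_PER_UNIT[restart_option]


def interval_is_aligned(restart_n, restart_option, coupling_timestep_s):
    """True when the restart interval is a multiple of the coupling timestep."""
    return interval_seconds(restart_n, restart_option) % coupling_timestep_s == 0


# 90-minute restarts with hourly coupling: not aligned.
print(interval_is_aligned(90, "nminutes", 3600))  # False
# 24-hour restarts with hourly coupling: aligned.
print(interval_is_aligned(24, "nhours", 3600))    # True
```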

@DeniseWorthen
Collaborator

@jiandewang I've updated the message two ways; I've written it to the stdout log instead of the PET log (since PET logs are most likely off) and added a note for when the extra ones won't be written. I think this is good to go now.

@jiandewang
Collaborator

@DeniseWorthen let me make a test run and get back to you

@jiandewang
Collaborator

@DeniseWorthen works as expected. I now see the following in the "out" file if I set restart_fh = 8,9,10,11,12,20:
150: (MOM_cap:ModelAdvance) writing restart file 20210322.140000.MOM.res
150: (MOM_cap:ModelAdvance) writing restart file 20210322.150000.MOM.res
150: (MOM_cap:ModelAdvance) writing restart file 20210322.160000.MOM.res
150: (MOM_cap:ModelAdvance) writing restart file 20210322.170000.MOM.res
150: (MOM_cap:ModelAdvance) writing restart file 20210322.180000.MOM.res
150: (MOM_cap:ModelAdvance) writing restart file 20210323.020000.MOM.res
150: (MOM_cap:ModelAdvance) writing restart file 20210323.060000.MOM.res

@YongzuoLi-NOAA

YongzuoLi-NOAA commented Jan 16, 2024 via email

@YongzuoLi-NOAA

I will perform HAFS-JEDI MOM6-3DVAR cycles with restart_fh.

@jiandewang
Collaborator

@YongzuoLi-NOAA why do you need to combine the small files into one big file, given that MOM6 can read in the small files? Also, 5x2=10G. I assume the combination will take time since the file size is big.

@YongzuoLi-NOAA

@jiandewang Thanks for letting me know that MOM6 can read multiple MOM.res files as input. HAFS-JEDI MOM6-3DVAR has read a single MOM.res.nc so far. I am not sure how to write the yaml for multiple MOM.res files. Here is the JEDI 3DVAR yaml:
background:
  read_from_file: 1
  basename: ./restarts/
  ocn_filename: MOM.res.nc
  date: DATE
  state variables: [hocn, socn, tocn, ssh, mld, layer_depth]

@jiandewang
Collaborator

@YongzuoLi-NOAA I see and I guess you have to combine them for your case

@YongzuoLi-NOAA

@jiandewang Thank you for the discussion.
