CICE in ESM1.5 can repeatedly use the same restart file #538
Comments
There are a couple of things we could do:
Other? |
Thanks @anton-seaice! It looks like payu is already making a new restart pointer in the setup stage, based on the latest Lines 166 to 187 in 27aac37
Do you think it would also work if we instead used the run start date when writing the restart pointer at the start of the run? I guess this is similar to your second suggestion. |
I guess when payu finds the latest restart file, it could check that the filename matches the start date? Or possibly better: payu determines the correct filename based on the start date, and checks that the restart file exists before creating an ice restart pointer file (or checks that the ice restart pointer has the same date in it?) |
Sorry - we have to use the restart pointer file (my option 2. above is not easily possible) |
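A minimal sketch of the check being discussed here: build the expected restart filename from the run start date, then confirm that the pointer file names it and that the file exists. The function names and arguments are hypothetical and are not payu's actual API; only the `iced.YYYYMMDD` naming pattern and the `ice.restart_file` pointer name come from the issue description.

```python
import datetime
from pathlib import Path


def expected_restart_name(start_date):
    """Build the iced.YYYYMMDD name for a run start date."""
    return f"iced.{start_date.year:04d}{start_date.month:02d}{start_date.day:02d}"


def check_restart_pointer(restart_dir, start_date, pointer_name="ice.restart_file"):
    """Raise if the pointer does not name an existing restart for start_date."""
    restart_dir = Path(restart_dir)
    expected = expected_restart_name(start_date)
    # The pointer holds a relative path such as ./iced.01010101
    pointed = Path((restart_dir / pointer_name).read_text().strip()).name
    if pointed != expected:
        raise ValueError(f"Pointer names {pointed}, expected {expected}")
    if not (restart_dir / expected).is_file():
        raise FileNotFoundError(f"Missing restart file {expected}")
    return restart_dir / expected
```

Raising an error instead of silently rewriting the pointer would make a stale restart show up at setup time, before the model runs.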
Looks like there are a couple of good options:
> import struct
> cicefile = open(cicepath, 'rb')
> header = cicefile.read(24)
> bint, istep0, time, time_forc = struct.unpack('>iidd', header)
> print(time)
3155673600.0
@aidanheerdegen If you have any ideas or preferences, that would be really valuable! |
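The header read quoted above could be wrapped into a small helper along these lines. This is only a sketch: the `'>iidd'` layout is taken from the snippet in this thread, while the function names and the conversion from seconds to a date are assumptions. In particular, the conversion assumes the time counts seconds from 0001-01-01 on a proleptic Gregorian calendar, which may not hold for every configuration.

```python
import datetime
import struct

HEADER_FORMAT = ">iidd"  # big-endian: two 4-byte ints, two 8-byte doubles
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 24 bytes


def read_restart_time(path):
    """Return the model time in seconds stored in the restart header."""
    with open(path, "rb") as cicefile:
        header = cicefile.read(HEADER_SIZE)
    _bint, _istep0, time, _time_forc = struct.unpack(HEADER_FORMAT, header)
    return time


def seconds_to_date(seconds):
    """Assumption: seconds since 0001-01-01, proleptic Gregorian calendar."""
    return datetime.date(1, 1, 1) + datetime.timedelta(seconds=seconds)
```

Under that assumption, the value printed above (3155673600.0 seconds, i.e. 36524 days) converts to 0101-01-01, which could then be compared with the run start date.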
If you let me decide, I would say both. I think the question for Aidan is whether we should read the unformatted binary restart file and check its value, or just assume that the restart filename is correct / consistent. Is there a case where folks intentionally need to restart from a different date than the model date (and don't change it in the binary restart file)? |
Good point!
I suspect probably not. When the two don't line up it looks like things can go wrong with the history output, e.g. with the earlier example where it didn't write any history. |
Actually there is, I think. When researchers are doing ensemble runs they sometimes grab restarts using a small time offset, or a time offset of +/- 1 year, and so need to manipulate the date headers so they're correct. This is peripherally touched on in this forum thread: https://forum.access-hive.org.au/t/ensemble-runs-with-access-cm2/1107/3. I wrote a small Fortran program for this purpose when I was working in the CMS team, https://gist.github.com/aidanheerdegen/203af6f6e0a87d1d82704eae9608f099, because the models got out of sync if the time wasn't correct. I think it's a lot cleaner to just perturb the same restarts with some reproducible noise, so I don't think we have to support having incorrect dates in the restart files, and they can be changed in any case, especially if the expected values are printed to STDOUT. |
Is this bit feasible? |
I think it's doable! It's slightly complicated by the
and if
The best I could come up with is something like:
dump_matches_end_date = False
dump_delta = dumpfreq * dumpfreq_n  (as a relativedelta object)
dump_date = start_date
while dump_date < end_date:
    if dump_date == expt_enddate:
        dump_matches_end_date = True
    dump_date = dump_date + dump_delta
if not dump_matches_end_date:
    error out
Would this sort of approach look ok? |
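A runnable sketch of the proposed check, under some loud assumptions: dumps are counted from the run start date, a single `end_date` is used for both the loop bound and the comparison, and only the `y`, `m` and `d` frequencies are handled. The `add_months` helper is a hypothetical standard-library stand-in for a `relativedelta`, and none of these function names exist in payu.

```python
import calendar
import datetime


def add_months(date, months):
    """Shift a date by whole months, clamping the day to the month's length."""
    index = date.year * 12 + (date.month - 1) + months
    year, month = divmod(index, 12)
    day = min(date.day, calendar.monthrange(year, month + 1)[1])
    return datetime.date(year, month + 1, day)


def next_dump(date, dumpfreq, dumpfreq_n):
    """Advance by one dump interval; only 'y', 'm' and 'd' are handled here."""
    if dumpfreq == "y":
        return add_months(date, 12 * dumpfreq_n)
    if dumpfreq == "m":
        return add_months(date, dumpfreq_n)
    if dumpfreq == "d":
        return date + datetime.timedelta(days=dumpfreq_n)
    raise ValueError(f"Unsupported dumpfreq: {dumpfreq!r}")


def dump_matches_end_date(start_date, end_date, dumpfreq, dumpfreq_n=1):
    """True if stepping from start_date in dump intervals lands on end_date."""
    if dumpfreq_n < 1:
        raise ValueError("dumpfreq_n must be at least 1")
    dump_date = start_date
    while dump_date < end_date:
        dump_date = next_dump(dump_date, dumpfreq, dumpfreq_n)
    return dump_date == end_date
```

For example, a one-month run starting on 0101-01-01 passes with `dumpfreq="m"` but fails with `dumpfreq="y"`, which is the situation described in this issue.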
I think so, looks good :) |
Unfortunately, checking the dump dates prior to the run looks a bit more complicated than I originally thought, as the way CICE4 chooses when to write dump files is different to what I'd expected. There are some ACCESS-specific calendar calculations here, and it then sets whether to write a restart at a given time step here. It looks like when
E.g. if we run in monthly segments for 6 months, with
If the run is then continued for 1 year, with
I.e. it wrote a restart when it crossed into the new year, instead of at the end of the year-long simulation. Given the time constraints (with @aidanheerdegen finishing up for the year at the end of this week), what would you @anton-seaice and @aidanheerdegen think of deferring this additional check to next year, and including just the following checks in the current release:
I have a working version of the above updates with unit tests in #539, which would be ready for review if we're happy to delay the |
Yes, that sounds good. I'll review soon :) |
Awesome, thanks @anton-seaice! I'll make a separate issue for adding the |
In the ESM1.5 configurations, we changed the CICE `dumpfreq` parameter from `m` (monthly) to `y` (yearly), reducing the number of restart files produced by CICE.

We'd expect that running ESM1.5 in monthly segments would fail, as there would be no valid CICE restart files for subsequent runs.
This is not the case – instead CICE repeatedly uses the same restart files. The following shows the restart directories produced by `payu run -n 3` with a 1 month runtime:

In each of the above the restart pointer file `ice.restart_file` points to the `./iced.01010101` file.

The `restart_date.nml` files do increment as they are calculated by payu:

In the second and third run, no history is written (perhaps due to time mismatches between the restart file and the calendar file).
It looks like during `archive`, the cice driver reads the pointer file `ice.restart_file` to determine the latest `iced.YYYYMMDD` restart file, and deletes all others. Since CICE didn't update the pointer, I think the original `iced.01010101` is kept and repeatedly used.
payu/payu/models/cice.py
Lines 315 to 327 in 27aac37
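The archive behaviour described above can be illustrated with a rough sketch. This is not the payu code referenced here; the function name and signature are hypothetical, and it only mimics the described logic of keeping the restart named by the pointer and deleting the rest.

```python
from pathlib import Path


def prune_restarts(restart_dir, pointer_name="ice.restart_file"):
    """Keep only the iced.* file named by the pointer; return removed names."""
    restart_dir = Path(restart_dir)
    # The pointer holds a relative path such as ./iced.01010101
    keep = Path((restart_dir / pointer_name).read_text().strip()).name
    removed = []
    for path in sorted(restart_dir.glob("iced.*")):
        if path.name != keep:
            path.unlink()
            removed.append(path.name)
    return removed
```

With this logic, a pointer that CICE never updates means the original restart survives every archive step, so the next run starts from it again.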