Merge pull request #573 from feathern/checktime2
Checkpointing interval can now be specified in minutes.
jorafb authored Aug 16, 2024
2 parents 6e4f07b + feb4810 commit cbdfecc
Showing 4 changed files with 72 additions and 21 deletions.
@@ -12,6 +12,8 @@
Number of iterations between successive checkpoint outputs. Default value is -1 (no checkpointing).
**check_frequency**
(deprecated) Same as checkpoint_interval.
**checkpoint_minutes**
Time in minutes between successive numbered checkpoints. If this variable is set to a positive value (default is -1), the value of checkpoint_interval will be ignored.
**quicksave_interval**
Number of iterations between successive quicksave outputs. Default value is -1 (no quicksaves).
**num_quicksaves**
22 changes: 21 additions & 1 deletion doc/source/User_Guide/run_rayleigh.rst
@@ -152,7 +152,7 @@ points for a given run. These files begin with an 8-digit prefix
indicating the time step at which the checkpoint was created.

The frequency with which standard checkpoints are generated can be
controlled by modifying the **checkpoint_interval`` variable in the
controlled by modifying the **checkpoint_interval** variable in the
``temporal_controls_namelist``. For example, if you want to generate a
checkpoint once every 50,000 time steps, you would modify your
``main_input`` file to read:
@@ -166,6 +166,17 @@ checkpoint once every 50,000 time steps, you would modify your
The default value of checkpoint_interval is 1,000,000, which is
typically much larger than what you will use in practice.

Alternatively, you can specify the time in minutes between successive checkpoints.
To do so, set the ``checkpoint_minutes`` variable:

::

&temporal_controls_namelist
checkpoint_minutes= 30.0d0 ! Save a checkpoint once every half hour.
/

If the ``checkpoint_minutes`` variable is set to a positive value in ``main_input``, any value set for ``checkpoint_interval`` will be ignored.
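
As an illustrative sketch (the specific values are assumptions, not recommendations), a ``main_input`` file that sets both variables might read as follows. Because ``checkpoint_minutes`` is positive, the ``checkpoint_interval`` entry has no effect:

::

   &temporal_controls_namelist
    checkpoint_interval = 50000   ! Ignored, because checkpoint_minutes is positive.
    checkpoint_minutes  = 30.0d0  ! Numbered checkpoint roughly every 30 minutes.
   /

Judging from the test in ``IsItTimeForACheckpoint``, the elapsed time appears to be compared against the requested interval each time that routine is called, so a checkpoint would be written on the first time step after the interval has passed rather than at an exact clock time.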

Restarting from a checkpoint is accomplished by first assigning a value
of -1 to the variables ``init_type`` and/or ``magnetic_init_type`` in
the ``initial_conditions_namelist``. In addition, the time step from
@@ -261,6 +272,15 @@ the quicksave_02 files will be overwritten, and so forth. Because the
``num_quicksaves`` was set to 2, filenames with prefix quicksave_03 will
never be generated.

As with numbered checkpoints, the number of minutes between successive quicksaves
can be specified as an alternative to ``quicksave_interval``. To do so, set the ``quicksave_minutes`` variable:

::

&temporal_controls_namelist
quicksave_minutes= 15.0d0 ! Create a quicksave once every 15 minutes.
/
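
The time-based settings for numbered checkpoints and quicksaves can presumably be combined. A minimal sketch, with illustrative values that are assumptions rather than recommendations:

::

   &temporal_controls_namelist
    checkpoint_minutes = 60.0d0  ! Numbered checkpoint about once per hour.
    quicksave_minutes  = 15.0d0  ! Quicksave about every 15 minutes.
    num_quicksaves     = 2       ! Cycle through quicksave_01 and quicksave_02.
   /

In the version of ``IsItTimeForACheckpoint`` shown in this commit, the quicksave test only runs when no numbered checkpoint is due on that time step, so numbered checkpoints take precedence.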

Note that checkpoints beginning with an 8-digit prefix (e.g., 00035000)
are still written to disk regularly and are not affected by the
quicksave checkpointing. On time steps where a quicksave and a standard
66 changes: 47 additions & 19 deletions src/Physics/Checkpointing.F90
@@ -55,9 +55,11 @@ Module Checkpointing
Logical :: ItIsTimeForACheckpoint = .false.
Logical :: ItIsTimeForAQuickSave = .false.
Integer :: quicksave_num = -1
Real*8 :: checkpoint_t0 = 0.0d0
Real*8 :: checkpoint_elapsed = 0.0d0 ! Time elapsed since checkpoint_t0
Real*8 :: checkpoint_t0 = 0.0d0 ! Time of last checkpoint

Real*8 :: quicksave_t0 = 0.0d0 ! Time of last quicksave
Real*8 :: quicksave_seconds = -1 ! Time between quick saves
Real*8 :: checkpoint_seconds = -1 ! Time between checkpoints

Type(Cheby_Transform_Interface) :: cheby_info

@@ -89,6 +91,11 @@ Subroutine Initialize_Checkpointing()
quicksave_seconds = quicksave_minutes*60
quicksave_interval = -1
Endif

If (checkpoint_minutes .gt. 0) Then
checkpoint_seconds = checkpoint_minutes*60
checkpoint_interval = -1
Endif

numfields = 4 + n_active_scalars + n_passive_scalars
if (magnetism) then
@@ -797,18 +804,42 @@ End Subroutine Read_Checkpoint
Subroutine IsItTimeForACheckpoint(iter)
Implicit None
Integer, Intent(In) :: iter
Real*8 :: elapsed_seconds

! A checkpoint or quicksave is triggered when the specified
! number of timesteps or seconds has passed.
! Only one numbered checkpoint or one quicksave is written per call.
! Numbered checkpoints take precedence.

ItIsTimeForACheckpoint = .false.
ItIsTimeForAQuickSave = .false.
If (Mod(iter,checkpoint_interval) .eq. 0) Then
ItIsTimeForACheckpoint = .true.
checkpoint_t0 = checkpoint_elapsed ! quicksaves not written
checkpoint_elapsed = 0.0d0
!If the long interval check is satisfied, nothing,
! nothing related to the short interval is executed.
Else


! First, check to see if we should write a numbered checkpoint.

! Time since last numbered checkpoint
elapsed_seconds = global_msgs(2) - checkpoint_t0

If (checkpoint_seconds .gt. 0) Then

If (elapsed_seconds .gt. checkpoint_seconds) Then
checkpoint_t0 = global_msgs(2)
ItIsTimeForACheckpoint = .true.
Endif

Else If (checkpoint_interval .gt. 0) Then

If (Mod(iter,checkpoint_interval) .eq. 0) Then
ItIsTimeForACheckpoint = .true.
checkpoint_t0 = global_msgs(2)
Endif

Endif

! If a numbered checkpoint was not written, check to
! see if a quicksave should be written.
If (.not. ItIsTimeForACheckpoint) Then

!Check for quick-save status. This will be based on either iteration #
! OR on the time since the last quicksave

If (quicksave_interval .gt. 0) Then
If (Mod(iter,quicksave_interval) .eq. 0) Then
@@ -820,19 +851,16 @@ Subroutine IsItTimeForACheckpoint(iter)
Endif
Endif

If (quicksave_seconds .gt. 0) Then
checkpoint_elapsed = global_msgs(2) - checkpoint_t0
If (checkpoint_elapsed .gt. quicksave_seconds) Then

checkpoint_t0 = global_msgs(2)
checkpoint_elapsed = 0.0d0
If (quicksave_seconds .gt. 0) Then
!Time since last quicksave
elapsed_seconds = global_msgs(2) - quicksave_t0
If (elapsed_seconds .gt. quicksave_seconds) Then
quicksave_t0 = global_msgs(2)
ItIsTimeForACheckpoint = .true.
ItIsTimeForAQuickSave = .true.
quicksave_num = quicksave_num+1
quicksave_num = Mod(quicksave_num,num_quicksaves)

Endif

Endif

Endif
3 changes: 2 additions & 1 deletion src/Physics/Controls.F90
@@ -108,6 +108,7 @@ Module Controls
Integer :: quicksave_interval = -1 ! Number of iterations between quicksave dumps
Integer :: num_quicksaves = 3 ! Number of quick-save checkpoints to write before rolling back to #1
Real*8 :: quicksave_minutes = -1.0d0 ! Time in minutes between quick saves (overrides quicksave interval)
Real*8 :: checkpoint_minutes = -1.0d0 ! Time in minutes between checkpoints (overrides checkpoint_interval)

Real*8 :: cflmax = 0.6d0, cflmin = 0.4d0 ! Limits for the cfl condition
Real*8 :: max_time_step = 1.0d0 ! Maximum timestep to take, whatever CFL says (should always specify this in main_input file)
@@ -118,7 +119,7 @@ Module Controls
& cflmax, cflmin, max_time_step, diagnostic_reboot_interval, min_time_step, &
& num_quicksaves, quicksave_interval, checkpoint_interval, quicksave_minutes, &
& max_time_minutes, save_last_timestep, new_iteration, save_on_sigterm, &
& max_simulated_time
& max_simulated_time, checkpoint_minutes


