Concurrency
NOTE: this design is not going to be implemented in this form. I just don't have the time and effort available to re-do the relevant bits in reactive form. A thread-based design was adopted instead in https://github.com/ewxrjk/rsbackup/commit/1561eb96bb4322bc30dbb1fe07b962a4bbbea8a8.
We currently parallelize removals; it would be nice to do the same for backups, liveness checking, etc.
The current order of things happening is found in https://github.com/ewxrjk/rsbackup/blob/master/src/MakeBackup.cc.
Backups first:
- live/HOST: Once per host, do a liveness check for the host.
- mounted/HOST/VOLUME: Once per volume, test whether the volume is mounted.
- flag/HOST/VOLUME: Once per volume, test whether the volume's flag file is present.
- Identify devices. This involves two steps:
  - pre-access-hook: Once globally, run `pre-access-hook`.
  - check-stores: Once globally, check all store paths for valid devices.
- pre-backup-hook/HOST/VOLUME/DEVICE: Once per (volume, device), run `pre-backup-hook`.
- backup/HOST/VOLUME/DEVICE: Once per (volume, device), make a backup.
- underway/HOST/VOLUME/DEVICE: Once per (volume, device), store an 'underway' backup result.
- post-backup-hook/HOST/VOLUME/DEVICE: Once per (volume, device), run `post-backup-hook`.
- record-backup/HOST/VOLUME/DEVICE: Once per (volume, device), store the backup result.
Pruning:
- find-prunable: Identify prunable backups and update database.
- Identify devices. See 3-5 above.
- remove-prunable/HOST/VOLUME/DEVICE: Remove prunable backups.
- log-prunable: Update database.
- expire-prune-logs: Expire pruning logs.
Finally:
- post-access-hook: Once globally, run `post-access-hook`.
The step 8/10 behaviour is a bit odd. A backup can succeed but end up recorded as 'underway' only because `post-backup-hook` fails, or not be recorded at all despite the possibility of an 'underway' state. This seems like a bug, leading to incomplete backups not reliably being pruned.
Work to do:
- Expand retire/pruning details.
- Define the ordering requirements between the individual actions.
- Define the resources required by individual actions. NB may want to globally quiesce system for certain things e.g. pre-/post-backup-hook-* to reduce risk of LVM races.
- Expand `ActionList` to support the ordering requirements.
- ...
- Profit!
- A < B means A must complete before B can start
- Normally A failing means B can't run; the exceptions are mentioned explicitly in the list below.
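The two rules above can be sketched as a small readiness check. This is only an illustration in Python, not rsbackup's actual code: the names `State`, `Dependency`, `even_if_fails` and `can_start` are hypothetical, and the sketch assumes that an action with no recorded state is still pending.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    PENDING = "pending"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass(frozen=True)
class Dependency:
    """Encodes 'before < after' (hypothetical representation)."""
    before: str
    after: str
    even_if_fails: bool = False  # True only for the explicitly listed exceptions


def can_start(action, deps, states):
    """Return True if every predecessor of `action` permits it to start."""
    for dep in deps:
        if dep.after != action:
            continue
        state = states.get(dep.before, State.PENDING)
        if state is State.PENDING:
            return False  # A must complete before B can start
        if state is State.FAILED and not dep.even_if_fails:
            return False  # A failed, so B cannot run
    return True


deps = [
    Dependency("pre-backup-hook", "backup"),
    Dependency("backup", "post-backup-hook", even_if_fails=True),
]
states = {"pre-backup-hook": State.SUCCEEDED, "backup": State.FAILED}
```

With these states, `can_start("post-backup-hook", deps, states)` is true even though `backup` failed, because that edge carries the exception. The sketch deliberately leaves open what happens to an action that can never start (skipped, or marked failed).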
Actions not listed above:
- identify/HOST/VOLUME: figure out whether to backup this volume
- identify/HOST/VOLUME/DEVICE: ...to this device
- find-prunable/HOST/VOLUME: determine which backups are prunable
Relationships between actions:
- live/HOST < mounted/HOST/VOLUME
- mounted/HOST/VOLUME < flag/HOST/VOLUME
- flag/HOST/VOLUME < identify/HOST/VOLUME
- pre-access-hook < check-stores
- check-stores < identify/HOST/VOLUME/DEVICE
- identify/HOST/VOLUME < identify/HOST/VOLUME/DEVICE
- identify/HOST/VOLUME/DEVICE < pre-backup-hook/HOST/VOLUME/DEVICE
- pre-backup-hook/HOST/VOLUME/DEVICE < backup/HOST/VOLUME/DEVICE
- backup/HOST/VOLUME/DEVICE < post-backup-hook/HOST/VOLUME/DEVICE (even if backup/ fails)
- post-backup-hook/HOST/VOLUME/DEVICE < record-backup/HOST/VOLUME/DEVICE (even if post-backup-hook/ fails)
- record-backup/HOST/VOLUME/DEVICE < find-prunable/HOST/VOLUME/DEVICE
- backup/HOST/VOLUME/DEVICE < post-access-hook (even if backup/ fails)
- check-stores < remove-prunable/HOST/VOLUME/DEVICE
- find-prunable/HOST/VOLUME/DEVICE < remove-prunable/HOST/VOLUME/DEVICE/DATE
- remove-prunable/HOST/VOLUME/DEVICE/DATE < log-prunable (even if remove-prunable/ fails)
- log-prunable < expire-prune-logs (even if log-prunable fails)
- remove-prunable/HOST/VOLUME/DEVICE/DATE < post-access-hook (even if remove-prunable/ fails)
If a parameterized name is related to a less-parameterized name, then the relation is implicitly true for all possible values of the missing parameter(s).
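One possible reading of this rule, sketched in Python: a relation between two templates expands to every pair of concrete actions whose shared parameters agree, while parameters missing on one side are unconstrained. The helpers `parse`, `bind` and `expand` are hypothetical, and the sketch assumes that host, volume and device names contain no `/` and that each concrete action has exactly as many components as its template.

```python
def parse(template):
    """Split 'mounted/HOST/VOLUME' into ('mounted', ('HOST', 'VOLUME'))."""
    name, *params = template.split("/")
    return name, tuple(params)


def bind(template, concrete):
    """Map a concrete action onto a template; None if it does not fit."""
    name, params = parse(template)
    cname, *values = concrete.split("/")
    if cname != name or len(values) != len(params):
        return None
    return dict(zip(params, values))


def expand(before, after, actions):
    """List the concrete (a, b) pairs implied by the relation 'before < after'."""
    edges = []
    for a in actions:
        a_bind = bind(before, a)
        if a_bind is None:
            continue
        for b in actions:
            b_bind = bind(after, b)
            if b_bind is None:
                continue
            # Parameters named on both sides must match; the rest are free.
            shared = a_bind.keys() & b_bind.keys()
            if all(a_bind[p] == b_bind[p] for p in shared):
                edges.append((a, b))
    return edges


actions = [
    "live/web", "live/db",
    "mounted/web/home", "mounted/web/etc", "mounted/db/data",
    "check-stores", "identify/web/home/disk1",
]
host_edges = expand("live/HOST", "mounted/HOST/VOLUME", actions)
store_edges = expand("check-stores", "identify/HOST/VOLUME/DEVICE", actions)
```

Here `live/web` precedes both `mounted/web/...` actions but not `mounted/db/data`, and the unparameterized `check-stores` precedes every concrete `identify/HOST/VOLUME/DEVICE` action.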
Maybe log-prunable can be split up a bit.
If the identify-* actions cause later actions to spring into existence, then that makes the successors of those actions tricky to evaluate. A way around this may be to have them "always" exist, implicitly, with the identify-* actions causing them to complete immediately; or alternatively for the output of the identify-* actions to be a parameter controlling whether the successor actions do anything or complete immediately.
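A rough Python sketch of the "always exist, but complete immediately" idea follows. The `Action` class, its `enabled` flag and the `identify` function are hypothetical names for illustration, not part of rsbackup.

```python
class Action:
    """An action that always exists in the graph but may be a no-op."""

    def __init__(self, name, work):
        self.name = name
        self.work = work        # callable that does the real work
        self.enabled = True     # cleared by an identify-* step
        self.ran = False        # True only if the real work was executed
        self.complete = False

    def run(self):
        if self.enabled:
            self.work()
            self.ran = True
        # Completes either way, so successors can still be evaluated.
        self.complete = True


def identify(wanted, successors):
    """Hypothetical identify-* step: its output controls the successors."""
    for action in successors:
        action.enabled = wanted


log = []
backup = Action("backup/web/home/disk1", lambda: log.append("backup"))
record = Action("record-backup/web/home/disk1", lambda: log.append("record"))

# identify decides this (volume, device) pair should not be backed up.
identify(False, [backup, record])
backup.run()
record.run()
```

Both actions end up complete without doing any work, so the dependency graph keeps a fixed shape and only the behaviour of individual nodes changes.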
- https://github.com/ewxrjk/rsbackup/issues/17 - actually just about the way hooks are run
- https://github.com/ewxrjk/rsbackup/issues/26 - availability checking
- https://github.com/ewxrjk/rsbackup/issues/16 - also about hooks