Skip to content

Commit

Permalink
fix(checkstate): ignore and abort carryover changes (#415)
Browse files Browse the repository at this point in the history
Addresses #414.

Running checks have changes that get persisted by the state-engine. This
means that following a reboot, changes and tasks that are not complete
(not ready) will be resumed. In the case of some managers, such as
checkstate, carryover changes and tasks are an unwanted side effect.

Currently the plan manager will perform the first plan load very early
during startup (before the state-engine is ready, and before StartUp
hooks have been called). The result is that a PlanChanged propagation
will take place at plan load, and force checkstate to inspect the
running checks prematurely.

Checkstate discovers a running change from a previous boot context, and
tries to load its data from cached state, which does not exist.

Gracefully ignore changes for which cached state does not exist, and use
this to identify changes that should be aborted on the first ensure
pass.

Test:

```
<reboot>
59   Hold    today at 23:29 SAST  today at 23:37 SAST  Recover exec check "internet-online"
60   Error   today at 23:37 SAST  today at 23:37 SAST  Perform exec check "internet-online"
:
66   Doing   today at 23:37 SAST  -                    Recover exec check "internet-online"
<reboot>
:
66   Hold    today at 23:37 SAST  today at 23:41 SAST  Recover exec check "internet-online"
:
69   Error   today at 23:41 SAST  today at 23:42 SAST  Perform exec check "internet-online"
:
73   Doing   today at 23:42 SAST  -                    Recover exec check "internet-online"
```
As the test demonstrates, following a reboot, the change (66) is
aborted, and as a result the status is now ```Hold```. A new change is
created which starts as a ```Perform```, and then fails (as expected)
after a while and changes to a ```Recover```.
  • Loading branch information
flotter authored May 2, 2024
1 parent bb8b0a6 commit 02d7034
Showing 1 changed file with 8 additions and 1 deletion.
9 changes: 8 additions & 1 deletion internals/overlord/checkstate/manager.go
Original file line number Diff line number Diff line change
Expand Up @@ -122,7 +122,14 @@ func (m *CheckManager) PlanChanged(newPlan *plan.Plan) {
} else {
configKey = recoverConfigKey{change.ID()}
}
oldConfig := m.state.Cached(configKey).(*plan.Check) // panic if key not present (always should be)
v := m.state.Cached(configKey)
if v == nil {
// Pebble restarted, and this change is a carryover.
change.Abort()
shouldEnsure = true
continue
}
oldConfig := v.(*plan.Check)
existingChecks[oldConfig.Name] = true

newConfig, inNew := newPlan.Checks[details.Name]
Expand Down

0 comments on commit 02d7034

Please sign in to comment.