vbp: Rework probe state management #4115
After a first review this looks good to me.
What I really appreciate is the first commit acting as an assessment of the current informal state machine, with a diff showing the root cause of #4108 in the third commit.
I have one nitpick and a cosmetic suggestion.
@dridi I took your suggestions for the next force-push, thank you!
another dridification round done
During extended bugwash, @bsdphk suggested adding a cluster subgraph to the graphviz file to mark the possible states while a probe task is running, but I could not get it to improve clarity:
diff --git a/doc/graphviz/cache_backend_probe.dot b/doc/graphviz/cache_backend_probe.dot
index 9f289b162..6b93e2b6b 100644
--- a/doc/graphviz/cache_backend_probe.dot
+++ b/doc/graphviz/cache_backend_probe.dot
@@ -3,9 +3,11 @@
digraph cache_backend_probe {
ALLOC
scheduled
- running
+ subgraph cluster_vbptask {
+ running
+ cooling # going cold while task runs
+ }
cold
- cooling # going cold while task runs
deleted # from cooling, removed while task runs
FREE
I did add a commit with a rename he suggested.
ftr, this was approved by today's bugwash
The state is not used yet other than for assertions.
FTR: The first dridification round had introduced a regression. The change was:
@@ -457,20 +457,16 @@ vbp_task_complete(struct vbp_target *vt)
assert(vt->heap_idx == VBH_NOIDX);
- if (vt->state == vbp_state_scheduled) {
- WRONG("vbp_state_scheduled");
- } else if (vt->state == vbp_state_running) {
+ if (vt->state == vbp_state_running) {
vt->state = vbp_state_scheduled;
vt->due = VTIM_real() + vt->interval;
vbp_heap_insert(vt);
vt = NULL;
- } else if (vt->state == vbp_state_cold) {
- WRONG("vbp_state_cold");
} else if (vt->state == vbp_state_cooling) {
vt->state = vbp_state_cold;
vt = NULL;
} else {
- assert(vt->state == vbp_state_deleted);
+ WRONG(vt->state->name);
}
return (vt);
}
here, I missed to accept
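For readers following the diff above, here is a minimal sketch of the completion handling with all three states that can be seen when a task finishes. It is not the Varnish code: `struct probe`, `enum probe_state`, and `probe_task_complete()` are hypothetical, reduced stand-ins, and the heap insertion from the real function is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the probe states named in this PR. */
enum probe_state {
	PROBE_COLD,
	PROBE_SCHEDULED,
	PROBE_RUNNING,
	PROBE_COOLING,
	PROBE_DELETED
};

/* Hypothetical, heavily reduced probe target. */
struct probe {
	enum probe_state	state;
	double			due;
	double			interval;
};

/*
 * Called when a probe task finishes.
 * Returns the probe if the caller has to free it, NULL otherwise.
 */
struct probe *
probe_task_complete(struct probe *p, double now)
{
	switch (p->state) {
	case PROBE_RUNNING:
		/* normal case: schedule the next run */
		p->state = PROBE_SCHEDULED;
		p->due = now + p->interval;
		return (NULL);
	case PROBE_COOLING:
		/* disabled while the task was running */
		p->state = PROBE_COLD;
		return (NULL);
	case PROBE_DELETED:
		/* removed while the task was running: hand back for cleanup */
		return (p);
	default:
		/* cold and scheduled must not be seen here */
		abort();
	}
}
```

In this sketch, a probe in the deleted state is a legitimate outcome and is handed back to the caller, while only cold and scheduled are treated as programming errors.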
Every time I looked at the probe code, my mind ended up twisted and confused. A probe could change the "enabled" state (tracking the temperature) and be removed at any time (unless the mtx is held), yet the code did not seem to reflect this.

We un-twist my mind by completing the transition to probe states and adding a chain of two states for the case that a probe is controlled/deleted while its task is running:

cooling: running probe disabled
deleted: running probe removed (while cooling only)

With this new scheme, we can now have (I think) a clean state diagram (see dot file):

- a probe begins in the cold state
- from cold, it can either get removed or scheduled via VBP_Control()
- from scheduled, it can go back to cold (via VBP_Control()) or be picked up by vbp_thread() to change to running (aka task started)
- once the task finishes, it normally goes back to scheduled, but in the meantime it could have changed to cooling or deleted, so vbp_task_complete() handles these cases and either transitions to cold or deletes the probe
- if the task can not be scheduled, the same handling happens

We now also remove running probes from the binheap to remove complexity.

Fixes varnishcache#4108 for good
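The transitions listed in this commit message can be written down as a lookup table, which may help when checking the diagram against the code. This is only a sketch: the enum, the table, and `probe_transition_ok()` are hypothetical names, and the table contains only the transitions named in the text, so the actual dot file may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* States from the description; ST_ALLOC and ST_FREE mark creation and release. */
enum probe_state {
	ST_ALLOC,
	ST_COLD,
	ST_SCHEDULED,
	ST_RUNNING,
	ST_COOLING,
	ST_DELETED,
	ST_FREE,
	ST_MAX
};

/* allowed[from][to] is true for every transition named in the text. */
static const bool allowed[ST_MAX][ST_MAX] = {
	[ST_ALLOC][ST_COLD] = true,		/* a probe begins cold */
	[ST_COLD][ST_SCHEDULED] = true,		/* enabled via control */
	[ST_COLD][ST_FREE] = true,		/* removed while cold */
	[ST_SCHEDULED][ST_COLD] = true,		/* disabled via control */
	[ST_SCHEDULED][ST_RUNNING] = true,	/* task started */
	[ST_RUNNING][ST_SCHEDULED] = true,	/* task finished normally */
	[ST_RUNNING][ST_COOLING] = true,	/* disabled while running */
	[ST_COOLING][ST_COLD] = true,		/* task finished after disable */
	[ST_COOLING][ST_DELETED] = true,	/* removed while cooling */
	[ST_DELETED][ST_FREE] = true,		/* task finished after removal */
};

/* Returns true if the text names a direct transition from -> to. */
bool
probe_transition_ok(enum probe_state from, enum probe_state to)
{
	if ((unsigned)from >= ST_MAX || (unsigned)to >= ST_MAX)
		return (false);
	return (allowed[from][to]);
}
```

With this table, a direct running to deleted step is rejected, matching the note that deleted is only reached while cooling.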
This background thread does not run the actual probes, but schedules tasks which do (vbp_task). Rename suggested by phk.
FWIW, I'm (rarely) getting the following on reload (trying to create a MWE, but it's difficult):
Also got this once (same backtrace, but the state this time is scheduled):
(Edit: finally managed to track it down to a MWE. See #4199)
I will be away until July, but I would appreciate reviews and collection of feedback in the meantime. Also, for the unlikely event that everyone is absolutely happy about it, I would also not oppose a merge.
Every time I looked at the probe code in the past, my mind ended up twisted and confused. A probe could change the "enabled" state (tracking the temperature) or be removed at any time (unless the mtx is held), yet the code did not seem to reflect this.

We un-twist my mind by implementing probe states:

- cold: reflects cold backend temperature
- scheduled: probe is on binheap, waiting for its time to come
- running: a task has been started to run the probe
- cooling: running probe disabled
- deleted: running probe removed (while cooling only)

With this new scheme, we can now have (I think) a clean state diagram (see dot file):

- a probe begins in the cold state
- from cold, it can either get removed or scheduled via VBP_Control(..., 1)
- from scheduled, it can go back to cold (via VBP_Control(..., 0)) or be picked up by vbp_thread() to change to running (aka task started)
- once the task finishes, it normally goes back to scheduled, but in the meantime it could have changed to cooling or further on to deleted, so vbp_task_complete() handles these cases and either transitions to cold or deletes the probe

We now also remove running probes from the binheap to remove complexity. I am not entirely sure if there could have been a good reason for keeping running probes on the binheap, so if this is the case, we might want to reconsider this change. But it is not obvious to me how deleting and reinserting just to delete and reinsert later should be better than deleting and reinserting later.
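To illustrate the binheap point, here is a minimal sketch of the "remove on start, reinsert on completion" scheme using a small array-based min-heap keyed on the due time. It is not the Varnish binheap API: `struct heap`, `heap_insert()`, `heap_pop()`, `task_start()`, `task_done()`, and `NOIDX` are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

#define HEAP_MAX	16
#define NOIDX		((size_t)-1)

struct probe {
	double	due;
	double	interval;
	size_t	heap_idx;	/* NOIDX while not on the heap */
};

struct heap {
	struct probe	*slot[HEAP_MAX];
	size_t		n;
};

static void
heap_set(struct heap *h, size_t i, struct probe *p)
{
	h->slot[i] = p;
	p->heap_idx = i;
}

void
heap_insert(struct heap *h, struct probe *p)
{
	size_t i;

	assert(h->n < HEAP_MAX);
	assert(p->heap_idx == NOIDX);
	i = h->n++;
	/* sift up: move later-due parents down */
	while (i > 0 && h->slot[(i - 1) / 2]->due > p->due) {
		heap_set(h, i, h->slot[(i - 1) / 2]);
		i = (i - 1) / 2;
	}
	heap_set(h, i, p);
}

struct probe *
heap_pop(struct heap *h)
{
	struct probe *top, *last;
	size_t i = 0, c;

	if (h->n == 0)
		return (NULL);
	top = h->slot[0];
	last = h->slot[--h->n];
	top->heap_idx = NOIDX;
	if (h->n > 0) {
		/* sift down: find the place for the former last element */
		for (;;) {
			c = 2 * i + 1;
			if (c >= h->n)
				break;
			if (c + 1 < h->n && h->slot[c + 1]->due < h->slot[c]->due)
				c++;
			if (h->slot[c]->due >= last->due)
				break;
			heap_set(h, i, h->slot[c]);
			i = c;
		}
		heap_set(h, i, last);
	}
	return (top);
}

/* Start the next due probe, if any: it leaves the heap while running. */
struct probe *
task_start(struct heap *h, double now)
{
	if (h->n == 0 || h->slot[0]->due > now)
		return (NULL);
	return (heap_pop(h));
}

/* Task finished normally: compute the next due time and reinsert once. */
void
task_done(struct heap *h, struct probe *p, double now)
{
	assert(p->heap_idx == NOIDX);
	p->due = now + p->interval;
	heap_insert(h, p);
}
```

In this sketch each probe run costs one delete and one insert, and a running probe can be recognized by its `heap_idx` being `NOIDX`.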
Written to fix #4108 for good