Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cache_vcl: Do the temperature dance for backend creation races #4205

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

nigoroll
Copy link
Member

Dynamic backends may get created while we transition a VCL to the WARM temperature: Because we first set the new temperature and then post the event, backends might get created when the temperature is already WARM, but before the VCL_EVENT_WARM gets posted, which leads to double warming. This can be demonstrated with an appropriate delay during vcl_send_event().

The solution to this problem is simple: Only post WARM events to backends from before the start of the transition.

In code, it looks a little more involved, but it is not complicated:

Under the vcl mutex, we take all directors onto a separate "cold" list and change the temperature, and send the warm event only to the "cold" directors. When all are warm, we concatenate.

To undo, we basically do the inverse: Save the warm directors, put the cold ones back, send a cold event to all warm ones and concatenate again.

Fixes #4199

@nigoroll
Copy link
Member Author

nigoroll commented Sep 30, 2024

Sorry, if anyone was real quick they might have caught a little glitch. I force-pushed a correction.

Copy link
Member

@dridi dridi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice if @AlveElde could also have a look since he spent time on the temperature state machine.

bin/varnishd/cache/cache_vcl.h Outdated Show resolved Hide resolved
Dynamic backends may get created while we transition a VCL to the WARM
temperature: Because we first set the new temperature and then post the event,
backends might get created when the temperature is already WARM, but before the
VCL_EVENT_WARM gets posted, which leads to double warming. This can be
demonstrated with an appropriate delay during vcl_send_event().

The solution to this problem is simple: Only post WARM events to backends from
before the start of the transition.

In code, it looks a little more involved, but it is not complicated:

Under the vcl mutex, we take all directors onto a separate "cold" list and
change the temperature, and send the warm event only to the "cold" directors.
When all are warm, we concatenate.

To undo, we basically do the inverse: Save the warm directors, put the cold ones
back, send a cold event to all warm ones and concatenate again.

Fixes varnishcache#4199
@nigoroll nigoroll force-pushed the directors-dancing-on-a-peltier-element branch from 62db805 to f2ceb35 Compare October 7, 2024 09:35
@nigoroll
Copy link
Member Author

nigoroll commented Oct 7, 2024

bugwash: leave more elaborate explanation on the race here

@nigoroll
Copy link
Member Author

nigoroll commented Oct 7, 2024

@bsdphk as promised, the more elaborate explanation of the race:

Lck_Lock(&vcl_mtx);
vcl->temp = VCL_TEMP_WARM;
Lck_Unlock(&vcl_mtx);

when, after this point, VRT_AddDirector() gets called from another thread, the new director is added to the director list and the first VCL_EVENT_WARM is posted here:

if (temp != VCL_TEMP_COOLING)
VTAILQ_INSERT_TAIL(&vcl->director_list, vdir, list);
if (temp->is_warm)
VDI_Event(vdir->dir, VCL_EVENT_WARM);

Then, the second event gets posted from here:

vcl_BackendEvent(vcl, VCL_EVENT_WARM);

Regarding your question on IRC:

(15:50:38) phk: also, new backends are added to the tail of the list, so we could just break the loop first time we see an already warm backend, no ?

Yes, I think this would be possible, if we added the current temperature to directors. Yet it would conflict with another goal, namely to not hold the vcl_mtx while posting the actual event in vcl_BackendEvent(). The reason for this is #4140, addressed by #4142: We have a deadlock-by-design issue when deleting dynamic backends.

So, in short: Yes, we could avoid this problem dancing less but deadlocking more.

Comment on lines 642 to 645
Lck_Lock(&vcl_mtx);
VTAILQ_SWAP(&vcl->director_list, &colddirs, vcldir, list);
vcl->temp = VCL_TEMP_WARM;
Lck_Unlock(&vcl_mtx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it safe to empty the vcl->director_list at any point? If a VMOD makes a VRT_DelDirector() call for a director that existed before this list swap, the VTAILQ_REMOVE() in vcldir_retire() does nothing and we get a use-after-free when swapping the list back:

Lck_Lock(&vcl_mtx);
temp = vdir->vcl->temp;
VTAILQ_REMOVE(&vdir->vcl->director_list, vdir, list);
Lck_Unlock(&vcl_mtx);

Maybe I am missing a step in the temperature dance that would make this scenario impossible, but I think it at least breaks the assumption that once a director is added to the director_list it will be there until you remove it (and its refcount reaches 0).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @AlveElde, you are right that this is brittle.

Because VTAILQ_REMOVE(head, elm, field) only touches head if elm is last in the list, this would only affect the case where the last director added before the swap is removed after the concat, but yes, it is another race. Less relevant, but still a race.

This is not the only place where removing directors is a problem, which is why I suggested a solution in #4142.

Maybe it's best to bring that one in first and then update this PR accordingly.

@nigoroll nigoroll marked this pull request as draft October 10, 2024 09:49
@nigoroll
Copy link
Member Author

Set to draft because #4142 should be done first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Panic on vcl.load: Wrong turn at cache/cache_backend_probe.c:710: running
3 participants