This repository has been archived by the owner on Jun 18, 2024. It is now read-only.

scx: Add hotplug sequence number #179

Merged: 5 commits into sched_ext from hotplug_final_pieces on Apr 11, 2024

Conversation

Byte-Lab
Collaborator

We currently have a possibly tricky race w.r.t. hotplug that schedulers
don't have a good way to account for. Once a scheduler has inspected a
host topology, if a hotplug event occurs before a scheduler is attached
and loaded, then the scheduler will have no way of knowing that its view
of the host topology is incorrect. Hotplug events after this are fine,
as we'll either pass the events to the scheduler, or evict the scheduler
directly. But if a hotplug event happens between inspecting the host
topology and attaching the scheduler, we have a problem.

This series addresses this by adding the capability to track a
monotonically increasing sequence number across hotplug events.

The series also includes a few small cleanups: it updates the copyright
in a selftest, makes the comment for the exit_code field more generic to
reflect that exit_code can be set when gracefully exiting from the main
kernel, not just from BPF, and fixes a pr_err message so that it prints
the correct path to the sched_ext sysfs node.

Signed-off-by: David Vernet <[email protected]>
@Byte-Lab Byte-Lab requested a review from htejun April 11, 2024 04:14
@Byte-Lab Byte-Lab force-pushed the hotplug_final_pieces branch 2 times, most recently from a1ec31f to d7afe8f Compare April 11, 2024 17:12
@Byte-Lab
Collaborator Author

Latest push was to update the SCX_HOTPLUG_SQN() macro to be a static inline function instead.

Resolved review threads (outdated): kernel/sched/ext.c (3), tools/testing/selftests/sched_ext/hotplug.c (1)
We currently have a possibly tricky race w.r.t. hotplug that schedulers
don't have a good way to account for. Once a scheduler has inspected a
host topology, if a hotplug event occurs before a scheduler is attached
and loaded, then the scheduler will have no way of knowing that its view
of the host topology is incorrect. Hotplug events _after_ this are fine,
as we'll either pass the events to the scheduler, or evict the scheduler
directly. But if a hotplug event happens between inspecting the host
topology and attaching the scheduler, we have a problem.

To address this, we can use a monotonically increasing hotplug sequence
number that is incremented any time a hotplug event occurs, and expose
it through a sysfs node in /sys/kernel/sched_ext/. Using this, a user
space scheduler can look at the sequence number before loading, and then
compare it to the sequence number during attach to see if a hotplug
event occurred. If so, we can fail to attach, and return to user space.

This patch adds the aforementioned sysfs node. A subsequent patch will
update the struct sched_ext_ops and the attach path to check this value
to ensure that a hotplug event hasn't occurred.

Signed-off-by: David Vernet <[email protected]>
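
The user-space side of this check can be sketched roughly as follows. This is only an illustration, not code from the patch: `read_hotplug_sqn()` and `hotplug_happened_since()` are hypothetical helper names, and the sketch assumes the node prints a single unsigned decimal integer. The path is the one named later in this series.

```c
#include <assert.h>
#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Path of the sysfs node as named in this series. */
#define HOTPLUG_SQN_PATH "/sys/kernel/sched_ext/hotplug_sqn"

/*
 * Hypothetical helper: read one unsigned decimal value from @path into @out.
 * Assumes the node contains a single decimal integer.
 * Returns 0 on success or a negative errno-style value on failure.
 */
static int read_hotplug_sqn(const char *path, uint64_t *out)
{
	FILE *f;
	int ret = 0;

	assert(path && out);

	f = fopen(path, "r");
	if (!f)
		return -errno;
	if (fscanf(f, "%" SCNu64, out) != 1)
		ret = -EINVAL;
	fclose(f);
	return ret;
}

/*
 * Hypothetical helper: compare the current value at @path with @snapshot.
 * Returns 1 if the value changed, 0 if it did not, negative on read failure.
 */
static int hotplug_happened_since(const char *path, uint64_t snapshot)
{
	uint64_t now;
	int ret = read_hotplug_sqn(path, &now);

	if (ret)
		return ret;
	return now != snapshot;
}
```

A scheduler would call `read_hotplug_sqn(HOTPLUG_SQN_PATH, &snap)` before inspecting the topology. In this series the comparison itself is meant to happen in the kernel at attach time, so `hotplug_happened_since()` only illustrates the idea from user space.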
@Byte-Lab Byte-Lab force-pushed the hotplug_final_pieces branch from d7afe8f to bef0337 Compare April 11, 2024 19:54
We'll need to have a hotplug sequence number in struct sched_ext_ops if
we want to enable user space to deterministically detect a hotplug event
between reading a host's topology, and attaching its scheduler.

A prior change added a global hotplug sequence number and exported it
through a sysfs file. This one adds a matching field to struct
sched_ext_ops and makes attach fail if the value the scheduler supplied
there does not match the global sequence number. A subsequent patch will
add tests.

Signed-off-by: David Vernet <[email protected]>
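
As a rough user-space model of the attach-time check described above (not the kernel code itself), the logic might look like the sketch below. Everything here is a stand-in: `demo_ops`, `demo_global_sqn`, `demo_hotplug_event()` and `demo_attach()` are hypothetical names, the `-EBUSY` return value is an arbitrary choice, and treating a zero field as "the scheduler did not opt in" is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Stand-in for the global counter. It starts at 1 here so that 0 can be
 * reserved for "no snapshot taken" (an assumption of this sketch).
 */
static _Atomic uint64_t demo_global_sqn = 1;

/* Stand-in for the single field of struct sched_ext_ops relevant here. */
struct demo_ops {
	uint64_t hotplug_sqn;	/* snapshot taken by user space, 0 if unset */
};

/* Simulates a CPU going online or offline: bump the counter. */
static void demo_hotplug_event(void)
{
	atomic_fetch_add(&demo_global_sqn, 1);
}

/*
 * Simulated attach: refuse if the scheduler supplied a snapshot and the
 * global counter has moved since then. Returns 0 on success.
 */
static int demo_attach(const struct demo_ops *ops)
{
	uint64_t now = atomic_load(&demo_global_sqn);

	assert(ops);
	if (ops->hotplug_sqn && ops->hotplug_sqn != now)
		return -EBUSY;	/* hypothetical error code */
	return 0;
}
```

With this model, a scheduler that snapshots the counter and then attaches with no intervening `demo_hotplug_event()` succeeds, while one whose snapshot is stale is rejected and can re-read the topology and retry.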
Now that we have the hotplug sequence number, schedulers can set the
sequence number when opening the skeleton to detect hotplug events. In
order to provide backwards compatibility and avoid excess boilerplate,
let's add a new SCX_OPS_OPEN() macro that encapsulates this for the
caller.

In addition, we add an SCX_HOTPLUG_SQN() macro that can be used to read
the current global sequence number from
/sys/kernel/sched_ext/hotplug_sqn. This is called by SCX_OPS_OPEN() when
running on a kernel with hotplug sqn support.

Signed-off-by: David Vernet <[email protected]>
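
The backwards-compatibility idea can be illustrated with a small sketch. This is not the SCX_OPS_OPEN() implementation: `demo_ops` and `demo_ops_open()` are hypothetical, and the sketch detects support simply by checking whether the sysfs file can be opened, which may differ from how compat.h actually decides.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for the single field of struct sched_ext_ops relevant here. */
struct demo_ops {
	uint64_t hotplug_sqn;	/* left at 0 when the kernel lacks support */
};

/*
 * Hypothetical open-time helper: record the current sequence number in
 * @ops if @sqn_path is readable, otherwise leave the field at 0 so the
 * scheduler still loads on kernels without the node.
 * Returns 1 if a snapshot was recorded, 0 otherwise.
 */
static int demo_ops_open(struct demo_ops *ops, const char *sqn_path)
{
	FILE *f;
	uint64_t sqn;
	int ok;

	assert(ops && sqn_path);
	ops->hotplug_sqn = 0;

	f = fopen(sqn_path, "r");
	if (!f)
		return 0;	/* no node: assume no hotplug sqn support */

	ok = fscanf(f, "%" SCNu64, &sqn) == 1;
	fclose(f);
	if (!ok)
		return 0;	/* unexpected contents: skip the check */

	ops->hotplug_sqn = sqn;
	return 1;
}
```

A caller would pass `/sys/kernel/sched_ext/hotplug_sqn` as `sqn_path`. Wrapping this in one helper keeps each scheduler's open path free of the version check, which is the boilerplate the commit message wants to avoid.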
@Byte-Lab
Collaborator Author

Sorry, forgot to change the value in sched_ext_ops, 1 moment.

Now that we have full hotplug sequence number support, as well as the
necessary macros in compat.h, let's extend the hotplug selftest to also
validate that the sequence number can be used to detect hotplug events.

Signed-off-by: David Vernet <[email protected]>
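
To exercise this in a test, something has to trigger a hotplug event. One common way on Linux is to write `0` or `1` to a CPU's `online` file under `/sys/devices/system/cpu/`. The sketch below shows such a helper. `set_cpu_online()` is a hypothetical name, and whether the selftest in this series toggles CPUs exactly this way is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "1" or "0" to a CPU's sysfs online file,
 * e.g. /sys/devices/system/cpu/cpu1/online. On a real system this
 * requires root and a CPU that supports being taken offline.
 * Returns 0 on success or a negative errno-style value on failure.
 */
static int set_cpu_online(const char *online_path, bool online)
{
	FILE *f;
	int ret = 0;

	assert(online_path);

	f = fopen(online_path, "w");
	if (!f)
		return -errno;
	if (fputs(online ? "1" : "0", f) == EOF)
		ret = -EIO;
	/* The write may only fail when the buffer is flushed on close. */
	if (fclose(f) == EOF && !ret)
		ret = -errno;
	return ret;
}
```

A test along these lines would snapshot the sequence number, take a CPU offline with `set_cpu_online(path, false)`, and then expect the attach with the stale snapshot to fail before bringing the CPU back online.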
@Byte-Lab Byte-Lab force-pushed the hotplug_final_pieces branch from bef0337 to 9fabb99 Compare April 11, 2024 19:58
@Byte-Lab
Collaborator Author

This should be good to go

@Byte-Lab Byte-Lab merged commit 71694be into sched_ext Apr 11, 2024
1 check passed
@Byte-Lab Byte-Lab deleted the hotplug_final_pieces branch April 11, 2024 23:42