Implement several enhancements to NUMA policies.
Add a new "interleave" allocation policy which stripes pages across domains
with a stride or width keeping contiguity within a multi-page region.

Move the kernel to the dedicated numbered cpuset #2 making it possible to
assign kernel threads and memory policy separately from user. This also
eliminates the need for the complicated interrupt binding code.

Add a sysctl API for viewing and manipulating domainsets. Refactor some
of the cpuset_t manipulation code using the generic bitset type so that it
can be used for both. This probably belongs in a dedicated subr file.

Attempt to improve the include situation.

Reviewed by: kib
Discussed with: jhb (cpuset parts)
Tested by: pho (before review feedback)
Sponsored by: Netflix, Dell/EMC Isilon
Differential Revision: https://reviews.freebsd.org/D14839
jeff authored and jeff committed Mar 29, 2018
1 parent 9d420f4, commit 5e24432
Showing 14 changed files with 432 additions and 177 deletions.
@@ -118,6 +118,7 @@ MAN=	accept_filter.9 \
	disk.9 \
	dnv.9 \
	domain.9 \
	domainset.9 \
	dpcpu.9 \
	drbr.9 \
	driver.9 \
@@ -0,0 +1,128 @@
.\" Copyright (c) 2018 Jeffrey Roberson <[email protected]>
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\" 1. Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS''
.\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED
.\" TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
.\" PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE
.\" LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
.\" CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
.\" SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
.\" INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
.\" CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
.\" ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
.\" POSSIBILITY OF SUCH DAMAGE.
.\"
.\" $FreeBSD$
.\"
.Dd March 24, 2018
.Dt DOMAINSET 9
.Os
.Sh NAME
.Nm domainset ,
.Nm domainset_create ,
.Nm sysctl_handle_domainset
.Nd domainset functions and operation
.Sh SYNOPSIS
.In sys/_domainset.h
.In sys/domainset.h
.\"
.Bd -literal -offset indent
struct domainset {
	domainset_t	ds_mask;
	uint16_t	ds_policy;
	domainid_t	ds_prefer;
	...
};
.Ed
.Pp
.Ft struct domainset *
.Fn domainset_create "const struct domainset *key"
.Ft int
.Fn sysctl_handle_domainset "SYSCTL_HANDLER_ARGS"
.Sh DESCRIPTION | ||
The | ||
.Nm | ||
API provides memory domain allocation policy for NUMA machines. | ||
Each | ||
.Vt domainset | ||
contains a bitmask of allowed domains, an integer policy, and an optional | ||
preferred domain. | ||
Together, these specify a search order for memory allocations as well as | ||
the ability to restrict threads and objects to a subset of available | ||
memory domains for system partitioning and resource management. | ||
.Pp | ||
Every thread in the system and optionally every | ||
.Vt vm_object_t , | ||
which is used to represent files and other memory sources, has | ||
a reference to a | ||
.Vt struct domainset . | ||
The domainset associated with the object is consulted first and the system | ||
falls back to the thread policy if none exists. | ||
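The lookup order described above (object policy first, thread policy as the fallback) can be sketched in userland C. This is only a model of the rule: `struct object_model`, `struct thread_model`, `struct domainset_model`, and `effective_domainset()` are hypothetical names invented for illustration and are not part of the kernel API.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fallback rule is modeled. */
struct domainset_model {
	int policy;
};

struct object_model {
	const struct domainset_model *domainset;	/* NULL: object has no policy */
};

struct thread_model {
	const struct domainset_model *domainset;	/* every thread has one */
};

/* Return the object's domainset when it has one, otherwise the thread's. */
static const struct domainset_model *
effective_domainset(const struct object_model *obj,
    const struct thread_model *td)
{
	if (obj != NULL && obj->domainset != NULL)
		return (obj->domainset);
	return (td->domainset);
}
```

The sketch treats a missing object the same way as an object without a policy; whether the kernel ever performs the lookup without an object is not stated in the text above.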
.Pp
The allocation policy has the following possible values:
.Bl -tag -width "foo"
.It Dv DOMAINSET_POLICY_ROUNDROBIN
Memory is allocated from each domain in the mask in a round-robin fashion.
This distributes bandwidth evenly among available domains.
This policy can specify a single domain for a fixed allocation.
.It Dv DOMAINSET_POLICY_FIRSTTOUCH
Memory is allocated from the domain in which it is first accessed.
Allocation falls back to round-robin if the current domain is not in the
allowed set or is out of memory.
This policy optimizes for locality but may give pessimal results if the
memory is accessed from many CPUs that are not in the local domain.
.It Dv DOMAINSET_POLICY_PREFER
Memory is allocated from the domain named by the
.Va ds_prefer
member.
The preferred domain must be set in the allowed mask.
If the preferred domain is out of memory, the allocation falls back to
round-robin among the allowed domains.
.It Dv DOMAINSET_POLICY_INTERLEAVE
Memory is allocated in a striped fashion with multiple pages
allocated to each domain in the set according to the offset within
the object.
The stripe width is object dependent and may be as large as a
super-page (2MB on amd64).
This gives good distribution among memory domains while keeping system
efficiency higher and is preferable to round-robin for general use.
.El
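The difference between round-robin and interleave selection can be illustrated with a small userland model. It is a sketch under stated assumptions, not the kernel's implementation: the 32-bit mask, the helper names (`nth_domain`, `domain_count`, `pick_roundrobin`, `pick_interleave`), and the caller-supplied stripe width are all hypothetical.

```c
#include <stdint.h>

/* Return the index of the n-th set bit (0-based) in mask, or -1 if absent. */
static int
nth_domain(uint32_t mask, unsigned n)
{
	for (int d = 0; d < 32; d++) {
		if ((mask & (UINT32_C(1) << d)) == 0)
			continue;
		if (n-- == 0)
			return (d);
	}
	return (-1);
}

/* Count the domains set in mask. */
static unsigned
domain_count(uint32_t mask)
{
	unsigned c = 0;

	for (; mask != 0; mask &= mask - 1)
		c++;
	return (c);
}

/* Round-robin model: every allocation advances a cursor through the mask. */
static int
pick_roundrobin(uint32_t mask, unsigned *cursor)
{
	unsigned n = domain_count(mask);

	if (n == 0)
		return (-1);
	return (nth_domain(mask, (*cursor)++ % n));
}

/*
 * Interleave model: the domain depends only on the page offset within the
 * object, so each run of stripe_pages consecutive pages shares one domain.
 */
static int
pick_interleave(uint32_t mask, uint64_t pindex, uint64_t stripe_pages)
{
	unsigned n = domain_count(mask);

	if (n == 0 || stripe_pages == 0)
		return (-1);
	return (nth_domain(mask, (unsigned)((pindex / stripe_pages) % n)));
}
```

With a mask containing domains 1 and 3 and an assumed stripe of 512 pages (a 2MB super-page of 4KB pages on amd64), page offsets 0 through 511 map to domain 1 and offsets 512 through 1023 map to domain 3, whereas round-robin would alternate domains on every allocation.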
.Pp
The
.Fn domainset_create
function takes a partially filled-in domainset as a key and returns a
valid domainset or
.Dv NULL .
It is critical that consumers not use domainsets that have not been
returned by this function.
.Vt struct domainset
is an immutable type that is shared among all matching keys and must
not be modified after return.
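The key-in, shared-object-out contract described above resembles interning. The following userland sketch models that contract only; the fixed-size table, the `ds_model` names, the numeric value of `MODEL_POLICY_PREFER`, and the specific validity checks are assumptions made for illustration and do not reflect how the kernel stores or validates domainsets.

```c
#include <stddef.h>
#include <stdint.h>

#define	MODEL_POLICY_PREFER	3	/* hypothetical value, illustration only */
#define	MODEL_MAX_SETS		8	/* hypothetical table size */

struct ds_model {
	uint32_t mask;		/* allowed domains */
	uint16_t policy;	/* allocation policy */
	int prefer;		/* preferred domain, used by the prefer policy */
};

static struct ds_model ds_table[MODEL_MAX_SETS];
static size_t ds_count;

/*
 * Return the shared entry matching key, creating it on first use.
 * Returns NULL for an invalid key or when the model's table is full.
 */
static const struct ds_model *
ds_model_create(const struct ds_model *key)
{
	if (key->mask == 0)
		return (NULL);
	/* The preferred domain must be set in the allowed mask. */
	if (key->policy == MODEL_POLICY_PREFER &&
	    (key->prefer < 0 || key->prefer >= 32 ||
	    (key->mask & (UINT32_C(1) << key->prefer)) == 0))
		return (NULL);
	for (size_t i = 0; i < ds_count; i++) {
		if (ds_table[i].mask == key->mask &&
		    ds_table[i].policy == key->policy &&
		    ds_table[i].prefer == key->prefer)
			return (&ds_table[i]);
	}
	if (ds_count == MODEL_MAX_SETS)
		return (NULL);
	ds_table[ds_count] = *key;
	return (&ds_table[ds_count++]);
}
```

Returning a pointer to const mirrors the requirement that the result not be modified: two callers passing equal keys receive the same object, so a write through one pointer would silently change the policy seen by the other.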
.Pp
The
.Fn sysctl_handle_domainset
function is provided as a convenience for modifying or viewing domainsets
that are not accessible via
.Xr cpuset 2 .
It is intended for use with
.Xr sysctl 9 .
.Sh SEE ALSO
.Xr cpuset 1 ,
.Xr cpuset 2 ,
.Xr cpuset_setdomain 2 ,
.Xr bitset 9
.Sh HISTORY
.In sys/domainset.h
first appeared in
.Fx 12.0 .