Network binding plugin: Support compute container resource overhead #303
@@ -224,6 +224,56 @@ pod.
- Mount points
- Downward API

##### Compute Container Resource Overhead

For some plugins, additional resource consumption can be expected from the virt-launcher pod's compute container.
For example, a plugin may need to execute an additional binary in the compute container (as in the passt use case, where the passt binary is shipped as part of the virt-launcher image).
Since this binary has its own CPU and memory requirements, they should be accounted for.

Another example could be increased resource consumption of the virt-stack resulting from using the plugin.

Suggested solution:

Additional API for compute container resource overhead:

The network binding plugin API in the KubeVirt CR could receive an additional input field to specify the resource requirements overhead for the compute container:

```yaml
apiVersion: kubevirt.io/v1
kind: KubeVirt
metadata:
  name: kubevirt
  namespace: kubevirt
spec:
  configuration:
    network:
      binding:
        mynetbindingplugin:
          sidecarImage: quay.io/kubevirt/mynetbindingplugin
          computeResourceOverhead:
            requests:
              memory: 200Mi
```

If specified, KubeVirt will add the resource overhead specified in `computeResourceOverhead` to the compute container of the virt-launcher pod.
The specified resource overhead will be added to the compute container once per unique plugin used (not per NIC using the binding).

For example, assuming there is a plugin registered with a 200Mi memory overhead for the `compute` container, and there are two interfaces using it, `virt-controller` will only add 200Mi of memory to the `compute` container.
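
For illustration, a minimal Go sketch of this per-unique-plugin accounting. The type and field names (`BindingPlugin`, `ComputeResourceOverhead`) are hypothetical stand-ins, not the actual KubeVirt API:

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

// BindingPlugin mirrors the proposed KubeVirt CR entry; field names are illustrative.
type BindingPlugin struct {
	SidecarImage            string
	ComputeResourceOverhead map[string]resource.Quantity // e.g. "memory" -> 200Mi
}

// computeOverhead sums the overhead of each *unique* plugin referenced by the
// VMI's interfaces, so two NICs using the same plugin are counted once.
func computeOverhead(ifacePlugins []string, registered map[string]BindingPlugin) map[string]resource.Quantity {
	total := map[string]resource.Quantity{}
	seen := map[string]bool{}
	for _, name := range ifacePlugins {
		if seen[name] {
			continue // this plugin's overhead was already accounted for
		}
		seen[name] = true
		for res, qty := range registered[name].ComputeResourceOverhead {
			sum := total[res]
			sum.Add(qty)
			total[res] = sum
		}
	}
	return total
}

func main() {
	registered := map[string]BindingPlugin{
		"mynetbindingplugin": {
			SidecarImage: "quay.io/kubevirt/mynetbindingplugin",
			ComputeResourceOverhead: map[string]resource.Quantity{
				"memory": resource.MustParse("200Mi"),
			},
		},
	}
	// Two interfaces use the same plugin; the overhead is added only once.
	overhead := computeOverhead([]string{"mynetbindingplugin", "mynetbindingplugin"}, registered)
	mem := overhead["memory"]
	fmt.Println(mem.String()) // prints "200Mi"
}
```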

Pros:
- Cluster-wide definition of compute resource overhead per plugin.
- Additional resources could be requested other than CPU and memory.
- The resource overhead specification is visible to cluster admins.

Cons:
- Requires an API change.
- When upgrading KubeVirt / network binding plugin versions, the compute resource overhead specification might require adjustments.

This solution was selected since it gives the cluster admin more control over resource allocation.

For the alternative solutions, please see [Appendix G](#appendix-g-alternatives-to-compute-container-resource-overhead-specification).

#### Configure Pod netns

The CNI plugin has privileged access to the pod network namespace and
@@ -1162,4 +1212,68 @@ Libvirt configuration snippet:
</interface>
```

- The domain itself is created by the `virt-launcher`.

# Appendix G: Alternatives to compute container resource overhead specification

1. Manually setting the VM's resources:

Users could manually provide the additional memory overhead for the network binding plugin, under `spec.domain.resources.requests.memory`:

```yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: vm-cirros
spec:
  template:
    spec:
      domain:
        memory:
          guest: 128Mi
        resources:
          requests:
            memory: 640Mi # 128Mi for the guest + 512Mi for the network binding plugin
```

KubeVirt will create the virt-launcher pod's compute container with a memory request comprising:
- The guest's memory
- The memory overhead for KubeVirt's components (calculated by virt-controller)
- The memory overhead for the network binding plugin
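
For example, with the manifest above, if virt-controller were to compute a 230Mi overhead for KubeVirt's components (a purely illustrative figure), the compute container would request 640Mi + 230Mi = 870Mi of memory.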

Pros:
- Already implemented.

Cons:
- Error prone.
- Defined per VM and not cluster-wide.
- Exposes the VM owner to unnecessary implementation details.

2. Mutating webhook

For each unique network binding plugin used, the VMI controller will add a label on the virt-launcher pod with the following format:

`kubevirt.io/network-binding-plugin:<plugin-name>`

The binding plugin authors will provide a [mutating webhook](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#mutatingadmissionwebhook) that will intercept the creation of virt-launcher pods that have the above label and add the appropriate resource requests/limits to the pod's `compute` container.

The mutating webhook will be able to identify the pod's `compute` container by its name (`compute`) or by the value of the `kubectl.kubernetes.io/default-container` annotation.
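
A minimal sketch of how such a webhook could be registered, using the label above as an object selector. The plugin name, service, namespace, and path are illustrative assumptions, not part of the proposal:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: mynetbindingplugin-compute-overhead
webhooks:
  - name: compute-overhead.mynetbindingplugin.example.com
    admissionReviewVersions: ["v1"]
    sideEffects: None
    rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["pods"]
    # Only intercept virt-launcher pods labeled by the VMI controller
    # as using this specific plugin.
    objectSelector:
      matchLabels:
        kubevirt.io/network-binding-plugin: mynetbindingplugin
    clientConfig:
      service:
        namespace: mynetbindingplugin-system
        name: compute-overhead-webhook
        path: /mutate-pods
```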

Pros:
- Plugin authors have full control over the compute resources.
- Additional API is not added to KubeVirt.
- Opens the door for additional changes to the virt-launcher pod without changes to KubeVirt.
- Code changes in KubeVirt are very small.

Cons:
- Plugin authors should provide another component and integrate it.
- Additional point of failure.
- Requires maintaining certificates for the webhook.
- Additional latency when creating VMs with network binding plugins.
- The additional resource specification is less visible to cluster admins.
- Resource specification could collide with the support container resources specified in the KubeVirt CR or with other webhooks.

The requirement to maintain certificates for the webhook could be mitigated using tools such as [cert-manager](https://cert-manager.io/).
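
For instance, cert-manager's CA injector can keep the webhook configuration's `caBundle` in sync via an annotation; the certificate namespace/name below is illustrative:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: mynetbindingplugin-compute-overhead
  annotations:
    # cert-manager's cainjector populates the webhooks' caBundle from
    # this Certificate resource (namespace/name are illustrative).
    cert-manager.io/inject-ca-from: mynetbindingplugin-system/compute-overhead-cert
```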