NodeSwap feature supports in kubeadm #2563

Open
pacoxu opened this issue Sep 8, 2021 · 50 comments · Fixed by kubernetes/website#47710
Labels
kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

@pacoxu
Member

pacoxu commented Sep 8, 2021

Is this a BUG REPORT or FEATURE REQUEST?

FEATURE REQUEST

/kind feature

Versions

kubeadm version (use kubeadm version):
NodeSwap is alpha in 1.22 and will be beta1 in 1.28 (still disabled by default).

  • The promotion was postponed due to a lack of feedback and testing (I think also due to a lack of reviewers in sig-node).
  • Elana, the feature owner, is not currently active, so it was not included in the v1.25 plan.
  • v1.28: swap is supported for cgroup v2 only; the NodeSwap feature gate of the kubelet is beta but disabled by default.

What happened?

I tested NodeSwap on my nodes, and when I reinstalled my environment, I got an error related to swap.

	[ERROR Swap]: running with swap on is not supported. Please disable swap

I think it's time to start planning swap support on the kubeadm side.

What you expected to happen?

There should be NodeSwap support in kubeadm init, and we can skip the check if the feature gate is enabled.
Or, in 1.23, we should skip the preflight check by default, as the feature will be beta.
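
The requested behavior could look roughly like the sketch below. It is a Python illustration, not kubeadm's actual Go code: the function names, the `node_swap_gate` parameter, and the warning text are hypothetical, and it assumes swap is detected by looking for entries after the header line of /proc/swaps.

```python
from typing import List, Tuple

SWAP_ERROR = "running with swap on is not supported. Please disable swap"


def swap_is_active(proc_swaps_text: str) -> bool:
    """Return True when the /proc/swaps content lists a device after its header line."""
    lines = [line for line in proc_swaps_text.splitlines() if line.strip()]
    return len(lines) > 1


def swap_preflight(proc_swaps_text: str, node_swap_gate: bool) -> Tuple[List[str], List[str]]:
    """Return (warnings, errors) for a hypothetical swap preflight check."""
    warnings: List[str] = []
    errors: List[str] = []
    if not swap_is_active(proc_swaps_text):
        return warnings, errors
    if node_swap_gate:
        # Hypothetical wording: with the gate enabled, swap is reported but not fatal.
        warnings.append("swap is enabled; continuing because NodeSwap is enabled")
    else:
        errors.append(SWAP_ERROR)
    return warnings, errors


if __name__ == "__main__":
    with open("/proc/swaps") as f:
        print(swap_preflight(f.read(), node_swap_gate=True))
```

With the gate off, the sketch reproduces today's hard error; with the gate on, the same condition is only reported as a warning.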

How to reproduce it (as minimally and precisely as possible)?

Enable swap (swapon) and run kubeadm init.

Anything else we need to know?

More details in kubernetes/enhancements#2400

/assign

@k8s-ci-robot k8s-ci-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Sep 8, 2021
@pacoxu
Member Author

pacoxu commented Sep 8, 2021

/cc @ehashman

@neolit123
Member

neolit123 commented Sep 8, 2021 via email

@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Sep 8, 2021
@neolit123
Member

neolit123 commented Sep 8, 2021 via email

@neolit123 neolit123 modified the milestones: v1.23, v1.24 Sep 8, 2021
@pacoxu
Member Author

pacoxu commented Sep 8, 2021

Actually, since we support kubelet n-1 skew, it should probably be done one release after beta.

It makes sense. Hence, if it is beta in 1.23, kubeadm may add the support in 1.24+.

For users like me who want to try the alpha feature, is the swap-off preflight check too harsh? The workaround in 1.22 is to pass the ignore flag (--ignore-preflight-errors=Swap).

At the very least, the check should be removed in 1.23 when the feature is beta, in my opinion.

@pacoxu
Member Author

pacoxu commented Sep 8, 2021

Or we may change the check error to a warning message?

@neolit123
Member

neolit123 commented Sep 8, 2021 via email

@neolit123 neolit123 added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Sep 8, 2021
@neolit123 neolit123 modified the milestones: v1.24, v1.23 Sep 8, 2021
@neolit123
Member

looks like this is shifted to Beta for 1.24 due to some failures in CI and missing support in runtimes:
https://docs.google.com/document/d/1Ne57gvidMEWXR70OxxnRkYquAoMpt56o75oZtg-OeBg/edit#
(see notes for 26 Oct)

@neolit123 neolit123 modified the milestones: v1.23, v1.24 Oct 26, 2021
@pacoxu
Member Author

pacoxu commented Oct 29, 2021

😓

However, changing the swap check from an error to a warning is still valid.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 27, 2022
@neolit123
Member

/remove-lifecycle stale

kep seems tracked for beta in 1.24:
kubernetes/enhancements#2400

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 27, 2022
@neolit123
Member

update: looks like it was dropped from 1.24:
kubernetes/enhancements#2400 (comment)

@neolit123 neolit123 modified the milestones: v1.24, v1.25 Mar 15, 2022
@pacoxu
Member Author

pacoxu commented Apr 2, 2022

update: looks like it was dropped from 1.24:

Most PRs were ready early in the v1.24 cycle.
However, the e2e tests started passing too late for v1.24, and some related PRs are still in review.
Hopefully it can be beta in v1.25.

@pacoxu
Member Author

pacoxu commented Aug 1, 2022

No update in v1.25 for the swap feature, as Elana is out of office.

Sergey will take over the swap feature in later releases. No update in v1.26 so far.

Sergey added it to the v1.27 plan, and I will work on the cgroup v2 swap support.

@neolit123 neolit123 modified the milestones: v1.25, v1.26 Aug 25, 2022
@neolit123
Member

FWIW, this ticket is tracking the removal of the kubeadm preflight warning once NodeSwap becomes enabled by default.
leaving the decision to @pacoxu on whether the kubeadm warning should be updated for 1.30. yes, the wording can always be better, but from my understanding, if 1.30 enables the feature by default we will remove the preflight check entirely.

@pacoxu
Member Author

pacoxu commented Feb 8, 2024

One option is to return an error if the node uses cgroup v1 and has swap on. Would this be more ambiguous?
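
A minimal sketch of that proposal, in Python for illustration only (kubeadm itself is written in Go). The function name and the "ok"/"warning"/"error" return values are hypothetical. It assumes the cgroup version is inferred from the filesystem type reported by `stat -fc %T /sys/fs/cgroup/`, where `cgroup2fs` indicates cgroup v2, and it treats any other value as cgroup v1.

```python
def swap_check_level(cgroup_fs_type: str, swap_on: bool) -> str:
    """Classify a node for a hypothetical swap preflight check.

    cgroup_fs_type is the output of `stat -fc %T /sys/fs/cgroup/`.
    Returns "ok", "warning", or "error".
    """
    if not swap_on:
        return "ok"
    if cgroup_fs_type.strip() == "cgroup2fs":
        # cgroup v2 with swap on: swap can be supported, so only warn.
        return "warning"
    # Assumption: anything other than cgroup2fs is treated as cgroup v1.
    return "error"
```

Under this sketch, only the cgroup v1 plus swap combination blocks `kubeadm init` or `kubeadm join`; a cgroup v2 node with swap on proceeds with a warning.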

@devZer0

devZer0 commented Feb 8, 2024

yes, certainly.

but my system, which cannot init or join when swap is active, is on cgroup v2 (if i see this correctly), and i have also configured containerd appropriately

if i re-enable swap, drain the cluster node, reboot it and re-join, it reproducibly hangs at the stage "[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap..."

root@kube3:~# cat /etc/containerd/config.toml|grep SystemdCgroup
            SystemdCgroup = true
            
root@kube3:~# stat -fc %T /sys/fs/cgroup/
cgroup2fs

# uname -a
Linux kube3 6.1.0-17-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.69-1 (2023-12-30) x86_64 GNU/Linux

# cat /etc/debian_version
12.4

# mount|grep cg
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)

## /usr/sbin/execsnoop-bpfcc -n runc
PCOMM            PID     PPID    RET ARGS
runc             2320    2301      0 /usr/sbin/runc --root /run/containerd/runc/k8s.io --log /run/containerd/io.containerd.runtime.v2.task/k8s.io/959e05c2381225e3672196925b13ddb58531b0f58f6ade9c727fb96066e711a0/log.json --log-format json --systemd-cgroup create --bundle /run/containerd/io.containerd.runtime.v2.task/k8s.io/959e05c2381225e3672196925b13ddb58531b0f58f6ade9c727fb96066e711a0 --pid-file /run/containerd/io.containerd.runtime.v2.task/k8s.io/959e05c2381225e3672196925b13ddb58531b0f58f6ade9c727fb96066e711a0/init.pid 959e05c2381225e3672196925b13ddb58531b0f58f6ade9c727fb96066e711a0

@iholder101

Hey all!

Some clarifications regarding the current status of NodeSwap in k8s:

  • NodeSwap graduates to Beta2 in Kubernetes 1.30.
  • In Beta2, the NodeSwap feature gate is on by default. However:
    • --fail-swap-on=false (failSwapOn: false in the kubelet configuration) still needs to be provided to the kubelet.
    • The default "SwapBehavior" is NoSwap, which means containers do not have swap access.

IOW: in order to run k8s on a swap-enabled node, --fail-swap-on=false needs to be provided to the kubelet.
In order to actually give swap access to containers, the SwapBehavior needs to be set to LimitedSwap (which is currently the only swap behavior supported other than NoSwap).

Regarding cgroups:
Only cgroup v2 is supported for swap. cgroup v1 can be used with NoSwap, which explicitly sets swap limit as 0 at the cgroup level, but cannot be used with LimitedSwap (see kubernetes/kubernetes#123738).
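
The combinations described above can be summarized in a small decision function. This is a hedged Python sketch of the rules as stated in this comment, not kubelet code: the `Outcome` enum and `swap_outcome` function are hypothetical, the first argument mirrors the kubelet's `--fail-swap-on` flag (`failSwapOn` in the kubelet configuration), the second mirrors `memorySwap.swapBehavior`, and it assumes the node has swap enabled and the NodeSwap feature gate is on. What the kubelet actually does for LimitedSwap on cgroup v1 is not spelled out here, so the sketch only labels it unsupported.

```python
from enum import Enum


class Outcome(Enum):
    KUBELET_FAILS = "kubelet refuses to start on a swap-enabled node"
    NO_SWAP = "kubelet runs; containers get no swap"
    LIMITED_SWAP = "kubelet runs; containers may use limited swap"
    UNSUPPORTED = "combination is not supported"


def swap_outcome(fail_swap_on: bool, swap_behavior: str, cgroup_version: int) -> Outcome:
    """Map kubelet swap settings on a swap-enabled node to the expected outcome."""
    if fail_swap_on:
        return Outcome.KUBELET_FAILS
    if swap_behavior == "NoSwap":
        # Works on cgroup v1 and v2: the swap limit is set to 0 at the cgroup level.
        return Outcome.NO_SWAP
    if swap_behavior == "LimitedSwap":
        return Outcome.LIMITED_SWAP if cgroup_version == 2 else Outcome.UNSUPPORTED
    raise ValueError(f"unknown swap behavior: {swap_behavior!r}")
```

Containers only get swap in one case: `fail_swap_on=False`, `swap_behavior="LimitedSwap"`, and cgroup v2.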

IMO it's safe to remove the error and not even replace it with a warning, since to actually use swap the admin would need to explicitly change the swap behavior, even if --fail-swap-on=false is provided to the kubelet.

Please let me know if I can provide more information regarding this.

@neolit123
Member

IOW: in order to run k8s on a swap-enabled node, --fail-swap-on=false needs to be provided to the kubelet.

to avoid further complaints from kubeadm users and additional logged tickets, i think we should keep the preflight check until the kubelet config is updated to not fail on swap by default.

is there a plan for that?

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 4, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 3, 2024
@neolit123 neolit123 removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Aug 5, 2024
@neolit123 neolit123 modified the milestones: v1.31, v1.32 Aug 7, 2024
@iholder101

iholder101 commented Aug 27, 2024

IOW: in order to run k8s on a swap-enabled node, --fail-swap-on=false needs to be provided to the kubelet.

to avoid further complaints from kubeadm users and additional logged tickets, i think we should keep the preflight check until the kubelet config is updated to not fail on swap by default.

is there a plan for that?

Hey @neolit123! As written here, the summary is:

  • NodeSwap graduates to Beta2 in Kubernetes 1.30.
  • In Beta2, the NodeSwap feature gate is on by default. However:
    • --fail-swap-on=false (failSwapOn: false in the kubelet configuration) still needs to be provided to the kubelet.
    • The default "SwapBehavior" is NoSwap, which means containers do not have swap access.

So --fail-swap-on=false is still necessary (and that's not going to change until swap goes GA), but the default behavior is NoSwap, which means swap is inaccessible to k8s workloads by default.

Can we make sure that the installation docs here https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/ are changed so that swap doesn't have to be turned off with kubeadm?

@neolit123
Member

@iholder101 would you be able to help us by PR-ing the docs?

@iholder101

@iholder101 would you be able to help us by PR-ing the docs?

Yeah I'd love to. I'll get to it shortly.

BTW, is there anything else required besides changing the docs?

@iholder101

On second look, I see that @pacoxu already updated the docs here: kubernetes/website#42820.

@neolit123 @pacoxu So, is there anything missing?

@neolit123
Member

leaving this to @pacoxu to answer.
the state of the NodeSwap FG is still confusing to me, so i hope we are clear about it in the docs and the preflight checks.

@pacoxu
Member Author

pacoxu commented Aug 27, 2024

My update to the website was too general at that time.

We should probably make it clearer how to enable swap and use it on the kubelet side.

In Beta2, the NodeSwap feature gate is on by default. However:

  • --fail-swap-on=false (failSwapOn: false in the kubelet configuration) still needs to be provided to the kubelet.
  • The default "SwapBehavior" is NoSwap, which means containers do not have swap access.

This should be mentioned, or we can link to a page that explains the kubelet's swap configuration in detail, including failSwapOn and swapBehavior, and even the system-reserved support.
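
As an illustration of the settings such documentation would need to cover, the Python sketch below builds a minimal KubeletConfiguration with swap tolerated. The helper name `kubelet_swap_config` is hypothetical; the `failSwapOn` and `memorySwap.swapBehavior` fields are from the `kubelet.config.k8s.io/v1beta1` API. The output is JSON, which is also valid YAML, and any other fields a real cluster needs are left out on purpose.

```python
import json

ALLOWED_BEHAVIORS = ("NoSwap", "LimitedSwap")


def kubelet_swap_config(swap_behavior: str = "NoSwap") -> dict:
    """Build a minimal KubeletConfiguration that lets the kubelet start with swap on."""
    if swap_behavior not in ALLOWED_BEHAVIORS:
        raise ValueError(f"swap_behavior must be one of {ALLOWED_BEHAVIORS}")
    return {
        "apiVersion": "kubelet.config.k8s.io/v1beta1",
        "kind": "KubeletConfiguration",
        # Do not fail kubelet startup when the node has swap enabled.
        "failSwapOn": False,
        # NoSwap keeps containers off swap; LimitedSwap gives them limited access.
        "memorySwap": {"swapBehavior": swap_behavior},
    }


if __name__ == "__main__":
    print(json.dumps(kubelet_swap_config("LimitedSwap"), indent=2))
```

Keeping the default at `NoSwap` matches the Beta2 behavior quoted above: the kubelet starts on a swap-enabled node, but workloads do not get swap unless the admin opts in.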

@neolit123
Member

neolit123 commented Sep 3, 2024

@iholder101 @pacoxu
should kubernetes/website#47710 have closed this k/kubeadm issue, or do we need to keep it open for longer?

@pacoxu
Member Author

pacoxu commented Sep 4, 2024

/reopen
IIUC, we still need to remove the current preflight check warning in the future.

@k8s-ci-robot
Contributor

@pacoxu: Reopened this issue.


@k8s-ci-robot k8s-ci-robot reopened this Sep 4, 2024
@neolit123 neolit123 modified the milestones: v1.32, v1.33 Nov 27, 2024
8 participants