New 1.31 release #830

bschimke95 · 2024-11-22T09:05:29Z

Summary

Merges the current main with the release-1.31 commit on the 1.31 branch.

Notable changes

Cilium `socketLB` and `apiserver-proxy` change

We've configured Cilium to talk to the local apiserver or apiserver-proxy through the localhost address instead of the kube-proxy-provided service, which addresses the issue we faced with socketLB. #775 introduced a new way to determine the localhost address, which provides a smooth upgrade path. No action is needed.

LoadBalancer change from Cilium to MetalLB

We've changed the load-balancer implementation from Cilium to MetalLB. LoadBalancer services will experience downtime or interruption while the upgrade is in progress. Follow the specific upgrade steps for this change exactly; skipping or misapplying them will cause issues and conflicts.

Feature components version upgrade

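The versions of the underlying feature implementations have been upgraded, including Cilium 1.16.3 (#803), metrics-server 0.7.2 with chart 3.12.2 (#804), MetalLB 0.14.8 (#805), and CoreDNS 1.11.3 with chart 1.36.0 (#806).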

Containerd Default Path Change

This change does not affect existing nodes; new nodes start using the new designated path. Older nodes might need their args files adjusted by hand.

At the moment, the "k8sd cluster-recover" displays interactive prompts and text editors that assist the user in updating the dqlite configuration. We need to be able to run the command non-interactively in order to automate the cluster recovery procedure. This change adds a "--non-interactive" flag. If set, we'll no longer show confirmation prompts and we'll assume that the configuration files have already been updated, proceeding with the dqlite recovery.
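
As a rough illustration of how an automation script might drive the recovery once the flag exists, here is a minimal Python sketch. Only the `k8sd cluster-recover` command and the `--non-interactive` flag come from the description above; the helper names are hypothetical, and a real invocation may need further arguments or elevated privileges that are not shown here.

```python
import subprocess


def recover_command(non_interactive: bool = True) -> list[str]:
    """Build the argv for the recovery command named in this change."""
    cmd = ["k8sd", "cluster-recover"]
    if non_interactive:
        # Skips confirmation prompts and assumes the dqlite configuration
        # files have already been updated by the caller.
        cmd.append("--non-interactive")
    return cmd


def run_recovery(runner=subprocess.run) -> bool:
    """Run the recovery without prompts; True when the command exits with 0."""
    result = runner(recover_command(), check=False)
    return result.returncode == 0
```

The `runner` parameter is only there so the sketch can be exercised without a real cluster; in a script it would stay at its `subprocess.run` default.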

* Automerge every 4-hours any PR with passing tests labeled with 'automerge'
* Make sure the bot can approve the PRs too
* Update Bot information only if git email currently unset
* consistently use private key secret to setup ssh git-remote
* Rename secret to BOT_SSH_KEY
* Reimagine auto-merge scripts as python

* Return non-zero exit code in case of errors

  At the moment, k8s and k8sd return 0 even if the command fails, which is a problem especially when used inside scripts. We'll ensure that a non-zero exit code is returned if the commands fail.

* Update the cluster recovery command to use cobra "Run"

  The cluster recovery command currently uses "RunE" and returns an error in case of failures. To stay consistent with other commands, we'll use "Run" and call env.Exit as part of the command callback instead of returning the errors.
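
The pattern described above, where the command callback reports the failure itself and then exits with a non-zero status instead of returning the error to the framework, can be sketched as follows. This is a Python analogy for illustration only, not the Go/cobra code in k8sd; `run` and `exit_fn` are hypothetical names, with `exit_fn` standing in for the `env.Exit` call mentioned in the commit message.

```python
import sys


def run(action, exit_fn=sys.exit, stderr=sys.stderr):
    """Mimic a "Run"-style callback: report the error here, then exit non-zero."""
    try:
        action()
    except Exception as exc:
        print(f"Error: {exc}", file=stderr)
        # A non-zero status lets calling scripts detect the failure.
        exit_fn(1)
```

When `action` succeeds, `run` returns normally and the process ends with its usual status of 0.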

* fix: ensure nf_conntrack module loaded for kube-proxy.

  This patch ensures that the `nf_conntrack` kernel module is loaded before the `kube-proxy` service is started so it can read some necessary conntrack module-related params from procfs.

  Previously, although the `kube-proxy` service always crashed if the module wasn't loaded, this wasn't that common of an occurrence in practice as there are quite a few ways `nf_conntrack` gets loaded transparently:

  * Cilium [automatically loads `iptable_nat`](https://github.com/cilium/cilium/blob/63cd391f93b4e2c865268241d384504348672042/pkg/datapath/iptables/iptables.go#L367-L368) after a small startup delay, whose dependency tree includes `nf_conntrack`
  * starting firewalld/ufw/most other firewall services
  * setting iptables/nftables rules which imply session tracking

  By explicitly loading `nf_conntrack` from the `kube-proxy` service wrapper directly, it should ensure the procfs values kube-proxy reads are always present on startup.

  Signed-off-by: Nashwan Azhari <[email protected]>

* ci: install nf_conntrack module in integration test base LXC image.

  Signed-off-by: Nashwan Azhari <[email protected]>

---------

Signed-off-by: Nashwan Azhari <[email protected]>
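
To make the "load the module before starting the service" idea concrete, here is a minimal Python sketch of the check-then-load logic. It is not the actual service wrapper from this patch. It assumes that `/proc/modules` is readable, that `modprobe` is on the PATH, and that the caller has enough privileges; `module_loaded` and `ensure_module` are hypothetical helper names.

```python
import subprocess
from pathlib import Path


def module_loaded(name: str, proc_modules: str) -> bool:
    """Return True if `name` is listed in the given /proc/modules text."""
    return any(
        line.split()[0] == name
        for line in proc_modules.splitlines()
        if line.strip()
    )


def ensure_module(name: str = "nf_conntrack", runner=subprocess.run) -> None:
    """Load `name` via modprobe unless /proc/modules already lists it."""
    listed = Path("/proc/modules").read_text()
    if not module_loaded(name, listed):
        # Assumption: modprobe is available and we run with root privileges.
        runner(["modprobe", name], check=True)
```

Note that modules compiled into the kernel do not appear in `/proc/modules`, so a wrapper could also call `modprobe` unconditionally, since it is harmless when the module is already present.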

We need to properly clean up the containerd path on snap removal. For that, the path needs to be stored in a file. This serves two purposes:

1. The existence of the file indicates that the cluster was already bootstrapped and that the containerd directory was not created by some other service.
2. The containerd path is configurable, so having this information in a file makes it easy to access even after the k8sd service has already stopped.
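
The two purposes above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the snap's actual hook code: the marker file location, its one-line format, and both function names are hypothetical.

```python
import shutil
from pathlib import Path


def record_containerd_path(marker: Path, containerd_dir: Path) -> None:
    """At bootstrap: remember which containerd directory this snap created."""
    marker.write_text(str(containerd_dir))


def cleanup_containerd(marker: Path) -> bool:
    """On removal: delete the recorded directory; do nothing without a marker."""
    if not marker.exists():
        # No marker means the cluster was never bootstrapped, so any
        # containerd directory on the host belongs to some other service.
        return False
    recorded = marker.read_text().strip()
    if not recorded:
        # Refuse to act on an empty marker rather than guess a path.
        return False
    # rmtree removes the directory and everything below it.
    shutil.rmtree(Path(recorded), ignore_errors=True)
    marker.unlink()
    return True
```

Because the path is read back from the file, the cleanup works without querying k8sd, which may already be stopped at removal time.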

Currently, when removing the snap, the /var/run/containerd folder is not properly removed because it is a folder. This change fixes that issue and additionally removes the other containerd-related folders: /etc/containerd and /var/lib/containerd. We're also removing /opt/cni/bin on snap removal, which is created when bootstrapping the node. As we're removing the k8s snap, we no longer need this folder either.

ktsakalozos

LGTM +1

bschimke95 · 2024-11-22T14:01:03Z

The errors are due to CI issues; the test passes locally. Merging...

addyess and others added 30 commits September 13, 2024 18:35

Auto-update components in release-1.31 branch (#668)

de53534

use lxd 5.21/stable snap (#670)

72808cd

Add unit tests for local storage (#665)

522a161

[main] Update component versions (#674)

407f739

Co-authored-by: addyess <[email protected]>

Add epa explanation docs (#595)

4d5f406

--------- Co-authored-by: Yanisa Haley Scherber <[email protected]>

Update the issue template for creating release branches (#677)

22f04c6

Warnings that k8s service may not work (#657)

969662f

Warnings that k8s service may not work (#657) KU-1475

Epa howto (#658)

1a993cc

* Add epa-howto

Co-authored-by: Louise K. Schmidtgen <[email protected]>
Co-authored-by: Yanisa Haley Scherber <[email protected]>

Add IPv6-only support for moonray (#664)

e8475bf

Correct microcluster schema migration order (#676)

87eb341

Add how-to for capi in place upgrades (#671)

6ee8863

* Add how-to for capi in place upgrades
* Addressing comments
* Linting fixes
* Update docs/src/capi/howto/in-place-upgrades.md

---------

Co-authored-by: Nick Veitch <[email protected]>

let all integration test run (#682)

de6fb4f

Create more tests on branches and recipes (#679)

e4dadd1

* Create more tests on branches and recipes
* Apply review comments

Do not stop Kubernetes services on node removal if annotation is set. (#681)

27c91c8

Add unit tests for coredns (#684)

cfa7f99

* Add unit tests for coredns KU-1515

Add certificate expiry endpoint (#683)

d189816

Skip Go/K8s test suite when docs are changed (#685)

5fe3c27

Ignore part of cluster check (#688)

6b15893

Unit tests for Metrics Server k8sd feature (#691)

02f369b

The Metrics Server feature lacks unit tests; this PR implements tests for Metrics Server functionality. KU-1515

Add version upgrade tests (#678)

6ce90fa

--------- Co-authored-by: Adam Dyess <[email protected]>

Use map of struct instead of bool (#693)

a2c00fb

Restoring the Microcluster Schema Migration History (#689)

c74b9f5

Add IPv6 unittests for cluster setup (#698)

d3c4a36

Point cilium to talk to the local apiserver or apiserver-proxy (#697)

9e0a43c

fix unittests after rebase (#703)

1a7bafd

Add CAPI endpoints for Certificates Refresh (#699)

2b9d49a

Refactor Certificates Refresh endpoints to flush the response early and restart the services asynchronously

Update dualstack.md (#706)

4215adc

* Update dualstack.md

  We have determined that /108 is the maximum supported size. Cluster fails to bootstrap with /64 and /96.
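
A quick way to check a candidate range against this limit is sketched below. The /108 bound comes from the commit message above; which CIDR setting it applies to is not stated there, so the function is written generically, and `ipv6_cidr_supported` is a hypothetical helper rather than part of k8s or k8sd.

```python
import ipaddress

# /108 is the largest range reported to work, so the prefix must be 108 or longer.
MIN_IPV6_PREFIX_LEN = 108


def ipv6_cidr_supported(cidr: str) -> bool:
    """Return True if the IPv6 CIDR is no larger than a /108."""
    net = ipaddress.ip_network(cidr, strict=False)
    if net.version != 6:
        raise ValueError(f"{cidr} is not an IPv6 CIDR")
    return net.prefixlen >= MIN_IPV6_PREFIX_LEN
```

Under this check, `fd98::/108` and smaller ranges such as `fd98::/112` pass, while `fd98::/64` and `fd98::/96` are rejected, matching the bootstrap failures noted above.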

eaudetcobello and others added 19 commits November 15, 2024 17:00

Revert "Fix formatting and update documentation after merge" (#808)

47810ae

revert spelling change (#809)

eb63322

Add registry mirrors, preload snapd and core20 (#799)

19b9957

--------- Co-authored-by: Lucian Petrut <[email protected]>

Update Cilium to 1.16.3 (#803)

00ea902

--------- Co-authored-by: Benjamin Schimke <[email protected]>

Update metrics-server to 0.7.2 and chart to 3.12.2 (#804)

c17d3cf

Update metallb version to 0.14.8 (#805)

db581ac

Update cilium version in sync-images.yaml (#812)

00236be

Update coredns to 1.11.3 and coredns chart to 1.36.0 (#806)

8d20f34

--------- Co-authored-by: Benjamin Schimke <[email protected]>

KU-2068 reformatted annotations table (#789)

aa6c32a

* reformatted annotations table due to formatting issues

  The annotations table was quite unclear. Edited to make it more readable.

Co-authored-by: Nick Veitch <[email protected]>

Add permission token at topLevel in workflows (#816)

7314175

tutorials review (#814)

dd2cd3b

Reviewed the tutorials and edited them for clarity; fixed Markdown linter issues and formatting.

Use new annotation path (#821)

2f1c021

apiv1 is deprecated for annotations. `apiv1_annotations` is the recommended package now.

Only log worker marker file error if exists (#820)

bd42a49

Remove obsolete sync-images scripts (#818)

29feb03

We now use our own rocks everywhere, so syncing is no longer required.

Add security.md with policy (#822)

c648d82

[main] Update component versions (#825)

7f732c9

bschimke95 requested a review from a team as a code owner November 22, 2024 09:05

addyess and others added 3 commits November 22, 2024 10:20

Release 1.31 (#667)

5ff2498

create working docs for 1.31 (#800)

8331f29

Docs refresh 1.31 (#811)

06c3294

* update content for 1.31
* bump install command

bschimke95 force-pushed the new-1.31-release branch from 656437b to 06c3294 Compare November 22, 2024 10:37

bschimke95 added 2 commits November 22, 2024 12:03

fix bootstrap docs diff

e5be6e8

Merge branch 'release-1.31' into new-1.31-release

8f2b082

ktsakalozos approved these changes Nov 22, 2024

bschimke95 merged commit 2d7e190 into release-1.31 Nov 22, 2024
15 of 18 checks passed

bschimke95 deleted the new-1.31-release branch November 22, 2024 14:01
