New 1.31 release #830

Merged
merged 127 commits into from
Nov 22, 2024

bschimke95
Contributor

Summary

Merges the current main with the release-1.31 commit on the 1.31 branch.

Notable changes

Cilium socketLB and apiserver-proxy change

We've configured Cilium to talk to the localhost address instead of the kube-proxy provided service, to address the issue we faced with socketLB. #775 introduced a new way to determine the localhost address, which provides a smooth upgrade path. No action is needed.

LoadBalancer change from Cilium to MetalLB

We've changed the load-balancer implementation from Cilium to MetalLB. LoadBalancer services will see downtime or interruption while the upgrade is in progress. It is necessary to follow the specific upgrade steps; not following them properly will introduce issues and conflicts.

Feature components version upgrade

The versions for underlying feature implementations have been upgraded.

Containerd Default Path Change

Existing nodes are not affected; new nodes start using the new designated path. Older nodes might need their args files adjusted by hand.

addyess and others added 30 commits September 13, 2024 18:35
At the moment, the "k8sd cluster-recover" displays interactive
prompts and text editors that assist the user in updating the dqlite
configuration.

We need to be able to run the command non-interactively in order
to automate the cluster recovery procedure.

This change adds a "--non-interactive" flag. If set, we'll no longer
show confirmation prompts and we'll assume that the configuration
files have already been updated, proceeding with the dqlite recovery.
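
The behaviour described above can be illustrated with a minimal Python sketch. It is not the k8sd implementation (which is not shown in this PR page); the parser name, the `recover` function, and the prompt text are hypothetical, and only the `--non-interactive` flag name comes from the commit message.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser for a hypothetical recovery command (not the real k8sd CLI)."""
    parser = argparse.ArgumentParser(prog="cluster-recover-sketch")
    parser.add_argument(
        "--non-interactive",
        action="store_true",
        help="skip prompts and assume the config files were already updated",
    )
    return parser


def recover(non_interactive: bool, confirm=input) -> str:
    """Return the action taken; `confirm` stands in for an interactive prompt."""
    if not non_interactive:
        answer = confirm("Proceed with dqlite recovery? [y/N] ")
        if answer.strip().lower() != "y":
            return "aborted"
    # In non-interactive mode no prompt is shown and recovery proceeds directly.
    return "recovered"
```

With the flag set, `recover(True)` never calls the prompt, which is what makes the command usable from automation.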
---------

Co-authored-by: Yanisa Haley Scherber <[email protected]>
* Automerge every 4 hours any PR with passing tests that is labeled 'automerge'
* Make sure the bot can approve the PRs too
* Update Bot information only if git email currently unset
* consistently use private key secret to setup ssh git-remote
* Rename secret to BOT_SSH_KEY
* Reimagine auto-merge scripts as python
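
As a rough illustration of the selection rule in the list above (passing tests and an 'automerge' label), here is a small Python sketch. The `PullRequest` type and `automerge_candidates` function are hypothetical and do not reflect the actual scripts in the repository.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequest:
    """Hypothetical, simplified view of a pull request."""
    number: int
    labels: set[str] = field(default_factory=set)
    tests_passing: bool = False


def automerge_candidates(prs: list[PullRequest]) -> list[int]:
    """Return the numbers of PRs that are labeled 'automerge' and have passing tests."""
    return [
        pr.number
        for pr in prs
        if "automerge" in pr.labels and pr.tests_passing
    ]
```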
Warnings that k8s service may not work (#657)
KU-1475
* Add epa-howto

Co-authored-by: Louise K. Schmidtgen <[email protected]>
Co-authored-by: Yanisa Haley Scherber <[email protected]>
* Add how-to for capi in place upgrades

* Addressing comments

* Linting fixes

* Update docs/src/capi/howto/in-place-upgrades.md

---------

Co-authored-by: Nick Veitch <[email protected]>
* Create more tests on branches and recipes

* Apply review comments
* Add unit tests for coredns
KU-1515
The Metrics Server feature lacks unit tests; this PR implements tests for Metrics Server functionality.
KU-1515
---------

Co-authored-by: Adam Dyess <[email protected]>
* Return non-zero exit code in case of errors

At the moment, k8s and k8sd return 0 even if the command fails,
which is a problem especially when used inside scripts.

We'll ensure that a non-zero exit code is returned if the commands
fail.
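
To show why the exit code matters for scripts, here is a minimal Python sketch of a wrapper that stops as soon as a command exits with a non-zero status. `run_or_fail` and `FAILING_CMD` are hypothetical names; the failing command is a stand-in child process, not an actual `k8s` or `k8sd` invocation.

```python
import subprocess
import sys


def run_or_fail(cmd: list[str]) -> str:
    """Run a command; raise if it exits non-zero, otherwise return its stdout."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"{cmd[0]} exited with status {result.returncode}")
    return result.stdout


# Stand-in for a failing CLI call: a child Python process that exits with status 3.
FAILING_CMD = [sys.executable, "-c", "import sys; sys.exit(3)"]
```

A wrapper like this can only detect a failure if the command reports it through its exit status, which is what this change ensures.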

* Update the cluster recovery command to use cobra "Run"

The cluster recovery command currently uses "RunE" and returns an
error in case of failures.

To stay consistent with other commands, we'll use "Run" and call
env.Exit as part of the command callback instead of returning the
errors.
Refactor Certificates Refresh endpoints to flush the response early and restart the services asynchronously
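
The "respond first, restart afterwards" ordering can be sketched in Python as follows. This only illustrates the general pattern under assumed names (`handle_refresh`, `send_response`, `restart_services`); the real endpoints are not shown on this page.

```python
import threading
from typing import Callable


def handle_refresh(
    send_response: Callable[[dict], None],
    restart_services: Callable[[], None],
) -> threading.Thread:
    """Send the response before starting the restart in a background thread."""
    # The caller gets its answer first...
    send_response({"status": "accepted"})
    # ...and only then is the (potentially slow) restart started asynchronously.
    worker = threading.Thread(target=restart_services)
    worker.start()
    return worker
```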
* Update dualstack.md

We have determined that /108 is the maximum supported size. The cluster fails to bootstrap with /64 and /96.
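
A quick way to check a range against this limit is to compare prefix lengths, as in the sketch below. It assumes the limit applies to the IPv6 CIDR being configured and leaves IPv4 ranges unconstrained; the function and constant names are hypothetical.

```python
import ipaddress

# From the note above: /108 is the largest IPv6 range that bootstraps.
MIN_IPV6_PREFIX_LEN = 108


def ipv6_cidr_size_ok(cidr: str) -> bool:
    """Return True if an IPv6 range is no larger than /108 (IPv4 is not checked)."""
    net = ipaddress.ip_network(cidr, strict=False)
    if net.version != 6:
        return True  # this sketch only constrains IPv6 ranges
    return net.prefixlen >= MIN_IPV6_PREFIX_LEN
```

For example, `ipv6_cidr_size_ok("fd98::/108")` passes, while the `/64` and `/96` variants are rejected.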
eaudetcobello and others added 19 commits November 15, 2024 17:00
* fix: ensure nf_conntrack module loaded for kube-proxy.

This patch ensures that the `nf_conntrack` kernel module is loaded
before the `kube-proxy` service is started so it can read some
necessary conntrack module-related params from procfs.

Previously, although the `kube-proxy` service always crashed if the module
wasn't loaded, this wasn't that common of an occurrence in practice as
there are quite a few ways `nf_conntrack` gets loaded transparently:
* Cilium [automatically loads `iptable_nat`](https://github.com/cilium/cilium/blob/63cd391f93b4e2c865268241d384504348672042/pkg/datapath/iptables/iptables.go#L367-L368)
after a small startup delay, whose dependency tree includes `nf_conntrack`
* starting firewalld/ufw/most other firewall services
* setting iptables/nftables rules which imply session tracking

By explicitly loading `nf_conntrack` from the `kube-proxy` service
wrapper directly, it should ensure the procfs values kube-proxy reads
are always present on startup.
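
To check by hand whether the module is loaded on a node, one option is to look for it in `/proc/modules`. The Python sketch below only checks and does not load anything, and it is not the wrapper code from this patch; the function names are hypothetical. Note that a module compiled into the kernel would not be listed in `/proc/modules`.

```python
from pathlib import Path


def module_loaded(name: str, proc_modules: str) -> bool:
    """Return True if `name` is listed as a module in /proc/modules-style text."""
    for line in proc_modules.splitlines():
        fields = line.split()
        # The module name is the first whitespace-separated field on each line.
        if fields and fields[0] == name:
            return True
    return False


def nf_conntrack_loaded(path: str = "/proc/modules") -> bool:
    """Check the given modules file (default: /proc/modules) for nf_conntrack."""
    p = Path(path)
    return p.exists() and module_loaded("nf_conntrack", p.read_text())
```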

Signed-off-by: Nashwan Azhari <[email protected]>

* ci: install nf_conntrack module in integration test base LXC image.

Signed-off-by: Nashwan Azhari <[email protected]>

---------

Signed-off-by: Nashwan Azhari <[email protected]>
---------

Co-authored-by: Benjamin Schimke <[email protected]>
We need to properly clean up the containerd path on snap removal.
For that, the path needs to be stored in a file.
This serves two purposes:
1. The existence of the file indicates that the cluster was already bootstrapped
   and the containerd directory is not created by some other service.
2. The containerd path is configurable, having this information in a file makes it easy
   to access even after the k8sd service is already stopped.
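
The two purposes listed above can be sketched in Python as follows. This is an illustration under assumed names (`record_containerd_path`, `cleanup_on_removal`, and the marker file name are hypothetical), and the usage runs against a temporary directory rather than real system paths.

```python
import shutil
import tempfile
from pathlib import Path


def record_containerd_path(marker: Path, containerd_dir: Path) -> None:
    """At bootstrap: remember which containerd directory belongs to this cluster."""
    marker.write_text(str(containerd_dir))


def cleanup_on_removal(marker: Path) -> bool:
    """On removal: delete the recorded directory only if the marker file exists."""
    if not marker.exists():
        # Never bootstrapped here; any containerd directory belongs to something else.
        return False
    containerd_dir = Path(marker.read_text().strip())
    shutil.rmtree(containerd_dir, ignore_errors=True)
    marker.unlink()
    return True


# Usage against a throwaway directory, so nothing on the real system is touched.
with tempfile.TemporaryDirectory() as tmp:
    marker = Path(tmp) / "containerd-path"
    containerd_dir = Path(tmp) / "containerd"
    containerd_dir.mkdir()
    record_containerd_path(marker, containerd_dir)
    removed = cleanup_on_removal(marker)
```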
* reformatted annotations table

Due to formatting issues, the annotations table was quite unclear. Edited to make it more readable.

Co-authored-by: Nick Veitch <[email protected]>
Reviewed the tutorials and edited them to make them clearer and to fix md linter issues and formatting.
apiv1 is deprecated for annotations.
`apiv1_annotations` is the recommended package now.
We now use our own Rocks everywhere, so syncing
is not required anymore.
Currently, when removing the snap, the /var/run/containerd folder is not
properly removed, as it is a folder. This fixes this issue.

Additionally removes other containerd-related folders: /etc/containerd
and /var/lib/containerd.

We're also removing /opt/cni/bin on snap removal, which is created when
bootstrapping the node. As we're removing the k8s snap, we no longer
need this folder either.
@bschimke95 bschimke95 requested a review from a team as a code owner November 22, 2024 09:05
Member

@ktsakalozos ktsakalozos left a comment

LGTM +1

@bschimke95
Contributor Author

Errors are due to CI issues; the test passes locally. Merging...

@bschimke95 bschimke95 merged commit 2d7e190 into release-1.31 Nov 22, 2024
15 of 18 checks passed
@bschimke95 bschimke95 deleted the new-1.31-release branch November 22, 2024 14:01