Antrea Versioning

Versioning scheme

Antrea versions are expressed as x.y.z, where x is the major version, y is the minor version, and z is the patch version, following Semantic Versioning terminology.
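
As an illustration of the x.y.z scheme, the following minimal sketch parses a version string into its three components. The `Version` type and `parse_version` helper are hypothetical names introduced for this example, and accepting an optional leading `v` is an assumption.

```python
import re
from typing import NamedTuple


class Version(NamedTuple):
    major: int  # x
    minor: int  # y
    patch: int  # z


# Optional leading "v" is an assumption, to also accept tags such as "v1.3.0".
_VERSION_RE = re.compile(r"v?(\d+)\.(\d+)\.(\d+)")


def parse_version(text):
    """Parse an 'x.y.z' string into a Version tuple."""
    match = _VERSION_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not an x.y.z version: {text!r}")
    return Version(*(int(group) for group in match.groups()))
```

Because `Version` is a tuple, two parsed versions compare component by component, so `parse_version("0.10.0") > parse_version("0.9.5")` holds.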

Minor releases and patch releases

Unlike minor releases, patch releases should not contain miscellaneous feature additions or improvements. No incompatibilities should ever be introduced between patch versions of the same minor version. API groups / versions must not be introduced or removed as part of patch releases.

Patch releases are intended for important bug fixes to recent minor versions, such as addressing security vulnerabilities, fixes to problems preventing Antrea from being deployed & used successfully by a significant number of users, severe problems with no workaround, and blockers for products (including commercial products) which rely on Antrea.

When it comes to dependencies, the following rules are observed between patch versions of the same Antrea minor versions:

  • the same minor OVS version should be used
  • the same minor version should be used for all Go dependencies, unless updating to a new minor / major version is required for an important bug fix
  • for Antrea Docker images shipped as part of a patch release, the same version must be used for the base Operating System (Linux distribution / Windows server), unless an update is required to fix a critical bug. If important updates are available for a given Operating System version (e.g. which address security vulnerabilities), they should be included in Antrea patch releases.
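
The "same minor version" rules for dependencies can be checked mechanically. The sketch below assumes that dependency versions are dotted strings whose first two components are the major and minor versions; `same_minor` is a hypothetical helper, not part of Antrea.

```python
def same_minor(version_a, version_b):
    """Return True if both dotted versions share the same major.minor prefix."""
    # Assumes strings such as "2.14.0"; only the first two components are compared.
    return version_a.split(".")[:2] == version_b.split(".")[:2]
```

For example, `same_minor("2.14.0", "2.14.2")` is true, while `same_minor("2.14.0", "2.15.0")` is not.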

Feature stability

For every Antrea minor release, the stability level of supported features may be updated (from Alpha to Beta or from Beta to GA). Refer to the CHANGELOG for information about the feature stability level for each release. For features controlled by a feature gate, this information is also present in a more structured way in feature-gates.md.

Release cycle

New Antrea minor releases are currently shipped every 6 to 8 weeks. This fast release cadence enables us to ship new features quickly and frequently, although it may change in the future. Compared to deploying the top-of-tree of the Antrea main branch, using a released version should provide more stability guarantees:

  • despite our CI pipelines, some bugs can sneak into the branch and be fixed shortly after
  • merge conflicts can break the top-of-tree temporarily
  • some CI jobs are run periodically and not for every pull request before merge; as much as possible we run the entire test suite for each release candidate

Antrea maintains release branches for the two most recent minor releases (e.g. the release-0.10 and release-0.11 branches are maintained until Antrea 0.12 is released). As part of this maintenance process, patch versions are released as frequently as needed, following the guidelines for patch releases given above. With the current release cadence, this means that each minor release receives approximately 3 months of patch support. This may seem short, but it is intentional: it encourages users to upgrade Antrea often and avoid potential incompatibility issues. In the future, we may reduce our release cadence for minor releases and simultaneously increase the support window for each release.
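
The branch maintenance rule can be sketched as follows. The `maintained_branches` helper is hypothetical; it assumes that released minor versions are given as `(major, minor)` tuples and that branches are named `release-x.y`, as in the example above.

```python
def maintained_branches(released_minors):
    """Return the release branches for the two most recent minor releases.

    released_minors: iterable of (major, minor) tuples, in any order.
    """
    latest_two = sorted(released_minors)[-2:]
    return [f"release-{major}.{minor}" for major, minor in latest_two]
```

With 0.9, 0.10 and 0.11 released, this yields `release-0.10` and `release-0.11`; once 0.12 is added to the input, `release-0.10` drops out.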

Antrea upgrade and supported version skew

Our goal is to support "graceful" upgrades for Antrea. By "graceful", we notably mean that there should be no significant disruption to data-plane connectivity nor to policy enforcement, beyond the necessary disruption incurred by the restart of individual components:

  • during the Antrea Controller restart, new policies will not be processed. Because the Controller also runs the validation webhook for Antrea-native policies, an attempt to create an Antrea-native policy resource before the restart is complete may return an error.
  • during an Antrea Agent restart, the Node's data-plane will be impacted: new connections to & from the Node will not be possible, and existing connections may break.

In particular, it should be possible to upgrade Antrea without compromising enforcement of existing network policies for both new and existing Pods.

In order to achieve this, the different Antrea components need to support version skew.

  • Antrea Controller: must be upgraded first
  • Antrea Agent: must not be newer than the Antrea Controller, and may be up to 4 minor versions older
  • Antctl: must not be newer than the Antrea Controller, and may be up to 4 minor versions older

The supported version skew means that we only recommend Antrea upgrades to a new release up to 4 minor versions newer. For example, a cluster using 0.10 can be upgraded to one of 0.11, 0.12, 0.13 or 0.14, but we discourage direct upgrades to 0.15 and beyond. With the current release cadence, this provides a 6-month window of compatibility. If we reduce our release cadence in the future, we may revisit this policy as well.
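
The skew rules and the resulting upgrade window can be expressed as a small check. This is a minimal sketch, not Antrea code: the function names are hypothetical, versions are `(major, minor)` tuples, and skew across major versions is deliberately left out because the rules above do not describe it.

```python
MAX_MINOR_SKEW = 4  # "up to 4 minor versions older"


def skew_supported(controller, component):
    """Return True if a component (Agent or antctl) may run with this Controller.

    Both arguments are (major, minor) tuples of the same major version.
    """
    if controller[0] != component[0]:
        raise ValueError("cross-major skew is not modeled in this sketch")
    lag = controller[1] - component[1]
    # The component must not be newer, and may lag by at most MAX_MINOR_SKEW.
    return 0 <= lag <= MAX_MINOR_SKEW


def can_upgrade_directly(current, target):
    """Return True if a cluster on `current` may be upgraded straight to `target`."""
    # Once the Controller runs `target`, Agents still on `current` must stay within the skew.
    return skew_supported(controller=target, component=current)
```

With these definitions, `can_upgrade_directly((0, 10), (0, 14))` is true and `can_upgrade_directly((0, 10), (0, 15))` is false, matching the example above.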

When directly applying a newer Antrea YAML manifest, as provided for each release, there is no guarantee that the Antrea Controller will be upgraded first. In practice, the Controller would be upgraded simultaneously with the first Agent(s) to be upgraded by the rolling update of the Agent DaemonSet. This may create some transient issues and compromise the "graceful" upgrade. For upgrade scenarios, we therefore recommend that you "split-up" the manifest to ensure that the Controller is upgraded first.

Supported K8s versions

Each Antrea minor release should support the K8s releases that are maintained at the time of its release (the 3 most recent K8s minor releases up to K8s 1.19, the 4 most recent after that). For example, at the time that Antrea 0.10 was released, the latest K8s version was 1.19; as a result we guarantee that 0.10 supports at least 1.19, 1.18 and 1.17 (in practice it also supports K8s 1.16).
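
The guaranteed K8s versions for a given Antrea release can be derived from the latest K8s minor version at release time. The sketch below is one reading of the rule: it assumes a window of 3 versions when the latest K8s release is 1.19 or older and 4 versions otherwise, and `guaranteed_k8s_versions` is a hypothetical helper.

```python
def guaranteed_k8s_versions(latest_k8s_minor):
    """List the K8s 1.x versions guaranteed for an Antrea release.

    latest_k8s_minor: minor number of the latest K8s release when Antrea shipped.
    """
    # Assumed reading of "3 up to K8s 1.19, 4 after that".
    window = 3 if latest_k8s_minor <= 19 else 4
    newest_to_oldest = range(latest_k8s_minor, latest_k8s_minor - window, -1)
    return [f"1.{minor}" for minor in newest_to_oldest]
```

For Antrea 0.10, released when K8s 1.19 was the latest version, `guaranteed_k8s_versions(19)` returns `["1.19", "1.18", "1.17"]`.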

In addition, we strive to support the K8s versions used by default in cloud-managed K8s services (EKS, AKS and GKE regular channel).

Deprecation policies

Prometheus metrics deprecation policy

Antrea follows a policy similar to that of Kubernetes for metrics deprecation.

Alpha metrics have no stability guarantees; as such they can be modified or deleted at any time.

Stable metrics are guaranteed to not change; specifically, stability means:

  • the metric itself will not be renamed
  • the type of metric will not be modified

Eventually, even a stable metric can be deleted. In this case, the metric must be marked as deprecated first and the metric must stay deprecated for at least one minor release. The CHANGELOG must announce both metric deprecations and metric deletions.

Before deprecation:

# HELP some_counter this counts things
# TYPE some_counter counter
some_counter 0

After deprecation:

# HELP some_counter (Deprecated since 0.10.0) this counts things
# TYPE some_counter counter
some_counter 0

In the future, we may introduce the same concept of hidden metric as K8s, as an additional part of the metric lifecycle.
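
Deprecated metrics can be spotted in scraped output by looking for the marker in the HELP text. The sketch below assumes that the `(Deprecated since x.y.z)` prefix is used exactly as in the example above; `deprecated_metrics` is a hypothetical helper.

```python
import re

# Matches e.g. "# HELP some_counter (Deprecated since 0.10.0) this counts things"
_DEPRECATED_HELP = re.compile(r"^# HELP (\S+) \(Deprecated since ([0-9.]+)\) ")


def deprecated_metrics(exposition_text):
    """Map each deprecated metric name to the version that deprecated it."""
    found = {}
    for line in exposition_text.splitlines():
        match = _DEPRECATED_HELP.match(line)
        if match:
            found[match.group(1)] = match.group(2)
    return found
```

Applied to the "after deprecation" sample, this returns `{"some_counter": "0.10.0"}`; applied to the "before deprecation" sample, it returns an empty dictionary.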

APIs deprecation policy

The Antrea APIs are built using K8s (they are a combination of CustomResourceDefinitions and aggregation layer APIServices) and we follow the same versioning scheme as the K8s APIs and the same deprecation policy.

Other than the most recent API versions in each track, older API versions must be supported after their announced deprecation for a duration of no less than:

  • GA: 12 months
  • Beta: 9 months
  • Alpha: N/A (can be removed immediately)
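
These minimum support durations translate into an earliest possible removal date for a deprecated API version. The sketch below assumes that the durations are counted in calendar months from the deprecation announcement; the helper names are hypothetical.

```python
import calendar
from datetime import date

# Minimum support after the announced deprecation, in months.
MIN_SUPPORT_MONTHS = {"GA": 12, "Beta": 9, "Alpha": 0}


def add_months(day, months):
    """Shift a date by whole calendar months, clamping to the end of the month."""
    month_index = day.month - 1 + months
    year = day.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def earliest_removal(track, deprecated_on):
    """Return the first date on which a deprecated API version may be removed."""
    return add_months(deprecated_on, MIN_SUPPORT_MONTHS[track])
```

For example, a GA version whose deprecation was announced on 2021-01-15 could be removed no earlier than 2022-01-15.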

This also applies to the controlplane API. In particular, introduction and removal of new versions for this API must respect the "graceful" upgrade guarantee. The controlplane API (which is exposed using the aggregation layer) is often referred to as an "internal" API as it is used by the Antrea components to communicate with each other, and is usually not consumed by end users, e.g. cluster admins. However, this API may also be used for integration with other software, which is why we abide by the same deprecation policy as for other more "user-facing" APIs (e.g. Antrea-native policy CRDs).

K8s has a moratorium on the removal of API object versions that have been persisted to storage. At the moment, none of the Antrea APIServices (which use the aggregation layer) persist objects to storage. So the only objects we need to worry about are CustomResources, which are persisted by the K8s apiserver. For them, we adopt the following rules:

  • Alpha API versions may be removed at any time.
  • The deprecated field must be used for CRDs to indicate that a particular version of the resource has been deprecated.
  • Beta and GA API versions must be supported after deprecation for the respective durations stipulated above before they can be removed.
  • For deprecated Beta and GA API versions, a conversion webhook must be provided along with each Antrea release, until the API version is removed altogether.

Introducing new API resources

Introducing new CRDs

Starting with Antrea v1.0, all Custom Resource Definitions (CRDs) for Antrea are defined in the same API group, crd.antrea.io, and all CRDs in this group are versioned individually. For example, at the time of writing this (v1.3 release timeframe), the Antrea CRDs include:

  • ClusterGroup in crd.antrea.io/v1alpha2
  • ClusterGroup in crd.antrea.io/v1alpha3
  • Egress in crd.antrea.io/v1alpha2
  • etc.

Notice how 2 versions of ClusterGroup are supported: the one in crd.antrea.io/v1alpha2 was introduced in v1.0, and is being deprecated as it was replaced by the one in crd.antrea.io/v1alpha3, introduced in v1.1.

When introducing a new version of a CRD, the API deprecation policy should be followed.

When introducing a CRD, the following rule should be followed in order to avoid potential dependency cycles (and thus import cycles in Go): if the CRD depends on other object types spread across potentially different versions of crd.antrea.io, the CRD should be defined in a group version greater than or equal to all of these versions. For example, if we want to introduce a new CRD which depends on types v1alpha1.X and v1alpha2.Y, it needs to go into v1alpha2 or a more recent version of crd.antrea.io. As a rule it should probably go into v1alpha2 unless it is closely related to other CRDs in a later version, in which case it can be defined alongside these CRDs, in order to avoid user confusion.

If a new CRD does not have dependencies and is not closely related to an existing CRD, it will typically be defined in v1alpha1. In some rare cases, a CRD can be defined in v1beta1 directly if there is enough confidence in the stability of the API.
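
The placement rule for a new CRD can be sketched as picking the greatest group version among its dependencies. This is an illustration only: the helper names are hypothetical, and the ordering used here (alpha before beta before GA within a major version, then by number) is an assumption about what "greater than or equal to" means for group versions.

```python
import re

_STAGE_RANK = {"alpha": 0, "beta": 1, "": 2}  # "" stands for a GA version such as v1


def version_key(group_version):
    """Return a sortable key for names such as v1alpha2, v1beta1 or v1."""
    match = re.fullmatch(r"v(\d+)(?:(alpha|beta)(\d+))?", group_version)
    if match is None:
        raise ValueError(f"unrecognized group version: {group_version!r}")
    major, stage, number = match.groups()
    return (int(major), _STAGE_RANK[stage or ""], int(number or 0))


def minimum_group_version(dependency_versions):
    """Return the lowest group version in which the new CRD may be defined."""
    # Without dependencies, the CRD typically starts in v1alpha1.
    return max(dependency_versions, key=version_key, default="v1alpha1")
```

For a CRD that depends on `v1alpha1.X` and `v1alpha2.Y`, `minimum_group_version(["v1alpha1", "v1alpha2"])` returns `"v1alpha2"`.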