-
Notifications
You must be signed in to change notification settings - Fork 375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Automated cherry pick of #5739: Store NetworkPolicy in filesystem as fallback data source #5777: Enable Pod network after realizing initial NetworkPolicies #5795: Support Local ExternalTrafficPolicy for Services with #5798: Fix unit test TestReconcile #5833: Enable IPv4/IPv6 forwarding on demand automatically #5860
Automated cherry pick of #5739: Store NetworkPolicy in filesystem as fallback data source #5777: Enable Pod network after realizing initial NetworkPolicies #5795: Support Local ExternalTrafficPolicy for Services with #5798: Fix unit test TestReconcile #5833: Enable IPv4/IPv6 forwarding on demand automatically #5860
Conversation
In the previous implementation, traffic from/to a Pod may bypass NetworkPolicies applied to the Pod in a time window when the agent restarts because realizing NetworkPolicies and enabling forwarding are asynchronous. This patch stores NetworkPolicy data in files when they are received, and makes antre-agent fallback to use the files as data source if it can't connect to antrea-controller on startup. This prevents security regression: a NetworkPolicy that has been realized on a Node will continue to work even if antrea-controller is not available after antrea-agent restarts. The benchmark results of the storage's operations are as below: BenchmarkFileStoreAddNetworkPolicy-40 70383 16102 ns/op 520 B/op 9 allocs/op BenchmarkFileStoreAddAppliedToGroup-40 45382 25880 ns/op 3019 B/op 9 allocs/op BenchmarkFileStoreAddAddressGroup-40 7400 180000 ns/op 49538 B/op 9 allocs/op BenchmarkFileStoreReplaceAll-40 10 127088004 ns/op 17815943 B/op 33099 allocs/op The disk usage when storing 1k NetworkPolicies, AddressGroups, and AppliedToGroups created by BenchmarkFileStoreReplaceAll is as below: 16M /var/run/antrea-test/file-store/address-groups 4.0M /var/run/antrea-test/file-store/applied-to-groups 4.0M /var/run/antrea-test/file-store/network-policies Signed-off-by: Quan Tian <[email protected]>
Pod network should only be enabled after realizing initial NetworkPolicies, otherwise traffic from/to Pods could bypass NetworkPolicy when antrea-agent restarts. After commit f9fc979 ("Store NetworkPolicy in filesystem as fallback data source"), antrea-agent can realize either the latest NetworkPolicies got from antrea-controller or the ones got from filesystem as fallback. Therefore, waiting for NetworkPolicies to be realized should not add marked delay or make antrea-controller a failure point of Pod network. This commit adds an implementation of wait group capable of waiting with a timeout, and uses it to wait for common initialization and NetworkPolicy realization before installing any flows for Pods. More preconditions can be added via the wait group if needed in the future. Signed-off-by: Quan Tian <[email protected]>
Since K8s 1.29, setting Local ExternalTrafficPolicy for ClusterIP Services with ExternalIPs is supported. Signed-off-by: Quan Tian <[email protected]>
cniServer.reconcile() now installs flows asynchorously. Signed-off-by: Quan Tian <[email protected]>
Although it has been documented as a prerequisite in [1], there are some platforms not enabling ip forwarding by default. kube-proxy ipvs mode and some CNIs enable it by themselves to ensure Pod networking work properly. As Antrea needs IP forwarding to be enabled, there seems no reason to not do it by itself, rather than expecting users or other components to do it. [1] https://kubernetes.io/docs/setup/production-environment/container-runtimes/#install-and-configure-prerequisites Signed-off-by: Quan Tian <[email protected]>
42edc54
to
f57e6f7
Compare
/test-all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I feel like it may be better in the future to have separate cherry-picks for unrelated PRs (such as #5833)
sure, will do. |
BTW, I used "rebase and merge" to not mix them in a commit, if it's what you were concerned. |
I was also wondering about release note generation, but I think our prepare-changelog.sh script already handles this case and will extract all individual PRs? |
Yes, it has been taken into consideration. |
Cherry pick of #5739 #5777 #5795 #5798 #5833 on release-1.13.
#5739: Store NetworkPolicy in filesystem as fallback data source
#5777: Enable Pod network after realizing initial NetworkPolicies
#5795: Support Local ExternalTrafficPolicy for Services with
#5798: Fix unit test TestReconcile
#5833: Enable IPv4/IPv6 forwarding on demand automatically
For details on the cherry pick process, see the cherry pick requests page.