
Trigger reconciling NetworkPolicy rules upon CNI Add #222

Merged
merged 1 commit into antrea-io:master from egress-enhance on Dec 18, 2019

Conversation

tnqn
Member

@tnqn tnqn commented Dec 13, 2019

Currently, only Pods that have IPs are included in AppliedToGroups. As a
result, a Pod's NetworkPolicies are not enforced until its IP is assigned
in kube-apiserver, which always leaves a window during which the Pod is
not isolated by NetworkPolicies created before it.

To enforce NetworkPolicies on Pods, antrea-agent does not need the PodIPs
from kube-apiserver; the OFPort and IP from the internal InterfaceCache
are sufficient. This patch uses a channel to receive Pod update events
from CNIServer and notifies NetworkPolicyController to reconcile the
rules related to the updated Pods.

Fixes #197

@antrea-bot
Collaborator

Thanks for your PR.
Unit tests and code linters are run automatically every time the PR is updated.
E2e tests can only be triggered by a member of the vmware-tanzu organization. Regular contributors to the project should join the org.

The following commands are available:

  • /test-e2e: to trigger e2e tests. This command can only be run by members of the vmware-tanzu organization
  • /skip-e2e: to skip e2e tests. This command can only be run by members of the vmware-tanzu organization

@tnqn
Member Author

tnqn commented Dec 13, 2019

/test-e2e

@tnqn tnqn force-pushed the egress-enhance branch 2 times, most recently from 24f9f44 to f2a6c5e on December 13, 2019 07:32
@tnqn
Member Author

tnqn commented Dec 13, 2019

/test-e2e

Contributor

@antoninbas antoninbas left a comment


I added a couple of questions. I think it would be good to add to the PR description which assumptions (if any) we make regarding the ordering of events. We can move this information to developer documentation later.

for group, podSet := range c.podSetByGroup {
	_, exists := podSet[pod]
	if exists {
		c.onAppliedToGroupUpdate(group)
Contributor

Maybe a stupid question. Are we assuming that by the time the CNI server notifies us on the channel and we execute this code, we have already received the relevant created/updated AppliedToGroups from the Antrea NP controller? If yes, it may be worth documenting this in the comment for this function (and explaining why that is the case). If no, then I think I would benefit from an explanation, because in my mind the sequence of events is as follows:

  1. Antrea NP controller is notified by apiserver that a new Pod is being created. The IP address is not known yet, but we update / create AppliedToGroups as needed based on existing Network Policies.
  2. Agent NP controller is notified by Antrea NP controller, flows cannot be installed yet because IP address is not known.
  3. CNIAdd is invoked, Pod networking is configured, IP address is determined. CNI server can notify Agent NP controller using the podUpdates channel.
  4. processPodUpdates processes the notification from the CNI server. The IP address is now retrievable from the interface store, and we can install NP flows.

So I guess my question is: is this the correct sequence of events, and are we assuming that 2) happens before 3) (and if yes, why can we make this assumption)?

Member Author

The NetworkPolicy implementation is fully asynchronous from Antrea Controller to Antrea Agent, on a best-effort basis. Your understanding of the sequence of events is correct, but we don't assume it must happen that way; it is just very likely to.
This is because of the way Kubernetes works: kube-scheduler schedules a Pod on a Node by setting its NodeName field via the API, then kubelet and antrea-controller receive the Pod update event at almost the same time; kubelet starts creating containers for the Pod while antrea-controller starts updating the Group info and notifying antrea-agent. It has been observed that the former is slower than the latter, so the PR works in that case. But it's hard to say this is consistent, given varying network conditions, antrea-controller load, kubelet load, and cluster scale.
This PR enhances the best-effort behavior by triggering rule realization as early as possible. Even if a NetworkPolicy is created before a Pod, we can't say antrea-controller must receive the Policy from the APIServer first, as they are two separate API connections. But this slight difference should be invisible to users, as eventual consistency is reached in a very short time.
I will add comments to explain which situations this processing helps with.

@@ -403,6 +406,10 @@ func (s *CNIServer) CmdAdd(ctx context.Context, request *cnipb.CniCmdRequest) (
	klog.Errorf("Failed to configure container %s interface: %v", cniConfig.ContainerId, err)
	return s.configInterfaceFailureResponse(err), nil
}

// Notify the Pod update event to required components.
s.podUpdates <- v1beta1.PodReference{Name: podName, Namespace: podNamespace}
Contributor

So as a follow-up to your comment in #197 (comment):

we are assuming that it takes less time to 1) notify on the channel, reconcile rules, and install flows than it takes for 2) kubelet to receive the response from the CNI and create the workload container.

If 1) is very fast, wouldn't it be acceptable to just block until we know it has completed before returning from CNI Add? That way we would have a guarantee that known network policies are enforced before the workload container is created. I know that we don't have a notification mechanism to determine when flows have been installed, but that's something we may have in the future. The only issue I see is that we need to know the OF port for the NP flows, which means we cannot start installing flows until we have added the new port to the vSwitch.

Contributor

Even if we block there, there is no guarantee that the Controller has already computed and pushed all the policies applied to the Pod. As most event processing is already asynchronous, I prefer to keep the behavior asynchronous here too.

Member Author

Agree @jianjuns. Also, as I explained in #222 (comment), it's not even guaranteed that antrea-controller receives the NetworkPolicy and handles it first, unless we sort the events of different resources, which adds complexity. I think the most important question is whether it's worth doing if it makes no difference to users.


@@ -1080,8 +1080,8 @@ func (n *NetworkPolicyController) syncAppliedToGroup(key string) error {
// Retrieve all Pods matching the podSelector.
pods, err = n.podLister.Pods(appliedToGroup.Selector.Namespace).List(selector)
for _, pod := range pods {
if pod.Status.PodIP == "" {
// No need to process Pod when IPAddress is unset.
if pod.Spec.NodeName == "" {
Contributor

If the Controller typically receives the first event for a Pod without its NodeName (and IP) set, maybe we should optimize updatePod() to ignore such events?

I do not mean we need to make the change in this PR.

Member Author

Makes sense. I will create an issue to track the optimization, as there could be more cases: if NodeName is not set, there is no need to trigger an AppliedToGroup update; if PodIP is not set, there is no need to trigger an AddressGroup update.

@tnqn
Member Author

tnqn commented Dec 16, 2019

/test-e2e

Contributor

@jianjuns jianjuns left a comment


LGTM. Added one minor comment and one question.

@tnqn
Member Author

tnqn commented Dec 17, 2019

/test-e2e

jianjuns previously approved these changes Dec 17, 2019
@tnqn
Member Author

tnqn commented Dec 18, 2019

/test-e2e

@tnqn tnqn merged commit 791e2ba into antrea-io:master Dec 18, 2019
@tnqn tnqn deleted the egress-enhance branch December 18, 2019 05:36
Successfully merging this pull request may close these issues.

Egress NetworkPolicy are not enforced fast enough