Add --random-fully to SNAT IP Table Rules #6544
Comments
I personally support adding this as a configuration parameter for the antrea-agent, as I don't think there is any drawback (we would keep the current behavior as the default) and it could potentially help in some situations. @Scoobed for your specific issue, there may be better ways to address it on the client side (we discussed this on Slack). By the way, this may be a "good first issue" if you (or someone else) are interested in taking a stab at it. The only thing to look out for AFAIK is that some older kernel versions may not support |
While this is not mentioned explicitly in the original post, I believe that @Scoobed experienced the issue in the context of Egress SNAT, given that the Slack post mentioned "SNAT IP per namespace". However, the ability for users to configure Node SNAT (masquerade): antrea/pkg/agent/route/route_linux.go Lines 979 to 987 in e921a6e
Egress SNAT: antrea/pkg/agent/route/route_linux.go Lines 969 to 978 in e921a6e
It is not completely clear to me whether we would want 2 separate configuration parameters or a unified one. I would personally prefer the latter.
I would prefer two settings if possible, as we might want one but not the other. If I were to do the pull request, are there directions on how this is tested and what the requirements are?
Do you mean that you would want source port selection to be random for Egress SNAT but not for Node SNAT (or the reverse)? Could you provide more information as to why?
I mean I would just like to control them independently of each other, for example enabling it only for Egress and not for the Node. Sorry if that came across as confusing; I don't have a good reason other than flexibility.
We expose 2 new configuration parameters for the Agent: snatFullyRandomPorts and egress.snatFullyRandomPorts. When the first one is set to true, the MASQUERADE iptables rules used to implement "local" Node SNAT will use randomized source port mappings. When the second one is set to true, the SNAT iptables rules used to implement Egress SNAT will use randomized source port mappings. This is achieved with the `--random-fully` iptables flag. These new configuration parameters are only supported on Linux Nodes. They require iptables >= 1.6.2, which is not a problem with the standard antrea-agent container images. They also require Linux kernel >= 3.13 and >= 3.14 respectively, but these are very old releases so we can ignore this requirement, which is consistent with the K8s implementation. The 2 parameters are set to false by default, but we may change the default value to true in the future. Fixes antrea-io#6544 Signed-off-by: Antonin Bas <[email protected]>
We expose 2 new configuration parameters for the Agent: snatFullyRandomPorts and egress.snatFullyRandomPorts. When the first one is set to true, the MASQUERADE iptables rules used to implement "local" Node SNAT will use randomized source port mappings. When the second one is set to true, the SNAT iptables rules used to implement Egress SNAT will use randomized source port mappings. This is achieved with the `--random-fully` iptables flag. These new configuration parameters are only supported on Linux Nodes. They require iptables >= 1.6.2, which is not a problem with the standard antrea-agent container images. They also require Linux kernel >= 3.13 and >= 3.14 respectively, but these are very old releases so we can ignore this requirement, which is consistent with the K8s implementation. snatFullyRandomPorts is set to false by default (this may change in the future). egress.snatFullyRandomPorts is set to null (empty) by default, which means that unless the parameter is explicitly set, we will use the top-level snatFullyRandomPorts value. Fixes antrea-io#6544 Signed-off-by: Antonin Bas <[email protected]>
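The fallback described in the last commit message (an unset egress.snatFullyRandomPorts inherits the top-level snatFullyRandomPorts value) could be resolved roughly as in the Go sketch below. This is only an illustration: the helper name `effectiveEgressRandomPorts` and the use of a `*bool` to model the null default are assumptions, not the actual Antrea code.

```go
package main

import "fmt"

// effectiveEgressRandomPorts is a hypothetical helper (not Antrea's API).
// nodeSetting models the top-level snatFullyRandomPorts value.
// egressSetting models egress.snatFullyRandomPorts, where nil stands for
// "null (empty)", meaning the parameter was not explicitly set.
func effectiveEgressRandomPorts(nodeSetting bool, egressSetting *bool) bool {
	if egressSetting == nil {
		// Not set explicitly: fall back to the top-level value.
		return nodeSetting
	}
	return *egressSetting
}

func main() {
	off := false
	// Egress value unset: inherits the top-level value.
	fmt.Println(effectiveEgressRandomPorts(true, nil))
	// Egress value set explicitly: overrides the top-level value.
	fmt.Println(effectiveEgressRandomPorts(true, &off))
}
```

Using a nullable value is what makes it possible to tell "explicitly false" apart from "not configured", which is the independent control asked for in the comments above.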
Describe the problem/challenge you have
Default SNAT behavior is to preserve the source port whenever possible. We are seeing a percentage of SNATed traffic fail because TCP port numbers are being reused too quickly. As described in a-reason-for-unexplained-connection-timeouts-on-kubernetes-docker (referenced below), there is latency in the insert into the conntrack table, which can cause the same source port to be given to multiple Pods.
As a result, the tcpdump output shows multiple [TCP Port Numbers reused] entries on the Pod.
Describe the solution you'd like
Expose an option to randomize the source ports of Pod traffic by using --random / --random-fully in iptables.
Anything else you would like to add?
See the Netfilter NAT & Conntrack kernel modules section on the following link.
Note that the following link is about Kubernetes / Docker, but it looks like it would also hold true for Kubernetes / containerd.
a-reason-for-unexplained-connection-timeouts-on-kubernetes-docker
time-wait article
Cilium SNAT issue