Add Cilium Network Policies for aws-loadbalancer-controller-app and aws-pod-identity-webhook #3804

Closed
2 tasks
T-Kukawka opened this issue Dec 17, 2024 · 2 comments
Labels
team/phoenix Team Phoenix

T-Kukawka commented Dec 17, 2024

For customers who want to improve cluster security, a CiliumClusterwideNetworkPolicy that blocks access to IMDSv2 can be added.

The policy itself can look like:

apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: deny-egress-to-imds
spec:
  endpointSelector: {}
  egressDeny:
    - toCIDR:
        - "169.254.0.0/16"

However, two types of workloads on the GS side still require adjustments before the cluster-wide policy can be applied without exceptions:

  1. aws-pod-identity-webhook-app:
W1205 12:39:01.575294       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
F1205 12:39:07.827554       1 main.go:130] Error getting instance identity document: EC2MetadataRequestError: failed to get EC2 instance identity document
caused by: RequestError: send request failed caused by: Get "http://169.254.169.254/latest/dynamic/instance-identity/document": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  2. aws-loadbalancer-controller-app:
{"level":"info","ts":"2024-12-05T14:12:44Z","msg":"version","GitVersion":"v2.8.3","GitCommit":"a1418f94a060043cacb43cc8f0aeb7f4ac1eb94d","BuildDate":"2024-09-17T05:44:37+0000"}
{"level":"error","ts":"2024-12-05T14:12:57Z","logger":"setup","msg":"unable to initialize AWS cloud","error":"failed to introspect vpcID from EC2Metadata or Node name, specify --aws-vpc-id instead if EC2Metadata is unavailable: failed to fetch VPC ID from instance metadata: RequestError: send request failed\ncaused by: Get \"http://169.254.169.254/latest/meta-data/mac\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
Stream closed EOF for wario130-aws-load-balancer-controller/aws-load-balancer-controller-5c5d667c49-7n2ld (aws-load-balancer-controller)

We should explicitly add Cilium Network Policies so that the cluster-wide policy does not affect the workloads mentioned above.

Acceptance criteria:

  • add Cilium Network Policy explicitly allowing access to EC2Metadata endpoint for aws-pod-identity-webhook-app
  • add Cilium Network Policy explicitly allowing access to EC2Metadata endpoint for aws-loadbalancer-controller-app
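A minimal sketch of what such an allow policy might look like for one of the two apps. The policy name, namespace, and pod label are assumptions for illustration and would need to match the actual chart; only the metadata address 169.254.169.254 and port 80 are taken from the logs above.

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-egress-to-imds            # hypothetical name
  namespace: kube-system                # assumed namespace
spec:
  endpointSelector:
    matchLabels:
      # assumed label; check the labels the pods actually carry
      app.kubernetes.io/name: aws-pod-identity-webhook
  egress:
    - toCIDR:
        - "169.254.169.254/32"          # EC2 instance metadata endpoint
      toPorts:
        - ports:
            - port: "80"
              protocol: TCP
```

Two caveats apply. An egress allow rule puts the selected pods into default deny for egress, so the rest of their egress traffic has to be allowed as well. In Cilium, a deny rule also takes precedence over any allow rule that matches the same traffic.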

Related issues:

@paurosello

Update on the current situation:

Both apps already have a network policy deployed in the clusters that allows connectivity to the metadata endpoint.

The problem is that an EgressDenyRule in a CiliumClusterwideNetworkPolicy denies the matching traffic even if another network policy allows it, as per the documentation:

EgressDeny is a list of EgressDenyRule which are enforced at egress. Any rule inserted here will be denied regardless of the allowed egress rules in the 'egress' field. If omitted or empty, this rule does not apply at egress.
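Given that precedence, one possible workaround is to scope the deny policy itself so that it never selects the two workloads. This is an assumption and has not been tested here. The label key and values below are hypothetical and would have to match the labels the pods actually carry:

```yaml
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: deny-egress-to-imds
spec:
  endpointSelector:
    matchExpressions:
      # hypothetical label key and values; pods matching them are
      # not selected by this policy and keep their metadata access
      - key: app.kubernetes.io/name
        operator: NotIn
        values:
          - aws-load-balancer-controller
          - aws-pod-identity-webhook
  egressDeny:
    - toCIDR:
        - "169.254.0.0/16"
```

NotIn also matches pods that do not carry the label at all, so every other pod stays covered by the deny rule.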

Cilium 1.16 has a feature that might help us, EnableDefaultDeny, but I have not been able to make it work:

EnableDefaultDeny determines whether this policy configures the subject endpoint(s) to have a default deny mode. If enabled, this causes all traffic not explicitly allowed by a network policy to be dropped. If not specified, the default is true for each traffic direction that has rules, and false otherwise. For example, if a policy only has Ingress or IngressDeny rules, then the default for ingress is true and egress is false. If multiple policies apply to an endpoint, that endpoint's default deny will be enabled if any policy requests it. This is useful for creating broad-based network policies that will not cause endpoints to enter default-deny mode.
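For reference, a sketch of how the field is set on a policy, assuming Cilium 1.16 or later. The policy name and the rule are illustrative only:

```yaml
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: allow-imds-without-default-deny   # hypothetical name
spec:
  endpointSelector: {}
  # keep the selected endpoints out of default-deny mode
  enableDefaultDeny:
    egress: false
    ingress: false
  egress:
    - toCIDR:
        - "169.254.169.254/32"
```

The field only controls whether the selected endpoints enter default-deny mode. It should not be expected to lift an egressDeny rule defined in another policy.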

Meanwhile, I thought about simply disabling the credentials endpoint with something like the following block:

  - toEndpoints:
    - matchLabels:
        env: prod
    toPorts:
    - ports:
      - port: "80"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/public"

This should reduce the attack surface: a pod could still discover information about the node itself, but any kind of privilege escalation would be denied.

@github-project-automation github-project-automation bot moved this from Inbox 📥 to Validation ☑️ in Roadmap Jan 14, 2025
@paurosello

The previous example does not work: egressDeny has no L7 filtering capability, only the "egress" rule does :(
