Workloads that rely on workload identity don’t function on nodes created by Karpenter #563

Open
nfsouzaj opened this issue Nov 4, 2024 · 1 comment

nfsouzaj commented Nov 4, 2024

Version

Karpenter Version: v0.0.0

Kubernetes Version: v1.29.7

Hi, the command suggested for retrieving the app_version (helm ls -A --all -o json | jq '.[] | select(.name=="karpenter") | .app_version' -r) returns nothing, because the field is empty. These are the two Karpenter occurrences in the helm output:
{
  "name": "aks-managed-karpenter-overlay",
  "namespace": "kube-system",
  "revision": "1950",
  "updated": "2024-11-04 17:26:31.707192908 +0000 UTC",
  "status": "deployed",
  "chart": "karpenter-overlay-addon-0.1.0-f334e0b6b9b9c329f88ac4f4578acedf1d519021",
  "app_version": ""
},
{
  "name": "aks-managed-karpenter-overlay-base",
  "namespace": "kube-system",
  "revision": "1951",
  "updated": "2024-11-04 17:25:58.10704853 +0000 UTC",
  "status": "deployed",
  "chart": "karpenter-overlay-base-addon-0.1.0-53202eafdc89edaceb7b54487c6b40a51d91e65e",
  "app_version": ""
},
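
Note that the jq filter above selects releases whose name is exactly "karpenter", so it would not match either of these release names even if app_version were populated. A minimal sketch of a looser lookup, assuming the JSON produced by helm ls -A --all -o json is passed in as a string (the function name karpenter_releases and the "(empty)" placeholder are made up for illustration):

```python
import json


def karpenter_releases(helm_ls_json: str) -> list[dict]:
    """Return name, chart and app_version for releases whose name mentions karpenter."""
    releases = json.loads(helm_ls_json)
    return [
        {
            "name": r["name"],
            # The chart string still carries a version/commit suffix,
            # which helps when app_version is empty.
            "chart": r.get("chart", ""),
            # Make an empty app_version visible instead of printing nothing.
            "app_version": r.get("app_version") or "(empty)",
        }
        for r in releases
        # Substring match, unlike the exact match in the jq filter.
        if "karpenter" in r.get("name", "")
    ]
```

For the two releases listed above this would report both names with their chart strings, and "(empty)" for app_version.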

Expected Behavior

Pods that rely on workload identity to communicate with Azure PaaS services such as AKV and DNS Zones should work properly on nodes created by Karpenter.

Actual Behavior

We use workload identity to enable External Secrets to communicate with Azure Key Vault and External DNS to connect with DNS Zones. After migrating one of my environments to use Karpenter-provisioned nodes, I encountered issues: External Secrets could no longer connect to AKV, and External DNS couldn’t reach the DNS zone.

After hours of troubleshooting, I suspected that Karpenter might be related to the issue. I switched back to regular nodes, and everything immediately started working again. Same user identity, same cluster, just a regular node provisioned by the cloud provider.

I discovered that nodes created without Karpenter have a specific label injected: kubernetes.azure.com/kubelet-identity-client-id, while Karpenter nodes lack this label.
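
For anyone who wants to check the same thing on their cluster, here is a rough sketch that lists the nodes lacking that label. It assumes the output of kubectl get nodes -o json is passed in as a string; the function name nodes_missing_label is hypothetical. It only reports which nodes lack the label and does not show that the missing label is the cause.

```python
import json

# Label observed on non-Karpenter nodes in this report.
LABEL = "kubernetes.azure.com/kubelet-identity-client-id"


def nodes_missing_label(nodes_json: str, label: str = LABEL) -> list[str]:
    """Return the names of nodes that do not carry the given label."""
    items = json.loads(nodes_json).get("items", [])
    return [
        node["metadata"]["name"]
        for node in items
        # A node may have no labels map at all, so default to an empty dict.
        if label not in (node["metadata"].get("labels") or {})
    ]
```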

Steps to Reproduce the Problem

  • Enable Karpenter.
  • Deploy a workload that triggers the creation of a node.
  • Deploy pods that rely on workload identity on the new node.

Resource Specs and Logs

I am getting the following error:
time="2024-11-04T17:08:12Z" level=fatal msg="Failed to do run once: WorkloadIdentityCredential: unable to resolve an endpoint: server response error:\n context deadline exceeded"

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@logicfox

@nfsouzaj The error message you posted looks similar to issues I'm having with Karpenter on clusters with a non-default cluster DNS IP. Are you using BYO VNets, or changing the cluster DNS IP to anything other than 10.0.0.10?

The AKS node bootstrapper is hardcoded to use that IP; see open issues #335 and #561. The symptom is that containers running on these nodes have no DNS resolution, because the cluster DNS is configured incorrectly.
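
One way to check for this is to look at /etc/resolv.conf inside a pod on a Karpenter node and compare its nameserver with the cluster's actual DNS service IP. A small sketch, assuming you paste in the file contents and supply the expected IP yourself (both function names are hypothetical, and this only applies to pods using the default cluster DNS policy):

```python
def parse_nameservers(resolv_conf: str) -> list[str]:
    """Extract the nameserver addresses from resolv.conf text."""
    servers = []
    for line in resolv_conf.splitlines():
        parts = line.split()
        # Only lines of the form "nameserver <ip>" count; comments and
        # "search"/"options" lines are skipped.
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def uses_expected_dns(resolv_conf: str, expected_ip: str) -> bool:
    """True if the expected cluster DNS IP is among the configured nameservers."""
    return expected_ip in parse_nameservers(resolv_conf)
```

If the pod shows 10.0.0.10 while the cluster DNS service lives at a different address, that would match the symptom described above.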
