PVC with storageClass: default renders karpenter unable to schedule pods #497

Open
tpaul1611 opened this issue Sep 24, 2024 · 1 comment
Labels
  • area/e2e-testing: Issues or PRs related to e2e testing
  • area/storage: Issues or PRs related to storage
  • kind/bug: Categorizes issue or PR as related to a bug.
  • triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Comments

@tpaul1611

Version

Karpenter Version: v0.5.4

Kubernetes Version: v1.30.3

Expected Behavior

A PersistentVolumeClaim with storageClass: default should work out of the box.

Actual Behavior

As the logs below show, the storageClass default has no availability zone restrictions, so karpenter adds a zone requirement with an empty list of values to the pod (topology.disk.csi.azure.com/zone In []). No zone can satisfy an empty In list, which effectively makes all existing SKUs incompatible.

If the storageClass does not have any zone restrictions, karpenter should not add the topology.disk.csi.azure.com/zone requirement at all.
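
To illustrate the behavior I would expect, here is a minimal sketch in Python (not Karpenter's actual Go code). The names `matches_in`, `volume_zone_requirement`, and the zone values are hypothetical and exist only for this illustration; the only thing taken from the logs is the label key.

```python
from typing import Dict, List, Optional

# Hypothetical stand-in for a scheduling requirement: label key -> allowed values.
Requirement = Dict[str, List[str]]

# Label key as it appears in the provisioner log.
ZONE_KEY = "topology.disk.csi.azure.com/zone"


def matches_in(allowed: List[str], offered: str) -> bool:
    """An `In` requirement is satisfied only if the offered value is listed."""
    return offered in allowed


def volume_zone_requirement(storage_class_zones: List[str]) -> Optional[Requirement]:
    """Return a zone requirement only when the storage class restricts zones."""
    if not storage_class_zones:
        # No zone restriction: add nothing instead of an empty `In []`.
        return None
    return {ZONE_KEY: list(storage_class_zones)}


# Hypothetical zone names, for illustration only.
offered_zones = ["zone-1", "zone-2", "zone-3"]

# An empty `In []` rejects every offered zone.
print(any(matches_in([], zone) for zone in offered_zones))  # False

# With no restriction on the storage class, no requirement is produced.
print(volume_zone_requirement([]))  # None
```

The point of the sketch is only the early return: an unrestricted storage class should contribute no requirement, rather than one that nothing can satisfy.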

Steps to Reproduce the Problem

With self-hosted NAP, try scheduling any pod with a PVC that uses storageClass: default (probably any of the AKS preconfigured storage classes will do).

Resource Specs and Logs

{"level":"DEBUG","time":"2024-09-24T07:07:33.059Z","logger":"controller.provisioner","message":"adding requirements derived from pod volumes, [{topology.disk.csi.azure.com/zone In []}]","commit":"846ef96","pod":"dependency-track/dependency-track-apiserver-6689598c68-5258j"}
{"level":"DEBUG","time":"2024-09-24T07:07:33.061Z","logger":"controller.provisioner","message":"27 out of 503 instance types were excluded because they would breach limits","commit":"846ef96","nodepool":"weeu-shared-services-spot"}
{"level":"INFO","time":"2024-09-24T07:07:33.063Z","logger":"controller.provisioner","message":"found provisionable pod(s)","commit":"846ef96","pods":"dependency-track/dependency-track-apiserver-6689598c68-5258j","duration":"10.69571ms"}
{"level":"ERROR","time":"2024-09-24T07:07:33.063Z","logger":"controller.provisioner","message":"Could not schedule pod, incompatible with nodepool \"weeu-shared-services-spot\", daemonset overhead={\"cpu\":\"250m\",\"memory\":\"470Mi\",\"pods\":\"7\"}, no instance type satisfied resources {\"cpu\":\"2250m\",\"memory\":\"8662Mi\",\"pods\":\"8\"} and requirements karpenter.azure.com/sku-family In [D F], karpenter.sh/capacity-type In [spot], karpenter.sh/nodepool In [weeu-shared-services-spot], kubernetes.io/arch In [amd64], kubernetes.io/os In [linux], topology.kubernetes.io/zone In [] (no instance type met all requirements)","commit":"846ef96","pod":"dependency-track/dependency-track-apiserver-6689598c68-5258j"}
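
For anyone who wants to check their own controller logs for the same symptom, a small Python sketch like the following could flag requirements with an empty `In []` list. It assumes the JSON log format shown above with a `message` field; the helper name `empty_requirements` and the shortened sample line are made up for this example.

```python
import json
import re
from typing import List

# Matches "<label-key> In []", i.e. an In requirement with no values.
EMPTY_IN = re.compile(r"([\w./-]+) In \[\]")


def empty_requirements(log_line: str) -> List[str]:
    """Return the label keys that appear with an empty `In []` in one log line."""
    message = json.loads(log_line).get("message", "")
    return EMPTY_IN.findall(message)


# Shortened version of the first DEBUG line above (other fields omitted).
line = (
    '{"level":"DEBUG","message":"adding requirements derived from pod volumes, '
    '[{topology.disk.csi.azure.com/zone In []}]"}'
)
print(empty_requirements(line))  # ['topology.disk.csi.azure.com/zone']
```

Requirements that list at least one value, such as `kubernetes.io/arch In [amd64]`, are not reported.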

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@tallaxes tallaxes added kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on. area/storage Issues or PRs related to storage area/e2e-testing Issues or PRs related to e2e testing labels Sep 27, 2024
@tallaxes tallaxes self-assigned this Sep 27, 2024
@tallaxes
Collaborator

Reproduced. Likely has to do with the way we represent zones in offerings. We also need to add an E2E test suite for storage.


2 participants