[Bug] No api service on 9001 - crashloop #3727

Open
2 tasks done
matt-tvg opened this issue Oct 31, 2024 · 6 comments
Labels
bug Something isn't working needs-triage

Comments

matt-tvg commented Oct 31, 2024

Kubecost Helm Chart Version

2.4.1

Kubernetes Version

v1.31.1-eks-ce1d5eb

Kubernetes Platform

EKS

Description

On fresh deployments, cost-analyzer-frontend crash-loops with the error:

nginx: [emerg] host not found in upstream "cost-analyzer.build-ci-kubecost:9003" in /etc/nginx/conf.d/default.conf:42  

The ConfigMap for the nginx config shows the upstream configured as I'd expect:

upstream api {
    server cost-analyzer.build-ci-kubecost:9001;
}

However, there doesn't seem to be a service listening on 9001:

 kubectl get services -n build-ci-kubecost --sort-by=.metadata.name
NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
cost-analyzer               ClusterIP   172.20.104.180   <none>        9003/TCP,9090/TCP   24m
cost-analyzer-aggregator    ClusterIP   172.20.120.179   <none>        9004/TCP            24m
cost-analyzer-cloud-cost    ClusterIP   172.20.193.29    <none>        9005/TCP            24m
cost-analyzer-forecasting   ClusterIP   172.20.152.48    <none>        5000/TCP            24m
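
To double-check the listing above without reading it by eye, here is a minimal Python sketch that parses the pasted `kubectl get services` table and reports which ports each service exposes. The `service_ports` helper is hypothetical (it is not part of Kubecost or kubectl), and it assumes the default column layout shown above, with `PORT(S)` as the fifth column.

```python
from typing import Dict, Set

# Output pasted from `kubectl get services` in this issue.
SERVICES = """\
NAME                        TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
cost-analyzer               ClusterIP   172.20.104.180   <none>        9003/TCP,9090/TCP   24m
cost-analyzer-aggregator    ClusterIP   172.20.120.179   <none>        9004/TCP            24m
cost-analyzer-cloud-cost    ClusterIP   172.20.193.29    <none>        9005/TCP            24m
cost-analyzer-forecasting   ClusterIP   172.20.152.48    <none>        5000/TCP            24m
"""


def service_ports(table: str) -> Dict[str, Set[int]]:
    """Map each service name to the set of service ports it exposes.

    Hypothetical helper: assumes the default table layout, where the
    fifth whitespace-separated column is PORT(S).
    """
    ports: Dict[str, Set[int]] = {}
    for line in table.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        name, port_spec = fields[0], fields[4]
        # Entries look like "9003/TCP" (or "80:30080/TCP" for NodePort);
        # keep only the service port in front of "/" and ":".
        ports[name] = {
            int(entry.split("/")[0].split(":")[0])
            for entry in port_spec.split(",")
        }
    return ports


if __name__ == "__main__":
    for name, exposed in service_ports(SERVICES).items():
        print(name, sorted(exposed))
```

With the table above, no service in the namespace exposes 9001, and `cost-analyzer` exposes only 9003 and 9090.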

Out of curiosity, I tried setting it to 9090 via the values below:

  useDefaultFqdn: false
  api:
    fqdn: cost-analyzer.build-ci-kubecost:9090

However that gave a similar error:

nginx: [emerg] host not found in upstream "cost-analyzer.build-ci-kubecost:9090" in /etc/nginx/conf.d/default.conf:38
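
Since the message is the same for 9003 and 9090, it may be worth noting that nginx reports `host not found in upstream` when it cannot resolve the hostname at startup, so the port part is probably not what it is complaining about. A minimal Python sketch for checking resolution is below; `split_upstream` and `resolves` are hypothetical helper names, and the check is only meaningful when run from a pod in the same cluster (ideally the same namespace) as the frontend.

```python
import socket


def split_upstream(upstream):
    """Split an nginx-style "host:port" upstream into (host, port)."""
    host, _, port = upstream.rpartition(":")
    return host, int(port)


def resolves(host, resolver=socket.getaddrinfo):
    """Return True if `host` resolves to at least one address.

    `resolver` is injectable so the logic can be exercised without
    real DNS; by default it uses the system resolver.
    """
    try:
        resolver(host, None)
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # Upstream string taken from the error message in this issue.
    host, port = split_upstream("cost-analyzer.build-ci-kubecost:9003")
    print(host, port, "resolves" if resolves(host) else "does NOT resolve")
```

If the name does not resolve from inside the pod, that would point at DNS for the `cost-analyzer` service rather than at a missing port.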

Steps to reproduce

  1. Fresh helm installation against an EKS cluster v1.31

Expected behavior

All pods become ready and the frontend is available on port 9090.

Impact

Cannot access the Kubecost frontend.

Screenshots

No response

Logs

No response

Slack discussion

No response

Troubleshooting

  • I have read and followed the issue guidelines and this is a bug impacting only the Helm chart.
  • I have searched other issues in this repository and mine is not recorded.
matt-tvg added the bug and needs-triage labels on Oct 31, 2024
@chipzoller
Collaborator

Would you please provide the values you used for installation?

@matt-tvg
Author

matt-tvg commented Oct 31, 2024

Hi,

Templated values are below :)

global:
  prometheus:
    enabled: false
    fqdn: http://prometheus-server.${prometheus_namespace}.svc
  grafana:
    enabled: false
    proxy: false

pricingCsv:
  enabled: false

nodeSelector:
    node-role: ${node_selector}
tolerations:
    - key: CriticalAddonsOnly
      operator: Equal
      value: "true"
      effect: NoSchedule

affinity: {}

# If true, creates a PriorityClass to be used by the cost-analyzer pod
priority:
  enabled: false
  # value: 1000000

# If true, enable creation of NetworkPolicy resources.
networkPolicy:
  enabled: false

podSecurityPolicy:
  enabled: false

kubecostFrontend:
  image: ${aws_account}.dkr.ecr.${aws_region}.amazonaws.com/ecr-public/kubecost/frontend
  imagePullPolicy: Always
  resources:
    requests:
      cpu: "10m"
      memory: "55Mi"
    #limits:
    #  cpu: "100m"
    #  memory: "256Mi"

forecasting:
  fullImageName: ${aws_account}.dkr.ecr.${aws_region}.amazonaws.com/ecr-public/kubecost/kubecost-modeling:v0.1.16
  imagePullPolicy: Always
  nodeSelector:
    node-role: ${node_selector}
  tolerations:
      - key: CriticalAddonsOnly
        operator: Equal
        value: "true"
        effect: NoSchedule

kubecostModel:
  image: ${aws_account}.dkr.ecr.${aws_region}.amazonaws.com/ecr-public/kubecost/cost-model
  imagePullPolicy: Always
  warmCache: true
  warmSavingsCache: true
  etl: true
  # The total number of days the ETL storage will build
  etlStoreDurationDays: 120
  maxQueryConcurrency: 5
  # utcOffset represents a timezone in hours and minutes east (+) or west (-)
  # of UTC, itself, which is defined as +00:00.
  # See the tz database of timezones to look up your local UTC offset:
  # https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
  utcOffset: "+00:00"
  resources:
    requests:
      cpu: "200m"
      memory: "55Mi"
    #limits:
    #  cpu: "800m"
    #  memory: "256Mi"


# Define persistence volume for cost-analyzer
persistentVolume:
  size: 0.2Gi
  dbSize: 32.0Gi
  enabled: true # Note that setting this to false means configurations will be wiped out on pod restart.

service:
  type: ClusterIP
  port: 9090
  targetPort: 9090
  labels: {}
  annotations: {}

reporting:
  productAnalytics: false

The image overrides are purely to make use of pull-through caching; the images are unaltered from source.

@chipzoller
Collaborator

Please confirm all your Pods are in a running state following installation with these values.

@matt-tvg
Author

All containers in the cost-analyzer pod are fine bar cost-analyzer-frontend, which crash-loops with the above error and causes the pod to report as CrashLoopBackOff.

The forecasting pod (1 container) is running fine.

@chipzoller
Collaborator

I just performed an installation on EKS 1.31 (Kubecost 2.4.2) using the defaults and the eks-specific Helm values with no issues, although this does deploy the bundled Prometheus instance.

helm upgrade -i kubecost \
oci://public.ecr.aws/kubecost/cost-analyzer \
--namespace kubecost --create-namespace \
-f https://raw.githubusercontent.com/kubecost/cost-analyzer-helm-chart/v2.4/cost-analyzer/values-eks-cost-monitoring.yaml

Are you able to try this temporarily to see that it works for you?

@chipzoller
Collaborator

Did this work for you?
