The OpenEBS pod openebs-etcd keeps restarting; the logs report the error "Headless service domain does not have an IP per initial member in the cluster" #3757
-
helm repo add openebs https://openebs.github.io/openebs
-
Could we get the logs from openebs-etcd-xx?
-
OpenEBS etcd runs as a StatefulSet behind a headless service. A headless service has no cluster IP; instead, DNS returns the individual IP addresses of the pods that back the service. This is typical for StatefulSets and other applications where you need to reach individual pods directly.

The error "Headless service domain does not have an IP per initial member in the cluster" usually means that the DNS records for the pods are not being created or resolved correctly. Make sure the DNS service in your Kubernetes cluster is working. You can test DNS resolution by starting a temporary pod and running nslookup from it, for example:

kubectl get pods -n mayastor -o wide | grep mayastor-etcd-0
root@master-velero:~/mayastor-extensions# kubectl run -i --tty --rm debug --image=busybox --restart=Never -- sh

Inside that shell, an nslookup of the pod's headless-service name should return a record such as:

Name: mayastor-etcd-0.mayastor-etcd-headless.mayastor.svc.cluster.local
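To check every etcd member rather than just one, the per-pod names can be generated instead of typed by hand. The following is a minimal sketch, assuming the usual StatefulSet naming pattern `<pod>.<headless-service>.<namespace>.svc.cluster.local` and the default `cluster.local` cluster domain; the function name `etcd_member_hosts` is made up for illustration.

```shell
# Print the DNS name each StatefulSet pod should have behind a headless service.
# Arguments: StatefulSet name, headless service name, namespace, replica count.
# Assumes the default cluster domain "cluster.local".
etcd_member_hosts() {
  sts="$1"
  svc="$2"
  ns="$3"
  replicas="$4"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    echo "${sts}-${i}.${svc}.${ns}.svc.cluster.local"
    i=$((i + 1))
  done
}
```

From inside the debug pod you could then loop over the result, for example `for h in $(etcd_member_hosts openebs-etcd openebs-etcd-headless openebs 3); do nslookup "$h"; done`, and look for any name that fails to resolve.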
-
We suggest checking the DNS configuration within the cluster, as this is not related to OpenEBS. etcd is crashing because the DNS records for the pods are not being created or resolved correctly.
-
[root@k8s-master ~]# kubectl get node
NAME           STATUS   ROLES           AGE    VERSION
k8s-master     Ready    control-plane   103d   v1.29.3
k8s-worker01   Ready    <none>          103d   v1.29.3
k8s-worker02   Ready    <none>          103d   v1.29.3
[root@k8s-master ~]# kubectl get pod -n openebs
NAME READY STATUS RESTARTS AGE
openebs-agent-core-7454f6cc79-cp5js 0/2 Init:0/1 0 5m53s
openebs-agent-ha-node-8x7zc 0/1 Init:0/1 0 5m55s
openebs-agent-ha-node-r8qwg 0/1 Init:0/1 0 5m55s
openebs-agent-ha-node-zx7qr 0/1 Init:0/1 0 5m55s
openebs-api-rest-5b44d6665c-gzxgm 0/1 Init:0/2 0 5m53s
openebs-csi-controller-5d9cbbbcd7-jzk7k 0/6 Init:0/1 0 5m53s
openebs-csi-node-djj9f 0/2 Init:0/1 0 5m55s
openebs-csi-node-rvxtc 0/2 Init:0/1 0 5m55s
openebs-csi-node-sncf8 0/2 Init:0/1 0 5m55s
openebs-etcd-0 0/1 Running 4 (68s ago) 5m52s
openebs-etcd-1 0/1 Running 4 (50s ago) 5m51s
openebs-etcd-2 0/1 Pending 0 5m51s
openebs-localpv-provisioner-55bf478db6-lf9w4 1/1 Running 0 5m54s
openebs-loki-0 1/1 Running 0 5m51s
openebs-lvm-localpv-controller-668c75f94f-mbjpf 5/5 Running 0 5m54s
openebs-lvm-localpv-node-7pjxs 2/2 Running 0 5m55s
openebs-lvm-localpv-node-lc89n 2/2 Running 0 5m55s
openebs-lvm-localpv-node-p9fn5 2/2 Running 0 5m55s
openebs-nats-0 3/3 Running 0 5m51s
openebs-nats-1 3/3 Running 0 5m51s
openebs-nats-2 3/3 Running 0 5m51s
openebs-obs-callhome-7d7d5799d6-njklv 2/2 Running 0 5m52s
openebs-operator-diskpool-f755cbd4b-lsxwl 0/1 Init:0/2 0 5m52s
openebs-promtail-cqgf8 1/1 Running 0 5m55s
openebs-promtail-hl5r9 1/1 Running 0 5m55s
openebs-promtail-l9xs6 1/1 Running 0 5m55s
openebs-zfs-localpv-controller-65d698cfcc-r55cg 5/5 Running 0 5m52s
openebs-zfs-localpv-node-5wq4z 2/2 Running 0 5m53s
openebs-zfs-localpv-node-brk6z 2/2 Running 0 5m53s
openebs-zfs-localpv-node-mskc6 2/2 Running 0 5m53s
Name: openebs-etcd-1
Namespace: openebs
Priority: 0
Service Account: default
Node: k8s-master/192.168.90.201
Start Time: Fri, 19 Jul 2024 16:51:29 +0800
Labels: app=etcd
app.kubernetes.io/instance=openebs
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=etcd
apps.kubernetes.io/pod-index=1
controller-revision-hash=openebs-etcd-85f7cdcd4d
helm.sh/chart=etcd-8.6.0
openebs.io/logging=true
statefulset.kubernetes.io/pod-name=openebs-etcd-1
Annotations: checksum/token-secret: 6aa39ecf19d61005e6126ef683eec1f5b5892336bb13de034c77611fff4a1607
cni.projectcalico.org/containerID: e2ce5e5e5dfd69ad3fa3b1311c33f26ec63a13559e44d44cabb8871cca6afdf4
cni.projectcalico.org/podIP: 10.244.235.205/32
cni.projectcalico.org/podIPs: 10.244.235.205/32
Status: Running
IP: 10.244.235.205
IPs:
IP: 10.244.235.205
Controlled By: StatefulSet/openebs-etcd
Init Containers:
volume-permissions:
Container ID: docker://5134f01a05d3a591d2ba127e24283cf0d75fd1b1416df4f905925312530ee628
Image: docker.io/bitnami/bitnami-shell:11-debian-11-r63
Image ID: docker-pullable://bitnami/bitnami-shell@sha256:b01a7fbc9f294c4ba1b911a947f48766834980c1247fa87d65dea8e0eeb819c3
Port: <none>
Host Port: <none>
Command:
/bin/bash
-ec
chown -R 1001:1001 /bitnami/etcd
Containers:
etcd:
Container ID: docker://e1fe0a23705c3d52277e0ec0564285ec7c9ba48e916cc285e589bd0aa09f2665
Image: docker.io/bitnami/etcd:3.5.6-debian-11-r10
Image ID: docker-pullable://bitnami/etcd@sha256:2d7b831769734bb97a5c1cfd2fe46e29f422b70b5ba9f9aedfd91300839ac3ee
Ports: 2379/TCP, 2380/TCP
Host Ports: 0/TCP, 0/TCP
State: Running
Started: Fri, 19 Jul 2024 16:57:15 +0800
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Fri, 19 Jul 2024 16:55:29 +0800
Finished: Fri, 19 Jul 2024 16:56:30 +0800
Ready: False
Restart Count: 4
Liveness: exec [/opt/bitnami/scripts/etcd/healthcheck.sh] delay=60s timeout=5s period=30s #success=1 #failure=5
Readiness: exec [/opt/bitnami/scripts/etcd/healthcheck.sh] delay=60s timeout=5s period=10s #success=1 #failure=5
Environment:
BITNAMI_DEBUG: false
MY_POD_IP: (v1:status.podIP)
MY_POD_NAME: openebs-etcd-1 (v1:metadata.name)
MY_STS_NAME: openebs-etcd
ETCDCTL_API: 3
ETCD_ON_K8S: yes
ETCD_START_FROM_SNAPSHOT: no
ETCD_DISASTER_RECOVERY: no
ETCD_NAME: $(MY_POD_NAME)
ETCD_DATA_DIR: /bitnami/etcd/data
ETCD_LOG_LEVEL: info
ALLOW_NONE_AUTHENTICATION: yes
ETCD_AUTH_TOKEN: jwt,priv-key=/opt/bitnami/etcd/certs/token/jwt-token.pem,sign-method=RS256,ttl=10m
ETCD_ADVERTISE_CLIENT_URLS: http://$(MY_POD_NAME).openebs-etcd-headless.openebs.svc.cluster.local:2379,http://openebs-etcd.openebs.svc.cluster.local:2379
ETCD_LISTEN_CLIENT_URLS: http://0.0.0.0:2379
ETCD_INITIAL_ADVERTISE_PEER_URLS: http://$(MY_POD_NAME).openebs-etcd-headless.openebs.svc.cluster.local:2380
ETCD_LISTEN_PEER_URLS: http://0.0.0.0:2380
ETCD_AUTO_COMPACTION_MODE: revision
ETCD_AUTO_COMPACTION_RETENTION: 100
ETCD_INITIAL_CLUSTER_TOKEN: etcd-cluster-k8s
ETCD_INITIAL_CLUSTER_STATE: new
ETCD_INITIAL_CLUSTER: openebs-etcd-0=http://openebs-etcd-0.openebs-etcd-headless.openebs.svc.cluster.local:2380,openebs-etcd-1=http://openebs-etcd-1.openebs-etcd-headless.openebs.svc.cluster.local:2380,openebs-etcd-2=http://openebs-etcd-2.openebs-etcd-headless.openebs.svc.cluster.local:2380
ETCD_CLUSTER_DOMAIN: openebs-etcd-headless.openebs.svc.cluster.local
ETCD_QUOTA_BACKEND_BYTES: 8589934592
Mounts:
/bitnami/etcd from data (rw)
/opt/bitnami/etcd/certs/token/ from etcd-jwt-token (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-m7xdk (ro)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-openebs-etcd-1
ReadOnly: false
etcd-jwt-token:
Type: Secret (a volume populated by a Secret)
SecretName: openebs-etcd-jwt-token
Optional: false
kube-api-access-m7xdk:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
Normal Scheduled 6m38s default-scheduler Successfully assigned openebs/openebs-etcd-1 to k8s-master
Normal Pulled 6m29s kubelet Container image "docker.io/bitnami/bitnami-shell:11-debian-11-r63" already present on machine
Normal Created 6m29s kubelet Created container volume-permissions
Normal Started 6m28s kubelet Started container volume-permissions
Warning Unhealthy 3m7s kubelet Readiness probe errored: rpc error: code = Unknown desc = container not running (659981711e30501c301734b07da5e714575ae969279a06ca2fe5bacdf27f527e)
Warning Unhealthy 3m7s kubelet Liveness probe errored: rpc error: code = Unknown desc = container not running (659981711e30501c301734b07da5e714575ae969279a06ca2fe5bacdf27f527e)
Normal Pulled 2m39s (x4 over 6m26s) kubelet Container image "docker.io/bitnami/etcd:3.5.6-debian-11-r10" already present on machine
Normal Created 2m38s (x4 over 6m26s) kubelet Created container etcd
Normal Started 2m38s (x4 over 6m25s) kubelet Started container etcd
Warning Unhealthy 97s kubelet Readiness probe errored: rpc error: code = Unknown desc = container not running (6100a1642fc50aeee8b3148ddd2dc76ad710bff1b816290a196c6af403fcfbe9)
Warning Unhealthy 97s kubelet Liveness probe errored: rpc error: code = Unknown desc = container not running (6100a1642fc50aeee8b3148ddd2dc76ad710bff1b816290a196c6af403fcfbe9)
Warning BackOff 67s (x7 over 4m21s) kubelet Back-off restarting failed container etcd in pod openebs-etcd-1_openebs(72eb4986-0d12-4f45-9239-a8f747f369bc)
[root@k8s-master ~]# kubectl logs openebs-etcd-1 -c etcd -n openebs
etcd 08:57:15.36
etcd 08:57:15.37 Welcome to the Bitnami etcd container
etcd 08:57:15.37 Subscribe to project updates by watching https://github.com/bitnami/containers
etcd 08:57:15.39 Submit issues and feature requests at https://github.com/bitnami/containers/issues
etcd 08:57:15.39
etcd 08:57:15.39 INFO ==> ** Starting etcd setup **
etcd 08:57:15.41 INFO ==> Validating settings in ETCD_* env vars..
etcd 08:57:15.42 WARN ==> You set the environment variable ALLOW_NONE_AUTHENTICATION=yes. For safety reasons, do not use this flag in a production environment.
etcd 08:57:15.49 INFO ==> Initializing etcd
etcd 08:57:15.49 INFO ==> Generating etcd config file using env variables
etcd 08:57:15.63 INFO ==> There is no data from previous deployments
etcd 08:57:15.63 INFO ==> Bootstrapping a new cluster
etcd 08:58:16.23 ERROR ==> Headless service domain does not have an IP per initial member in the cluster
I installed OpenEBS with Helm following the official website, but the etcd pods just keep restarting. Can anyone help? Thank you.
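One thing that may be worth checking: in the pod listing above, openebs-etcd-2 is Pending. A Pending pod typically has no IP yet, so the headless service may only be able to return two addresses while ETCD_INITIAL_CLUSTER lists three members. The error text suggests a comparison of that kind, although the exact check in the Bitnami startup script is not shown in this thread. The sketch below pulls the member count and hostnames out of an ETCD_INITIAL_CLUSTER-style string so each one can be looked up; the function names are hypothetical.

```shell
# Count the members in an ETCD_INITIAL_CLUSTER-style string
# (comma-separated name=url pairs).
count_initial_members() {
  printf '%s\n' "$1" | tr ',' '\n' | grep -c '='
}

# Print the peer hostname of each member, one per line:
# strip the "name=" prefix, the URL scheme, and the trailing port.
initial_member_hosts() {
  printf '%s\n' "$1" | tr ',' '\n' |
    sed -e 's|^[^=]*=||' -e 's|^[a-z]*://||' -e 's|:[0-9]*$||'
}
```

Run each hostname printed by `initial_member_hosts "$ETCD_INITIAL_CLUSTER"` through nslookup from a debug pod; any member that does not resolve is a candidate cause. `kubectl describe pod openebs-etcd-2 -n openebs` should show in its Events why that pod is still Pending.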