[BUG] KubeVirt-deployed VM couldn't be restarted when network is disconnected #1400
@gnunu would you be able to upload the detailed logs of the yurthub and kubelet components? |
Details will be uploaded a little later by my colleagues. |
Two-node cluster: one control-plane (cloud) node and one worker (edge) node.
Version of the key components (all nodes are the same):
box@joez-hce-ub20-vm-virt-m:~$ cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.5 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.5 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
box@joez-hce-ub20-vm-virt-m:~$ uname -a
Linux joez-hce-ub20-vm-virt-m 5.4.0-147-generic #164-Ubuntu SMP Tue Mar 21 14:23:17 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
box@joez-hce-ub20-vm-virt-m:~$ kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.0", GitCommit:"ab69524f795c42094a6630298ff53f3c3ebab7f4", GitTreeState:"clean", BuildDate:"2021-12-07T18:16:20Z", GoVersion:"go1.17.3", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.0", GitCommit:"ab69524f795c42094a6630298ff53f3c3ebab7f4", GitTreeState:"clean", BuildDate:"2021-12-07T18:09:57Z", GoVersion:"go1.17.3", Compiler:"gc", Platform:"linux/amd64"}
box@joez-hce-ub20-vm-virt-m:~$ docker version
Client: Docker Engine - Community
Version: 23.0.4
API version: 1.42
Go version: go1.19.8
Git commit: f480fb1
Built: Fri Apr 14 10:32:23 2023
OS/Arch: linux/amd64
Context: default
Server: Docker Engine - Community
Engine:
Version: 23.0.4
API version: 1.42 (minimum version 1.12)
Go version: go1.19.8
Git commit: cbce331
Built: Fri Apr 14 10:32:23 2023
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.20
GitCommit: 2806fc1057397dbaeefbea0e4e17bddfbd388f38
runc:
Version: 1.1.5
GitCommit: v1.1.5-0-gf19387a
docker-init:
Version: 0.19.0
GitCommit: de40ad0
box@joez-hce-ub20-vm-virt-m:~$ virtctl version
Client Version: version.Info{GitVersion:"v0.58.0", GitCommit:"6e41ae7787c1b48ac9a633c61a54444ea947242c", GitTreeState:"clean", BuildDate:"2022-10-13T00:33:22Z", GoVersion:"go1.17.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{GitVersion:"v0.58.0", GitCommit:"6e41ae7787c1b48ac9a633c61a54444ea947242c", GitTreeState:"clean", BuildDate:"2022-10-13T00:33:22Z", GoVersion:"go1.17.8", Compiler:"gc", Platform:"linux/amd64"}
Before shutting down the cloud node and rebooting the edge node, everything works fine:
box@joez-hce-ub20-vm-virt-m:~$ kubectl get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
default nginx-85b98978db-hvn2z 1/1 Running 1 (20m ago) 68m
default virt-launcher-testvm-7hxv5 2/2 Running 0 34s
kube-flannel kube-flannel-ds-dkxk2 1/1 Running 1 (20m ago) 28h
kube-flannel kube-flannel-ds-xh79b 1/1 Running 1 (20m ago) 28h
kube-system coredns-6d8c4cb4d-67l78 1/1 Running 1 (20m ago) 28h
kube-system coredns-6d8c4cb4d-mdhrt 1/1 Running 1 (7m6s ago) 28h
kube-system etcd-joez-hce-ub20-vm-virt-m 1/1 Running 1 (7m6s ago) 28h
kube-system kube-apiserver-joez-hce-ub20-vm-virt-m 1/1 Running 2 (20m ago) 61m
kube-system kube-controller-manager-joez-hce-ub20-vm-virt-m 1/1 Running 2 (20m ago) 28h
kube-system kube-proxy-jph4x 1/1 Running 1 (20m ago) 28h
kube-system kube-proxy-sqqck 1/1 Running 1 (20m ago) 28h
kube-system kube-scheduler-joez-hce-ub20-vm-virt-m 1/1 Running 2 (20m ago) 28h
kube-system yurt-app-manager-b8677d956-4b9pf 1/1 Running 6 (20m ago) 27h
kube-system yurt-controller-manager-7787f67564-jmjcb 1/1 Running 2 (7m6s ago) 3h3m
kube-system yurt-hub-joez-hce-ub20-vm-virt-w 1/1 Running 1 (20m ago) 143m
kubevirt virt-api-69d978dd67-rp8np 1/1 Running 1 (20m ago) 37m
kubevirt virt-api-69d978dd67-t4552 1/1 Running 1 (20m ago) 37m
kubevirt virt-controller-695cc98c56-fkzsx 1/1 Running 1 (7m6s ago) 37m
kubevirt virt-controller-695cc98c56-j4wxv 1/1 Running 1 (20m ago) 37m
kubevirt virt-handler-q5sqh 1/1 Running 1 (20m ago) 37m
kubevirt virt-handler-wdtp4 1/1 Running 1 (7m6s ago) 37m
kubevirt virt-operator-58cb8475bb-6mswb 1/1 Running 1 (20m ago) 38m
kubevirt virt-operator-58cb8475bb-t74df 1/1 Running 1 (20m ago) 38m
box@joez-hce-ub20-vm-virt-w:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
149fa086b94e quay.io/kubevirt/cirros-container-disk-demo "/usr/bin/container-…" About a minute ago Up About a minute k8s_volumecontainerdisk_virt-launcher-testvm-7hxv5_default_1d656c94-1395-4deb-89f8-0f844d989e52_0
cfdbd721ece4 a3a2b8b0c675 "/usr/bin/virt-launc…" About a minute ago Up About a minute k8s_compute_virt-launcher-testvm-7hxv5_default_1d656c94-1395-4deb-89f8-0f844d989e52_0
0b81204195b8 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" About a minute ago Up About a minute k8s_POD_virt-launcher-testvm-7hxv5_default_1d656c94-1395-4deb-89f8-0f844d989e52_0
85951e033c26 nginx "/docker-entrypoint.…" 7 minutes ago Up 7 minutes k8s_nginx_nginx-85b98978db-hvn2z_default_b91fe8a0-253e-42e4-843f-199967d87a9d_1
d035475e97e4 c407633b131b "virt-handler --port…" 7 minutes ago Up 7 minutes k8s_virt-handler_virt-handler-q5sqh_kubevirt_e2d38ee6-80f3-4360-8321-cbf3b40d1985_1
ed2063dcca41 f76a3af5e135 "virt-controller --l…" 7 minutes ago Up 7 minutes k8s_virt-controller_virt-controller-695cc98c56-j4wxv_kubevirt_1021a9cd-e233-470e-8ca3-4979315c31a4_1
90ec4ddecc9f registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 7 minutes ago Up 7 minutes k8s_POD_virt-controller-695cc98c56-j4wxv_kubevirt_1021a9cd-e233-470e-8ca3-4979315c31a4_4
fdd690ab00f0 a7186007b4a9 "/usr/local/bin/yurt…" 7 minutes ago Up 7 minutes k8s_yurt-app-manager_yurt-app-manager-b8677d956-4b9pf_kube-system_8a4afc07-7d42-4e64-b0b9-27b344eec936_6
7d2562e58b79 e05304a0fbaf "virt-operator --por…" 7 minutes ago Up 7 minutes k8s_virt-operator_virt-operator-58cb8475bb-6mswb_kubevirt_46eb2665-d9b0-4c51-9827-f87ac1ab8985_1
69852e3414e7 943b496a674d "virt-api --port 844…" 7 minutes ago Up 7 minutes k8s_virt-api_virt-api-69d978dd67-rp8np_kubevirt_634e5490-d783-4fdb-ba11-3bf1558b37ae_1
c77366a0ac41 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 7 minutes ago Up 7 minutes k8s_POD_yurt-app-manager-b8677d956-4b9pf_kube-system_8a4afc07-7d42-4e64-b0b9-27b344eec936_3
fbaedb517e8b registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 7 minutes ago Up 7 minutes k8s_POD_virt-handler-q5sqh_kubevirt_e2d38ee6-80f3-4360-8321-cbf3b40d1985_4
3ebdbf1da63c registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 7 minutes ago Up 7 minutes k8s_POD_virt-operator-58cb8475bb-6mswb_kubevirt_46eb2665-d9b0-4c51-9827-f87ac1ab8985_4
6009f65b3444 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 7 minutes ago Up 7 minutes k8s_POD_nginx-85b98978db-hvn2z_default_b91fe8a0-253e-42e4-843f-199967d87a9d_3
bb946f591b78 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 7 minutes ago Up 7 minutes k8s_POD_virt-api-69d978dd67-rp8np_kubevirt_634e5490-d783-4fdb-ba11-3bf1558b37ae_3
702909f7a174 11ae74319a21 "/opt/bin/flanneld -…" 7 minutes ago Up 7 minutes k8s_kube-flannel_kube-flannel-ds-dkxk2_kube-flannel_37eb9f49-5338-4fb6-bd97-563d0ff098be_1
3a8b0a92ff21 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 7 minutes ago Up 7 minutes k8s_POD_kube-flannel-ds-dkxk2_kube-flannel_37eb9f49-5338-4fb6-bd97-563d0ff098be_1
9c92d6fdedde e03484a90585 "/usr/local/bin/kube…" 7 minutes ago Up 7 minutes k8s_kube-proxy_kube-proxy-sqqck_kube-system_90fecc3f-31b1-4ba6-a825-1c0fa2db64d6_1
7acff52536ea registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 7 minutes ago Up 7 minutes k8s_POD_kube-proxy-sqqck_kube-system_90fecc3f-31b1-4ba6-a825-1c0fa2db64d6_1
b0b78516b422 f4fba699ab86 "yurthub --v=2 --ser…" 20 minutes ago Up 20 minutes k8s_yurt-hub_yurt-hub-joez-hce-ub20-vm-virt-w_kube-system_21482483ffe45101b48a34a036517322_1
5d79eea5b086 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 20 minutes ago Up 20 minutes k8s_POD_yurt-hub-joez-hce-ub20-vm-virt-w_kube-system_21482483ffe45101b48a34a036517322_1
Then, shut down the cloud node:
box@joez-hce-ub20-vm-virt-m:~$ sudo shutdown now
Connection to joez-hce-ub20-vm-virt-m closed by remote host.
Connection to joez-hce-ub20-vm-virt-m closed.
After waiting more than 1 minute, both the nginx and KubeVirt VM workloads are still running on the edge node:
box@joez-hce-ub20-vm-virt-w:~$ ps -ef | grep qemu
root 12371 12346 0 12:12 ? 00:00:00 /usr/bin/virt-launcher-monitor --qemu-timeout 241s --name testvm --uid e67efffd-d2d1-464a-8d3f-9ae347bd9c60 --namespace default --kubevirt-share-dir /var/run/kubevirt --ephemeral-disk-dir /var/run/kubevirt-ephemeral-disks --container-disk-dir /var/run/kubevirt/container-disks --grace-period-seconds 45 --hook-sidecars 0 --ovmf-path /usr/share/OVMF --keep-after-failure
root 12390 12371 0 12:12 ? 00:00:00 /usr/bin/virt-launcher --qemu-timeout 241s --name testvm --uid e67efffd-d2d1-464a-8d3f-9ae347bd9c60 --namespace default --kubevirt-share-dir /var/run/kubevirt --ephemeral-disk-dir /var/run/kubevirt-ephemeral-disks --container-disk-dir /var/run/kubevirt/container-disks --grace-period-seconds 45 --hook-sidecars 0 --ovmf-path /usr/share/OVMF
uuidd 12635 12371 3 12:12 ? 00:00:10 /usr/libexec/qemu-kvm -name guest=default_testvm,debug-threads=on -S -object {"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-1-default_testvm/master-key.aes"}
box@joez-hce-ub20-vm-virt-w:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
0ba2f3224321 f76a3af5e135 "virt-controller --l…" 8 seconds ago Up 7 seconds k8s_virt-controller_virt-controller-695cc98c56-j4wxv_kubevirt_1021a9cd-e233-470e-8ca3-4979315c31a4_3
981719c7d9ae c407633b131b "virt-handler --port…" 15 seconds ago Up 14 seconds k8s_virt-handler_virt-handler-q5sqh_kubevirt_e2d38ee6-80f3-4360-8321-cbf3b40d1985_2
149fa086b94e quay.io/kubevirt/cirros-container-disk-demo "/usr/bin/container-…" 6 minutes ago Up 6 minutes k8s_volumecontainerdisk_virt-launcher-testvm-7hxv5_default_1d656c94-1395-4deb-89f8-0f844d989e52_0
cfdbd721ece4 a3a2b8b0c675 "/usr/bin/virt-launc…" 6 minutes ago Up 6 minutes k8s_compute_virt-launcher-testvm-7hxv5_default_1d656c94-1395-4deb-89f8-0f844d989e52_0
0b81204195b8 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 6 minutes ago Up 6 minutes k8s_POD_virt-launcher-testvm-7hxv5_default_1d656c94-1395-4deb-89f8-0f844d989e52_0
85951e033c26 nginx "/docker-entrypoint.…" 12 minutes ago Up 12 minutes k8s_nginx_nginx-85b98978db-hvn2z_default_b91fe8a0-253e-42e4-843f-199967d87a9d_1
90ec4ddecc9f registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 12 minutes ago Up 12 minutes k8s_POD_virt-controller-695cc98c56-j4wxv_kubevirt_1021a9cd-e233-470e-8ca3-4979315c31a4_4
69852e3414e7 943b496a674d "virt-api --port 844…" 12 minutes ago Up 12 minutes k8s_virt-api_virt-api-69d978dd67-rp8np_kubevirt_634e5490-d783-4fdb-ba11-3bf1558b37ae_1
c77366a0ac41 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 12 minutes ago Up 12 minutes k8s_POD_yurt-app-manager-b8677d956-4b9pf_kube-system_8a4afc07-7d42-4e64-b0b9-27b344eec936_3
fbaedb517e8b registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 12 minutes ago Up 12 minutes k8s_POD_virt-handler-q5sqh_kubevirt_e2d38ee6-80f3-4360-8321-cbf3b40d1985_4
3ebdbf1da63c registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 12 minutes ago Up 12 minutes k8s_POD_virt-operator-58cb8475bb-6mswb_kubevirt_46eb2665-d9b0-4c51-9827-f87ac1ab8985_4
6009f65b3444 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 12 minutes ago Up 12 minutes k8s_POD_nginx-85b98978db-hvn2z_default_b91fe8a0-253e-42e4-843f-199967d87a9d_3
bb946f591b78 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 12 minutes ago Up 12 minutes k8s_POD_virt-api-69d978dd67-rp8np_kubevirt_634e5490-d783-4fdb-ba11-3bf1558b37ae_3
702909f7a174 11ae74319a21 "/opt/bin/flanneld -…" 12 minutes ago Up 12 minutes k8s_kube-flannel_kube-flannel-ds-dkxk2_kube-flannel_37eb9f49-5338-4fb6-bd97-563d0ff098be_1
3a8b0a92ff21 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 12 minutes ago Up 12 minutes k8s_POD_kube-flannel-ds-dkxk2_kube-flannel_37eb9f49-5338-4fb6-bd97-563d0ff098be_1
9c92d6fdedde e03484a90585 "/usr/local/bin/kube…" 12 minutes ago Up 12 minutes k8s_kube-proxy_kube-proxy-sqqck_kube-system_90fecc3f-31b1-4ba6-a825-1c0fa2db64d6_1
7acff52536ea registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 12 minutes ago Up 12 minutes k8s_POD_kube-proxy-sqqck_kube-system_90fecc3f-31b1-4ba6-a825-1c0fa2db64d6_1
b0b78516b422 f4fba699ab86 "yurthub --v=2 --ser…" 25 minutes ago Up 25 minutes k8s_yurt-hub_yurt-hub-joez-hce-ub20-vm-virt-w_kube-system_21482483ffe45101b48a34a036517322_1
5d79eea5b086 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 25 minutes ago Up 25 minutes k8s_POD_yurt-hub-joez-hce-ub20-vm-virt-w_kube-system_21482483ffe45101b48a34a036517322_1
Now, restart the edge node while keeping the cloud node down:
box@joez-hce-ub20-vm-virt-w:~$ sudo reboot
[sudo] password for box:
Connection to 10.67.108.242 closed by remote host.
Connection to 10.67.108.242 closed.
After the reboot, neither the nginx pod nor the KubeVirt VM is launched:
box@joez-hce-ub20-vm-virt-w:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
9a5b19370a11 f4fba699ab86 "yurthub --v=2 --ser…" 13 minutes ago Up 13 minutes k8s_yurt-hub_yurt-hub-joez-hce-ub20-vm-virt-w_kube-system_21482483ffe45101b48a34a036517322_2
cea81f7a1578 registry.aliyuncs.com/google_containers/pause:3.6 "/pause" 13 minutes ago Up 13 minutes k8s_POD_yurt-hub-joez-hce-ub20-vm-virt-w_kube-system_21482483ffe45101b48a34a036517322_2 |
Here are my steps to set up the OpenYurt cluster and deploy KubeVirt:
Label nodes and activate node autonomy:
cloud_node=$(kubectl get node -l node-role.kubernetes.io/master -o name | sed -e s:node/::)
edge_node=$(kubectl get node -o name | grep -v $cloud_node | sed -e s:node/::)
kubectl label node $cloud_node openyurt.io/is-edge-worker=false
kubectl label node $edge_node openyurt.io/is-edge-worker=true
kubectl annotate node $edge_node node.beta.openyurt.io/autonomy=true
Deploy control-plane components on the cloud node:
helm repo add openyurt https://openyurtio.github.io/openyurt-helm
# deploy yurt-app-manager first
helm upgrade --install -n kube-system yurt-app-manager openyurt/yurt-app-manager
# then yurt-controller-manager
helm upgrade --install -n kube-system --version 1.2.0 openyurt openyurt/openyurt
# check the result
helm list -A
# openyurt-1.2.0 1.2.0
# yurt-app-manager-0.1.3 0.6.0
kubectl get po -A | grep yurt
Set up yurthub on the edge node:
# find your kube-apiserver and token
kube_api=10.67.108.194:6443
token=0ide56.gzkntj0zwbh2qhfe
# deploy yurthub
curl -LO https://raw.githubusercontent.com/openyurtio/openyurt/release-v1.2/config/setup/yurthub.yaml
sed "s/__kubernetes_master_address__/$kube_api/;s/__bootstrap_token__/$token/" yurthub.yaml | sudo tee /etc/kubernetes/manifests/yurthub.yaml
# create kubeconfig
sudo mkdir -p /var/lib/openyurt
cat << EOF | sudo tee /var/lib/openyurt/kubelet.conf
apiVersion: v1
clusters:
- cluster:
server: http://127.0.0.1:10261
name: default-cluster
contexts:
- context:
cluster: default-cluster
namespace: default
user: default-auth
name: default-context
current-context: default-context
kind: Config
preferences: {}
EOF
# let kubelet to use the new kubeconfig
sudo sed -i.bak 's#KUBELET_KUBECONFIG_ARGS=.*"#KUBELET_KUBECONFIG_ARGS=--kubeconfig=/var/lib/openyurt/kubelet.conf"#g' /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
# restart kubelet
sudo systemctl daemon-reload
sudo systemctl restart kubelet
# check status
sudo systemctl status kubelet
Deploy KubeVirt:
VERSION=v0.58.0
kubectl create -f https://github.com/kubevirt/kubevirt/releases/download/${VERSION}/kubevirt-operator.yaml
kubectl create -f https://github.com/kubevirt/kubevirt/releases/download/${VERSION}/kubevirt-cr.yaml
# wait until kubevirt.kubevirt.io/kubevirt is deployed
kubectl get -n kubevirt kv/kubevirt -w
Deploy virtctl:
VERSION=$(kubectl get kubevirt.kubevirt.io/kubevirt -n kubevirt -o=jsonpath="{.status.observedKubeVirtVersion}")
ARCH=$(uname -s | tr A-Z a-z)-$(uname -m | sed 's/x86_64/amd64/')
curl -L -o virtctl https://github.com/kubevirt/kubevirt/releases/download/${VERSION}/virtctl-${VERSION}-${ARCH}
chmod +x virtctl
sudo install virtctl /usr/local/bin
Deploy a VM for test:
kubectl apply -f https://kubevirt.io/labs/manifests/vm.yaml
kubectl get vms
# start VM
virtctl start testvm
# check status
kubectl get vmis
# access console
virtctl console testvm |
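One note on the kubelet step above: the `sed` edit of 10-kubeadm.conf can be previewed on a sample line before touching the real file. The sample Environment line below is a typical kubeadm default, assumed for illustration rather than copied from this node:

```shell
# Preview of the kubelet kubeconfig substitution on a sample line.
# The sample Environment line is a typical kubeadm default (an assumption,
# not copied from this node's 10-kubeadm.conf).
line='Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"'
echo "$line" | sed 's#KUBELET_KUBECONFIG_ARGS=.*"#KUBELET_KUBECONFIG_ARGS=--kubeconfig=/var/lib/openyurt/kubelet.conf"#g'
```

The same expression is what the `sudo sed -i.bak` command above applies in place; running it on a sample first shows exactly what will change.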
After booting up the cloud node again, all the pods are started again on the edge node.
|
@rambohe-ch @joez in this case, the master node is shut down; I am not sure whether that case is fully considered in OpenYurt. When the master is down, is yurthub still healthy enough for kubelet? |
Yes, we expect that pods on the edge can recover even when the master is down. @gnunu First, I have to say that OpenYurt + KubeVirt has not been tested yet. From my perspective, yurthub provides an edge-local cache for generic usage, and in theory it can support the recovery of KubeVirt. Yurthub does not cache resources for all edge components, and in this case I think the KubeVirt-related resources were not cached. You may check the cache-agent configmap to see whether you have enabled yurthub to cache for KubeVirt.
However, it seems that the OpenYurt cluster was in an abnormal situation. We expect the kubelet cache to contain pods, configmaps, and other resources that enable pod recovery when the master has shut down. It should look like the following:
Maybe something is wrong in yurthub. Could you check the log of yurthub on the worker node while the master is running? It should cache these resources from the master when everything is OK. @joez |
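To make the cache-agent check above concrete: a hypothetical entry adding virt-handler to the cached components might look like the fragment below. The configmap name matches the yurt-hub-cfg shown later in this thread; the cache_agents key and the "*" wildcard behavior are assumptions based on yurthub's documentation and may differ by version.

```yaml
# Hypothetical yurt-hub-cfg adding virt-handler to the cached components.
apiVersion: v1
kind: ConfigMap
metadata:
  name: yurt-hub-cfg
  namespace: kube-system
data:
  cache_agents: "virt-handler"   # or "*" to cache for every component
```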
@Congrool let us check the container workload (nginx) first, and the KubeVirt VM as the next step.
Here is the output when the master node is connected:
The log of yurthub: yurthub-normal.txt
Current yurt-hub-cfg:
|
After enabling cache for all edge components by setting
Then disconnect the edge node from the cloud node by applying iptables rules on the cloud node:
box@joez-hce-ub20-vm-virt-m:~$ kubectl get no -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
joez-hce-ub20-vm-virt-m Ready control-plane,master 3d15h v1.23.0 10.67.108.194 <none> Ubuntu 20.04.5 LTS 5.4.0-147-generic docker://23.0.4
joez-hce-ub20-vm-virt-w Ready <none> 3d15h v1.23.0 10.67.108.242 <none> Ubuntu 20.04.5 LTS 5.4.0-147-generic docker://23.0.4
box@joez-hce-ub20-vm-virt-m:~$ sudo iptables -I OUTPUT -d 10.67.108.242 -j DROP
box@joez-hce-ub20-vm-virt-m:~$ kubectl get node
NAME STATUS ROLES AGE VERSION
joez-hce-ub20-vm-virt-m Ready control-plane,master 3d15h v1.23.0
joez-hce-ub20-vm-virt-w NotReady <none> 3d15h v1.23.0
After rebooting the edge node, more pods are launched, but most of them exit immediately.
Flannel fails to start:
Can't connect to the apiserver service IP via kube-proxy:
box@joez-hce-ub20-vm-virt-w:/etc/kubernetes/cache$ nc -zv 10.96.0.1 443
nc: connect to 10.96.0.1 port 443 (tcp) failed: Connection refused
box@joez-hce-ub20-vm-virt-w:/etc/kubernetes/cache$ sudo iptables-save | grep -w 10.96.0.1
# OK on cloud node
box@joez-hce-ub20-vm-virt-m:~$ nc -zv 10.96.0.1 443
Connection to 10.96.0.1 443 port [tcp/https] succeeded!
box@joez-hce-ub20-vm-virt-m:~$ sudo iptables-save | grep -w 10.96.0.1
-A KUBE-SERVICES -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-SVC-NPX46M4PTMTKRN6Y
-A KUBE-SVC-NPX46M4PTMTKRN6Y ! -s 10.244.0.0/16 -d 10.96.0.1/32 -p tcp -m comment --comment "default/kubernetes:https cluster IP" -m tcp --dport 443 -j KUBE-MARK-MASQ
Check kube-proxy:
kube-proxy is still trying to get information from the cloud node instead of yurthub. Is this the expected behavior? |
Thanks for your detailed logs. I'm not sure why the component "go-http-client" lists pods and configmaps, or what it is.
It is expected to be "kubelet list pods". I noticed that the kubernetes cluster is
I think this can explain some of the problems we encountered.
Because kubelet also uses another User-Agent called
This is not what we expected. kube-proxy should fetch resources through yurthub; we use a filter in yurthub to do this. In the normal case, we would find a yurthub log like the following:
So the configmap mounted by kube-proxy should be modified by yurthub to make kube-proxy use InClusterConfig, which will enable it to fetch resources through yurthub. This configmap is fetched by kubelet, so in yurthub we identify it using the User-Agent of kubelet requests, which originally should be
In summary, these problems seem to be introduced by kubernetes v1.23.x, and I think we have to find a solution for it. As a workaround for now, could you please try kubernetes v1.22.x? @joez |
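To visualize the configmap rewrite described above, here is a hypothetical before/after of the clientConnection section in kube-proxy's config.conf. The path is the kubeadm default, and the exact filtered form may differ by OpenYurt version; this is a sketch, not output from this cluster:

```yaml
# Before (as served by the apiserver; kubeadm default path):
clientConnection:
  kubeconfig: /var/lib/kube-proxy/kubeconfig.conf

# After yurthub's filter: with no kubeconfig, kube-proxy falls back to
# InClusterConfig and reaches the API through yurthub.
clientConnection:
  kubeconfig: ""
```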
Thanks for your explanation. I will give kubernetes 1.22.0 a try and keep you posted later. BTW, I chose 1.23.0 because getting-started says it supports
|
With the new k8s 1.22.0 cluster, the cache is as expected now.
But kube-proxy is still trying to connect to kube-apiserver instead of yurthub
And kube-proxy, as well as flannel and nginx, can be launched.
But there are still two problems:
# no VM is running
box@joez-hce-ub20-vm-oykv-w:~$ ps -ef | grep qemu | grep -v grep
box@joez-hce-ub20-vm-oykv-w:~$ docker ps | grep virt-handler | grep -v POD | awk '{print $1}'
90da7040e86a
box@joez-hce-ub20-vm-oykv-w:~$ docker logs 90da7040e86a 2>&1 | less
W0425 14:03:54.447851 8650 client_config.go:617] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
{"Unable to mark node as unschedulable":"can not cache for go-http-client patch nodes: /api/v1/nodes/joez-hce-ub20-vm-oykv-w","component":"virt-handler","level":"error","pos":"virt-handler.go:179","timestamp":"2023-04-25T14:03:54.503437Z"}
{"component":"virt-handler","level":"info","msg":"set verbosity to 2","pos":"virt-handler.go:471","timestamp":"2023-04-25T14:03:54.505689Z"}
...
# kube-proxy can't get service objects
box@joez-hce-ub20-vm-oykv-w:~$ docker ps | grep kube-proxy | grep -v POD | awk '{print $1}'
440188760b30
box@joez-hce-ub20-vm-oykv-w:~$ docker logs 440188760b30 2>&1 | less
I0425 14:03:47.152925 1 server.go:553] Neither kubeconfig file nor master URL was specified. Falling back to in-cluster config.
E0425 14:04:34.012669 1 node.go:161] Failed to retrieve node info: Get "https://169.254.2.1:10268/api/v1/nodes/joez-hce-ub20-vm-oykv-w": Service Unavailable
I0425 14:04:34.012744 1 server.go:836] can't determine this node's IP, assuming 127.0.0.1; if this is incorrect, please set the --bind-address flag
I0425 14:04:34.013173 1 server_others.go:140] Detected node IP 127.0.0.1
W0425 14:04:34.013292 1 server_others.go:565] Unknown proxy mode "", assuming iptables proxy
I0425 14:04:34.071984 1 server_others.go:206] kube-proxy running in dual-stack mode, IPv4-primary
I0425 14:04:34.072103 1 server_others.go:212] Using iptables Proxier.
I0425 14:04:34.072131 1 server_others.go:219] creating dualStackProxier for iptables.
W0425 14:04:34.072193 1 server_others.go:495] detect-local-mode set to ClusterCIDR, but no IPv6 cluster CIDR defined, , defaulting to no-op detect-local for IPv6
I0425 14:04:34.073932 1 server.go:649] Version: v1.22.0
I0425 14:04:34.078625 1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_max' to 131072
I0425 14:04:34.078684 1 conntrack.go:52] Setting nf_conntrack_max to 131072
I0425 14:04:34.079048 1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
I0425 14:04:34.079245 1 conntrack.go:100] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
I0425 14:04:34.080453 1 config.go:315] Starting service config controller
I0425 14:04:34.080490 1 shared_informer.go:240] Waiting for caches to sync for service config
I0425 14:04:34.080534 1 config.go:224] Starting endpoint slice config controller
I0425 14:04:34.080542 1 shared_informer.go:240] Waiting for caches to sync for endpoint slice config
E0425 14:04:37.012366 1 event_broadcaster.go:262] Unable to write event: 'Post "https://169.254.2.1:10268/apis/events.k8s.io/v1/namespaces/default/events": Service Unavailable' (may retry after sleeping)
E0425 14:04:37.012666 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.Service: failed to list *v1.Service: Get "https://169.254.2.1:10268/api/v1/services?labelSelector=%21service.kubernetes.io%2Fheadless%2C%21service.kubernetes.io%2Fservice-proxy-name&limit=500&resourceVersion=0": Service Unavailable
...
E0425 15:00:28.395624 1 reflector.go:138] k8s.io/client-go/informers/factory.go:134: Failed to watch *v1.EndpointSlice: failed to list *v1.EndpointSlice: Get "https://169.254.2.1:10268/apis/discovery.k8s.io/v1/endpointslices?labelSelector=%21service.kubernetes.io%2Fheadless%2C%21service.kubernetes.io%2Fservice-proxy-name&limit=500&resourceVersion=0": Service Unavailable
box@joez-hce-ub20-vm-oykv-w:~$ ip a
...
4: yurthub-dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether 56:f9:05:f9:60:8b brd ff:ff:ff:ff:ff:ff
inet 169.254.2.1/32 scope global yurthub-dummy0
valid_lft forever preferred_lft forever
box@joez-hce-ub20-vm-oykv-w:~$ sudo ss -lntp
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 4096 169.254.2.1:10261 0.0.0.0:* users:(("yurthub",pid=1600,fd=9))
LISTEN 0 4096 127.0.0.1:10261 0.0.0.0:* users:(("yurthub",pid=1600,fd=8))
LISTEN 0 4096 127.0.0.53%lo:53 0.0.0.0:* users:(("systemd-resolve",pid=690,fd=13))
LISTEN 0 128 0.0.0.0:22 0.0.0.0:* users:(("sshd",pid=1746,fd=3))
LISTEN 0 4096 127.0.0.1:10267 0.0.0.0:* users:(("yurthub",pid=1600,fd=7))
LISTEN 0 4096 169.254.2.1:10268 0.0.0.0:* users:(("yurthub",pid=1600,fd=10))
LISTEN 0 4096 127.0.0.1:10248 0.0.0.0:* users:(("kubelet",pid=718,fd=18))
LISTEN 0 4096 127.0.0.1:10249 0.0.0.0:* users:(("kube-proxy",pid=2686,fd=18))
LISTEN 0 4096 127.0.0.1:34505 0.0.0.0:* users:(("kubelet",pid=718,fd=12))
LISTEN 0 4096 *:10256 *:* users:(("kube-proxy",pid=2686,fd=19))
LISTEN 0 128 [::]:22 [::]:* users:(("sshd",pid=1746,fd=4))
LISTEN 0 4096 *:10250 *:* users:(("kubelet",pid=718,fd=35)) |
@rambohe-ch @Congrool would you help to check the kube-proxy issue? It prevents service-to-service communication from working.
As I mentioned last time, kube-proxy does not work as expected:
It should have setup iptables rules in normal case, like following:
So maybe something is wrong that prevents kube-proxy from getting enough information from yurt-hub to set up the iptables rules. |
@joez Hey, sorry for the late reply. It seems that kube-proxy cannot access the yurthub server. We may need to check whether the yurthub server still works. You can use the following command on your host
In the normal case, yurthub will use the node cache of the kube-proxy component under
Also, could you post your kube-proxy version? BTW, in my cluster, kube-proxy is v1.22.7. |
I currently have two k8s clusters: joez-hce-ub20-vm-virt-{m,w} is v1.23.0 and joez-hce-ub20-vm-oykv-{m,w} is v1.22.0. Let us focus on the latter.
The kube-proxy version is v1.22.0.
There is no such cache
Access to port 10261 via cURL is OK:
Is accessing 127.0.0.1:10261 the same as accessing 169.254.2.1:10268? I see this error in the logs:
The no_proxy variable in the kube-proxy container does not cover 169.254.2.1; maybe I need to add the Automatic Private IP Addressing (APIPA) range to it.
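A quick way to see why 169.254.2.1 is affected: most clients match no_proxy entries by exact name or suffix, so a typical list never covers the APIPA address. Below is a minimal sketch of that matching; CIDR entries are ignored for simplicity, so it is an approximation of real client behavior, not a reimplementation of any particular library.

```shell
# Simplified exact/suffix no_proxy matching (CIDR entries not interpreted).
host_in_no_proxy() {
  local host=$1 list=$2 entry
  local IFS=','
  for entry in $list; do
    [ "$host" = "$entry" ] && return 0
    case "$host" in
      *"$entry") return 0 ;;
    esac
  done
  return 1
}

if host_in_no_proxy 169.254.2.1 "localhost,127.0.0.1,10.0.0.0/8"; then
  echo "covered"
else
  echo "not covered: requests to yurthub would go through the proxy"
fi
```

With a typical no_proxy list the APIPA address is not covered, which is why adding 169.254.2.1 (or the 169.254.0.0/16 range, if the client understands CIDR) fixes the problem.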
|
Well, it's strange that there's no cache for kube-proxy. It should be there to make kube-proxy work when the worker is offline, like:
You can connect the worker to the master and then restart kube-proxy on the worker node, at which time yurthub will cache the responses from the master. Could you have a try? After the cache is created, kube-proxy can restart and work even when the worker is offline.
Yes, actually the yurthub server listens on both addresses with the same handler.
It should return not only the status code but also the JSON data of the node resource, like:
|
The kube-proxy issue was caused by the wrong no_proxy setting; after adding the APIPA range to it, kube-proxy works fine now:
It's time to check virt-handler now; I think it is still trying to talk to kube-apiserver.
|
To be honest, I'm not familiar with KubeVirt; I can only check the situation from yurthub's point of view. I'm not sure how many kinds of KubeVirt-related components run on worker nodes. I saw that the cache already contains something like
Another question: does virt-handler have its own kubeconfig? If so, I think we can remove that kubeconfig to make it use InClusterConfig, which will enable virt-handler to send requests to yurthub instead of the apiserver. |
@Congrool Thank you very much. Maybe I am the first one to use KubeVirt on OpenYurt, but I think more and more users will choose KubeVirt if they want to orchestrate VM workloads (such as apps on Windows), and OpenYurt if they require edge autonomy. I will check KubeVirt further; the solution should be similar to kube-proxy's, customizing it to use InClusterConfig as you mentioned. But I don't understand how kube-proxy works with yurt-hub; would you show me some detailed documentation? |
@joez I can give you some details. You can check the yurthub doc, which gives a rough description of the
The former mainly affects kubelet when it creates pods and sets envs in them. To be specific, it changes the clusterIP and port of the kubernetes service that kubelet got from the kube-apiserver. You can verify it by
whose
Based on the MasterService filter, what we need to do is make sure all components use InClusterConfig, while kube-proxy will use the configmap |
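The rewrite described above can be illustrated as a toy text substitution. This is not the real yurthub implementation, which filters the API response objects; the yurthub dummy address 169.254.2.1:10268 is taken from the logs earlier in this thread.

```shell
# Toy illustration of the masterservice filter: rewrite the default/kubernetes
# service's clusterIP/port so in-cluster clients dial yurthub's dummy address
# instead of the real apiserver.
svc='clusterIP: 10.96.0.1
port: 443'
echo "$svc" | sed -e 's/clusterIP: 10\.96\.0\.1/clusterIP: 169.254.2.1/' \
                  -e 's/port: 443/port: 10268/'
```

In the real cluster the effect is visible in pod environments: KUBERNETES_SERVICE_HOST/PORT inside edge pods end up pointing at yurthub's address rather than the apiserver's.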
From the error message of
The user agent is
The
Don't know why it still fails to get the
The error is from:
// localReqCache handles Get/List/Update requests when remote servers are unhealthy
func (lp *LocalProxy) localReqCache(w http.ResponseWriter, req *http.Request) error {
	if !lp.cacheMgr.CanCacheFor(req) {
		klog.Errorf("can not cache for %s", hubutil.ReqString(req))
		return apierrors.NewBadRequest(fmt.Sprintf("can not cache for %s", hubutil.ReqString(req)))
	}
	// ...
}
@Congrool Would you shed some light on why the request cannot be cached? |
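For illustration, the check behind this error can be sketched as follows. This is not yurthub's actual code: the default agent list and the matching rule (component prefix of the User-Agent, or a "*" wildcard) are simplified assumptions, but they show why a client identifying itself as "go-http-client" is rejected.

```shell
# Illustrative sketch of a cache-agent check (not yurthub's real implementation):
# a request is cacheable only when the component prefix of its User-Agent is in
# the configured agent list, or the list is "*".
can_cache_for() {
  local user_agent=$1 agents=$2 comp
  [ "$agents" = "*" ] && return 0
  comp=${user_agent%%/*}
  case ",$agents," in
    *",$comp,"*) return 0 ;;
  esac
  return 1
}

agents="kubelet,kube-proxy,flanneld,coredns"   # assumed defaults
can_cache_for "kubelet/v1.22.0" "$agents" && echo "kubelet: cacheable"
can_cache_for "go-http-client/1.1" "$agents" || echo "go-http-client: not cacheable"
```

Under this model, setting the agent list to "*" (or adding the component's User-Agent) is what makes such requests cacheable.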
I think I am almost approaching our target, except the "can not cache" error from
@rambohe-ch @Congrool Would you help with this? I don't know why these resources can't be cached.
box@joez-hce-ub20-vm-oykv-w:~$ docker logs fb5695ee3d8b
...
{"component":"virt-handler","level":"info","msg":"STARTING informer vmiInformer-targets","pos":"virtinformers.go:330","timestamp":"2023-05-10T04:25:54.720361Z"}
W0510 04:25:54.751331 7102 reflector.go:324] pkg/controller/virtinformers.go:331: failed to list *v1.VirtualMachineInstance: can not cache for go-http-client list virtualmachineinstances: /apis/kubevirt.io/v1alpha3/virtualmachineinstances?labelSelector=kubevirt.io%2FnodeName+in+%28joez-hce-ub20-vm-oykv-w%29&limit=500&resourceVersion=0
E0510 04:25:54.751694 7102 reflector.go:138] pkg/controller/virtinformers.go:331: Failed to watch *v1.VirtualMachineInstance: failed to list *v1.VirtualMachineInstance: can not cache for go-http-client list virtualmachineinstances: /apis/kubevirt.io/v1alpha3/virtualmachineinstances?labelSelector=kubevirt.io%2FnodeName+in+%28joez-hce-ub20-vm-oykv-w%29&limit=500&resourceVersion=0 Here are the related logs from
Here is what I have done:
box@joez-hce-ub20-vm-oykv-w:~$ docker exec -it 2deb25f5272a sh
/ # nslookup nginx
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: nginx
Address 1: 10.96.2.116 nginx.default.svc.cluster.local
/ # nslookup virt-api.kubevirt
Server: 10.96.0.10
Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local
Name: virt-api.kubevirt
Address 1: 10.96.206.147 virt-api.kubevirt.svc.cluster.local
/ # ping virt-api.kubevirt
PING virt-api.kubevirt (10.96.206.147): 56 data bytes
64 bytes from 10.96.206.147: seq=0 ttl=241 time=184.707 ms
64 bytes from 10.96.206.147: seq=1 ttl=241 time=202.301 ms
^C
--- virt-api.kubevirt ping statistics ---
3 packets transmitted, 2 packets received, 33% packet loss
round-trip min/avg/max = 184.707/193.504/202.301 ms
/ # exit
box@joez-hce-ub20-vm-oykv-w:~$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
fb5695ee3d8b c407633b131b "virt-handler --port…" 3 minutes ago Up 3 minutes k8s_virt-handler_virt-handler-7q56b_kubevirt_60b59abb-295d-455f-b8a1-a248a5656d7a_8
38068327ac79 alpine "/bin/sh" 3 minutes ago Up 3 minutes k8s_test_test_default_962446d6-f99d-41d8-b650-6af03f8a007f_10
d6a7f0c799bf nginx "/docker-entrypoint.…" 3 minutes ago Up 3 minutes k8s_nginx_nginx-6799fc88d8-j7fwc_default_13d7e957-bc7d-4bfc-8db2-507c70fd240f_14
6e78501b808f k8s.gcr.io/pause:3.5 "/pause" 3 minutes ago Up 3 minutes k8s_POD_virt-operator-55989d567c-p5nkl_kubevirt_d407b21d-1eb7-47ea-8bd0-6daec1bbe747_50
436b91002bab k8s.gcr.io/pause:3.5 "/pause" 3 minutes ago Up 3 minutes k8s_POD_virt-handler-7q56b_kubevirt_60b59abb-295d-455f-b8a1-a248a5656d7a_47
0dc549a31e7a 943b496a674d "virt-api --port 844…" 3 minutes ago Up 3 minutes k8s_virt-api_virt-api-5474cf649d-rlcrw_kubevirt_0218bfee-c273-4ed5-8626-43a88d0f5267_8
13a093ee9642 943b496a674d "virt-api --port 844…" 3 minutes ago Up 3 minutes k8s_virt-api_virt-api-5474cf649d-xp658_kubevirt_f0805cfd-2132-4515-acbc-68c967ab2b22_8
2deb25f5272a 8c811b4aec35 "sleep 3600" 3 minutes ago Up 3 minutes k8s_debug_debug_default_8051f632-a8bf-4ca8-801a-4aeca8bcb824_4
d77d27497658 8d147537fb7d "/coredns -conf /etc…" 3 minutes ago Up 3 minutes k8s_coredns_coredns-klpms_kube-system_9c4821ad-f91d-4174-af9a-dfecdbe2321e_5
71ae8a256f11 k8s.gcr.io/pause:3.5 "/pause" 3 minutes ago Up 3 minutes k8s_POD_virt-operator-55989d567c-t2n76_kubevirt_dc3b91e2-3751-48af-ac7c-2e5d060b0349_46
1cb7fbce79cf k8s.gcr.io/pause:3.5 "/pause" 3 minutes ago Up 3 minutes k8s_POD_virt-api-5474cf649d-xp658_kubevirt_f0805cfd-2132-4515-acbc-68c967ab2b22_45
c6c7d9405810 k8s.gcr.io/pause:3.5 "/pause" 3 minutes ago Up 3 minutes k8s_POD_virt-api-5474cf649d-rlcrw_kubevirt_0218bfee-c273-4ed5-8626-43a88d0f5267_46
0fae5f38ff65 k8s.gcr.io/pause:3.5 "/pause" 3 minutes ago Up 3 minutes k8s_POD_virt-controller-7f8ff6cdc4-wcvvb_kubevirt_403a3601-b8e7-4df4-88e1-f93a6a94939c_47
ebd626f49650 k8s.gcr.io/pause:3.5 "/pause" 3 minutes ago Up 3 minutes k8s_POD_debug_default_8051f632-a8bf-4ca8-801a-4aeca8bcb824_25
bf519a40f8ef k8s.gcr.io/pause:3.5 "/pause" 3 minutes ago Up 3 minutes k8s_POD_test_default_962446d6-f99d-41d8-b650-6af03f8a007f_77
ee5297767dd0 k8s.gcr.io/pause:3.5 "/pause" 3 minutes ago Up 3 minutes k8s_POD_coredns-klpms_kube-system_9c4821ad-f91d-4174-af9a-dfecdbe2321e_37
57e2183d1c56 k8s.gcr.io/pause:3.5 "/pause" 3 minutes ago Up 3 minutes k8s_POD_virt-controller-7f8ff6cdc4-hd4ft_kubevirt_1946d9c4-8aa8-498d-b6c0-7fa0812c2da9_51
dddc0e3fdba2 k8s.gcr.io/pause:3.5 "/pause" 3 minutes ago Up 3 minutes k8s_POD_nginx-6799fc88d8-j7fwc_default_13d7e957-bc7d-4bfc-8db2-507c70fd240f_92
bdd0259f6407 k8s.gcr.io/pause:3.5 "/pause" 3 minutes ago Up 3 minutes k8s_POD_yurt-app-manager-6fd8dcd6b4-9gp6n_kube-system_a117d8a4-da40-4568-ba1a-61b1979a76ed_98
83e7610f3578 11ae74319a21 "/opt/bin/flanneld -…" 3 minutes ago Up 3 minutes k8s_kube-flannel_kube-flannel-ds-9mrs8_kube-flannel_2459bd62-295b-4806-a751-ad70a2660c29_16
31ebde671e70 bbad1636b30d "/usr/local/bin/kube…" 3 minutes ago Up 3 minutes k8s_kube-proxy_kube-proxy-9ktvr_kube-system_9404c203-bca0-4598-9aec-6f371e699df4_15
129fcf09327e k8s.gcr.io/pause:3.5 "/pause" 3 minutes ago Up 3 minutes k8s_POD_kube-proxy-9ktvr_kube-system_9404c203-bca0-4598-9aec-6f371e699df4_15
7daf002b5da3 k8s.gcr.io/pause:3.5 "/pause" 3 minutes ago Up 3 minutes k8s_POD_kube-flannel-ds-9mrs8_kube-flannel_2459bd62-295b-4806-a751-ad70a2660c29_15
10f0ab49723e 60fb0e90cdfb "yurthub --v=2 --ser…" 3 minutes ago Up 3 minutes k8s_yurt-hub_yurt-hub-joez-hce-ub20-vm-oykv-w_kube-system_dd10f5ec226508a076ff4cffac748add_15
65f925802c20 k8s.gcr.io/pause:3.5 "/pause" 3 minutes ago Up 3 minutes k8s_POD_yurt-hub-joez-hce-ub20-vm-oykv-w_kube-system_dd10f5ec226508a076ff4cffac748add_15 |
I saw that in yurthub logs
Yurthub got the warning because it cannot cache list/watch requests from the same component for the same resource with different selectors. In this case, when

Now, let's come to the solution. Firstly, I have to say that it seems this cannot be solved just through configuration. We should check if the
If 2 is true, we can change the |
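To illustrate the limitation described above (a toy model, not yurthub's actual storage layout): if cached list results are keyed only by component and resource, two lists with different selectors collide and overwrite each other:

```go
package main

import "fmt"

// cacheKey models a selector-unaware cache key: component + resource only.
// Two list/watch requests from the same component for the same resource
// but with different label selectors map to the same entry.
func cacheKey(component, resource string) string {
	return component + "/" + resource
}

func main() {
	cache := map[string]string{}
	// virt-handler's two informers list the same resource with different selectors:
	cache[cacheKey("virt-handler", "virtualmachineinstances")] = "kubevirt.io/nodeName in (edge)"
	cache[cacheKey("virt-handler", "virtualmachineinstances")] = "kubevirt.io/migrationTargetNodeName in (edge)"
	// only one entry survives; the first list result has been overwritten
	fmt.Println(len(cache)) // → 1
}
```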
@Congrool Sorry for the late reply, I spent some time learning the kubevirt code. The conclusion is that the current implementation of

In the normal scenario, you can find logs in
But you can't find it in the disconnected scenario. From the code pkg/virt-handler/vm.go:

func (c *VirtualMachineController) Run(threadiness int, stopCh chan struct{}) {
defer c.Queue.ShutDown()
log.Log.Info("Starting virt-handler controller.")
go c.deviceManagerController.Run(stopCh)
cache.WaitForCacheSync(stopCh, c.domainInformer.HasSynced, c.vmiSourceInformer.HasSynced, c.vmiTargetInformer.HasSynced, c.gracefulShutdownInformer.HasSynced)
...
cmd/virt-handler/virt-handler.go
func (app *virtHandlerApp) Run() {
...
vmiSourceInformer := factory.VMISourceHost(app.HostOverride)
vmiTargetInformer := factory.VMITargetHost(app.HostOverride)
...
vmController := virthandler.NewController(
recorder,
app.virtCli,
app.HostOverride,
migrationIpAddress,
app.VirtShareDir,
app.VirtPrivateDir,
vmiSourceInformer,
vmiTargetInformer,
domainSharedInformer,
gracefulShutdownInformer,
...
cache.WaitForCacheSync(stop, vmiSourceInformer.HasSynced, factory.CRD().HasSynced)
	go vmController.Run(10, stop)

In the disconnected scenario, cache.WaitForCacheSync blocks because the informers fail to list, so vmController.Run is never started. Both VMISourceHost and VMITargetHost build informers for the same resource with different label selectors:

func (f *kubeInformerFactory) VMISourceHost(hostName string) cache.SharedIndexInformer {
labelSelector, err := labels.Parse(fmt.Sprintf(kubev1.NodeNameLabel+" in (%s)", hostName))
if err != nil {
panic(err)
}
return f.getInformer("vmiInformer-sources", func() cache.SharedIndexInformer {
lw := NewListWatchFromClient(f.restClient, "virtualmachineinstances", k8sv1.NamespaceAll, fields.Everything(), labelSelector)
return cache.NewSharedIndexInformer(lw, &kubev1.VirtualMachineInstance{}, f.defaultResync, cache.Indexers{
cache.NamespaceIndex: cache.MetaNamespaceIndexFunc,
"node": func(obj interface{}) (strings []string, e error) {
return []string{obj.(*kubev1.VirtualMachineInstance).Status.NodeName}, nil
},
})
})
}
func (f *kubeInformerFactory) VMITargetHost(hostName string) cache.SharedIndexInformer {
labelSelector, err := labels.Parse(fmt.Sprintf(kubev1.MigrationTargetNodeNameLabel+" in (%s)", hostName))
if err != nil {
panic(err)
}
return f.getInformer("vmiInformer-targets", func() cache.SharedIndexInformer {
lw := NewListWatchFromClient(f.restClient, "virtualmachineinstances", k8sv1.NamespaceAll, fields.Everything(), labelSelector)
return cache.NewSharedIndexInformer(lw, &kubev1.VirtualMachineInstance{}, f.defaultResync, cache.Indexers{
cache.NamespaceIndex: cache.MetaNamespaceIndexFunc,
"node": func(obj interface{}) (strings []string, e error) {
return []string{obj.(*kubev1.VirtualMachineInstance).Status.NodeName}, nil
},
})
})
}

Details in the attached logs-openyurt-kubevirt.zip |
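The conflict can be seen directly in the request URLs: both informers hit the same resource path and differ only in the labelSelector query parameter, matching the URLs in the reflector errors above. A stdlib sketch (the v1alpha3 path is taken from the logs):

```go
package main

import (
	"fmt"
	"net/url"
)

// listURL reproduces the list request both informers send: same resource
// path, different labelSelector query parameter.
func listURL(labelSelector string) string {
	q := url.Values{}
	q.Set("labelSelector", labelSelector)
	return "/apis/kubevirt.io/v1alpha3/virtualmachineinstances?" + q.Encode()
}

func main() {
	host := "joez-hce-ub20-vm-oykv-w"
	src := listURL(fmt.Sprintf("kubevirt.io/nodeName in (%s)", host))
	tgt := listURL(fmt.Sprintf("kubevirt.io/migrationTargetNodeName in (%s)", host))
	fmt.Println(src) // the vmiInformer-sources list request
	fmt.Println(tgt) // the vmiInformer-targets list request
}
```

Since yurthub distinguishes cached lists by component and resource but not by selector, these two requests look identical to its cache and only one can be served locally.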
@joez Hi, I'm sorry to hear that. If you don't mind modifying the source code, there are still two solutions:
To quickly work around it, option 2 is recommended. Option 1 is hard to push forward, because we may need a refactoring of the yurthub cache framework, which is a big job. Anyway, it's a cache limitation that the community has already recognized; we need to come up with a final solution to remove it. |
@Congrool Let me figure out which way is feasible, I will try option 2 first. This is a big challenge for me, because I have no programming experience with Kubernetes. May I know the main reason for the current |
As far as I know, the original idea of yurthub is to cache as few resources as possible, considering the limited hardware resources of edge nodes. Thus, we separate the cache for different components, and only cache some of them by default (e.g. kubelet, flannel, kube-proxy, coredns), which constitute the minimal infrastructure set that business workloads depend on (failure recovery, container network, service discovery, and DNS resolution, respectively). Other components, in this case virt-handler, were not taken into consideration in the cloud-edge scenario. But, hm, this feature emerged at a very early stage. Maybe @rambohe-ch can give more details. |
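For reference, the default cache list can usually be extended through yurthub's configuration ConfigMap. This is a hedged sketch assuming the `yurt-hub-cfg` ConfigMap and its `cache_agents` field; verify the key name and the expected value against your OpenYurt version, and note that the virt-handler requests in the logs above carry the `go-http-client` user agent:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: yurt-hub-cfg
  namespace: kube-system
data:
  # components listed here are cached in addition to the built-in defaults
  # (kubelet, kube-proxy, flannel, coredns); the value should match the
  # User-Agent the component sends -- virt-handler shows up as go-http-client
  cache_agents: "go-http-client"
```

Even with the component whitelisted this way, the two-informers-with-different-selectors limitation discussed in this thread still applies.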
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
PR linked: #1614 |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
@Congrool This proposal didn't move forward well, so pods with two different list requests cannot work well with yurthub when the cloud-edge network is offline. I have considered a workaround for Yurthub to support two different list/watch requests:
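One way such a workaround could look (a toy sketch of the idea, not actual yurthub code): include a hash of the request's selector in the cache key, so two lists from the same component for the same resource no longer collide:

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// selectorAwareKey extends the cache key with a hash of the list/watch
// selector, so requests from the same component for the same resource
// but with different selectors get separate cache entries.
func selectorAwareKey(component, resource, selector string) string {
	sum := sha256.Sum256([]byte(selector))
	return fmt.Sprintf("%s/%s/%x", component, resource, sum[:4])
}

func main() {
	cache := map[string]string{}
	cache[selectorAwareKey("virt-handler", "virtualmachineinstances", "kubevirt.io/nodeName in (edge)")] = "sources"
	cache[selectorAwareKey("virt-handler", "virtualmachineinstances", "kubevirt.io/migrationTargetNodeName in (edge)")] = "targets"
	// both list results can now be cached side by side
	fmt.Println(len(cache)) // → 2
}
```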
|
What happened:
This is a use case of using kubevirt + OpenYurt.
On the worker node, we use kubevirt to deploy a VM, which connects to the master node successfully.
Then we disconnect the network and reboot the worker node. The problem is that the VM deployed before couldn't be started.
What you expected to happen:
The deployed VM should be able to restart even when the network is disconnected.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
We think this could be a cloud-edge collaboration issue within OpenYurt's scope, so we hope the OpenYurt community could solve it. :)
Environment:
- Kubernetes version (use `kubectl version`): 1.22
- OS (e.g. from `cat /etc/os-release`): N/A
- Kernel (e.g. `uname -a`): N/A
- others
/kind bug