[ERROR] Failed to read node-cache coreFile /etc/coredns/Corefile.base - open /etc/coredns/Corefile.base: no such file or directory #11655

Open
kvishweshwar opened this issue Oct 22, 2024 · 2 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@kvishweshwar

kvishweshwar commented Oct 22, 2024

What happened?

The node-cache container of the nodelocaldns pod is not coming up: it is looking for the file /etc/coredns/Corefile.base, which is missing.
nodelocaldns has a ConfigMap from which it creates the Corefile, but nodelocaldns expects a Corefile.base file as well.

nodelocaldns is expecting:
/etc/coredns/Corefile
/etc/coredns/Corefile.base

I tried creating the required Corefile.base file via a new ConfigMap, nodelocaldnsbase (an exact copy of nodelocaldns), and redeployed the "nodelocaldns" DaemonSet, but had no luck. So I am raising this issue.
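For comparison (this is not from the report above, and the exact Kubespray template may differ): the upstream node-local-dns manifest in kubernetes/dns avoids this error not with a second ConfigMap, but by projecting the existing ConfigMap's Corefile key to the Corefile.base path inside the pod, since node-cache reads Corefile.base and writes the merged Corefile itself. A sketch of that volume stanza:

```yaml
# Sketch of the upstream node-local-dns DaemonSet volume (names may
# differ from Kubespray's template, which is what this issue is about).
volumes:
  - name: config-volume
    configMap:
      name: node-local-dns        # single ConfigMap, no duplicate needed
      items:
        - key: Corefile           # the key that exists in the ConfigMap
          path: Corefile.base     # the path node-cache actually reads
```

If Kubespray's template only mounts the key at /etc/coredns/Corefile, an `items` mapping like this may be a cleaner workaround than maintaining a copied ConfigMap.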

What did you expect to happen?

nodelocaldns should work seamlessly.

How can we reproduce it (as minimally and precisely as possible)?

Just deploy Kubernetes v1.30.3 (Kubespray v2.26) on Oracle Linux 8.8 machines.

OS

Oracle Linux 8.8

Version of Ansible

9.6.0

Version of Python

3.12

Version of Kubespray (commit)

2.26

Network plugin used

calico

Full inventory with variables

[all]
poc-master-01 ansible_host=10.3.0.1 ip=10.3.0.1 etcd_member_name=poc-master-01
poc-master-02 ansible_host=10.3.0.2 ip=10.3.0.2 etcd_member_name=poc-master-02
poc-master-03 ansible_host=10.3.0.3 ip=10.3.0.3 etcd_member_name=poc-master-03
poc-worker-01 ansible_host=10.3.0.4 # ip=10.3.0.4 etcd_member_name=etcd-04
poc-worker-02 ansible_host=10.3.0.5 # ip=10.3.0.5 etcd_member_name=etcd-05
poc-worker-03 ansible_host=10.3.0.6 # ip=10.3.0.6 etcd_member_name=etcd-06

## configure a bastion host if your nodes are not directly reachable

[bastion]

bastion ansible_host=x.x.x.x ansible_user=some_user

[kube_control_plane]
poc-master-01
poc-master-02
poc-master-03

[etcd]
poc-master-01
poc-master-02
poc-master-03

[kube_node]
poc-worker-01
poc-worker-02
poc-worker-03

[calico_rr]

[k8s_cluster:children]
kube_control_plane
kube_node
calico_rr

[k8s_cluster:vars]
ansible_user=ec2-user

Command used to invoke ansible

ansible-playbook -i inventory/mycluster/inventory.ini --become --become-user=root --private-key ~/.ssh/private-key.pem cluster.yml

Output of ansible run

There is no issue in the execution logs (except for "Modprobe nf_conntrack"), but when checking the cluster afterwards, nodelocaldns is not coming up. Its pod logs show the error about the missing Corefile.base file.

Anything else we need to know

nothing special.

@kvishweshwar kvishweshwar added the kind/bug Categorizes issue or PR as related to a bug. label Oct 22, 2024
@prietus

prietus commented Nov 15, 2024

Same issue here; it occurs when I restart the cluster.

@stibi

stibi commented Nov 26, 2024

I just noticed this error while dealing with a nodelocaldns pod in a crashloop state. At first I thought the error was the root cause of the problem, but actually it's not.
The real problem was a few lines below, something like [FATAL] plugin/loop: Loop (169.254.25.10:42096 -> 169.254.25.10:53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 8227980046591666151.5519681328841792818." (see #9948). Maybe that's your case too?

After I fixed the DNS loop (I can share details if needed), the nodelocaldns pod started. The error about Corefile.base is still there, but it does not prevent the pod from starting and working.
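A quick way to tell the benign Corefile.base message apart from a fatal loop is to check the pod logs for a FATAL marker. The sample log lines below are paraphrased from the messages quoted in this issue; in a real cluster they would come from something like `kubectl -n kube-system logs ds/nodelocaldns` (assumed command, not from the report):

```shell
# Paraphrased nodelocaldns log lines, as quoted in this issue.
logs='[ERROR] Failed to read node-cache coreFile /etc/coredns/Corefile.base - open /etc/coredns/Corefile.base: no such file or directory
[FATAL] plugin/loop: Loop (169.254.25.10:42096 -> 169.254.25.10:53) detected for zone "."'

# The ERROR line alone is harmless; only a FATAL line stops the pod.
if printf '%s\n' "$logs" | grep -q '^\[FATAL\] plugin/loop'; then
  echo "loop detected: fix the upstream resolver loop (see coredns.io/plugins/loop)"
else
  echo "no fatal loop: the Corefile.base ERROR by itself is benign"
fi
```

Running this against logs that contain only the Corefile.base ERROR takes the second branch, which matches stibi's observation that the pod runs fine despite that message.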
