[k8s] Pod metrics is gone when using containerd as runtime #188
Comments
Created a temp image based on #189.

NOTE: If you are using Bottlerocket on EKS, the socket on the host is different due to bottlerocket-os/bottlerocket@91810c8. You need to (and only need to) replace the volumes part to pick the right socket on the host. (The full snippet is at the end of this comment.)

volumes:
  # ...
  - name: containerdsock
    hostPath:
      # path: /run/containerd/containerd.sock
      # Bottlerocket does not mount the containerd sock at the normal place
      # https://github.com/bottlerocket-os/bottlerocket/commit/91810c85b83ff4c3660b496e243ef8b55df0973b
      path: /run/dockershim.sock

Default containerd path

When the host (and kubelet) is using the default containerd socket path:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cloudwatch-agent
  namespace: amazon-cloudwatch
spec:
  selector:
    matchLabels:
      name: cloudwatch-agent
  template:
    metadata:
      labels:
        name: cloudwatch-agent
    spec:
      containers:
        - name: cloudwatch-agent
          image: public.ecr.aws/p5m3p1a7/cwagent-k8s-containerd-pod:0.1
          imagePullPolicy: Always
          #ports:
          #  - containerPort: 8125
          #    hostPort: 8125
          #    protocol: UDP
          resources:
            limits:
              cpu: 200m
              memory: 200Mi
            requests:
              cpu: 200m
              memory: 200Mi
          # Please don't change below envs
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
            - name: HOST_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: K8S_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: CI_VERSION
              value: "k8s/1.3.0"
          # Please don't change the mountPath
          volumeMounts:
            - name: cwagentconfig
              mountPath: /etc/cwagentconfig
            - name: rootfs
              mountPath: /rootfs
              readOnly: true
            - name: dockersock
              mountPath: /var/run/docker.sock
              readOnly: true
            - name: varlibdocker
              mountPath: /var/lib/docker
              readOnly: true
            - name: containerdsock
              mountPath: /run/containerd/containerd.sock
              readOnly: true
            - name: sys
              mountPath: /sys
              readOnly: true
            - name: devdisk
              mountPath: /dev/disk
              readOnly: true
      volumes:
        - name: cwagentconfig
          configMap:
            name: cwagentconfig
        - name: rootfs
          hostPath:
            path: /
        - name: dockersock
          hostPath:
            path: /var/run/docker.sock
        - name: varlibdocker
          hostPath:
            path: /var/lib/docker
        - name: containerdsock
          hostPath:
            path: /run/containerd/containerd.sock
        - name: sys
          hostPath:
            path: /sys
        - name: devdisk
          hostPath:
            path: /dev/disk/
      terminationGracePeriodSeconds: 60
      serviceAccountName: cloudwatch-agent

Non default containerd path

NOTE: You only need to change the volumes. When mounting the socket into the cloudwatch agent container, you should still put it at the default path.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: cloudwatch-agent
  namespace: amazon-cloudwatch
spec:
  selector:
    matchLabels:
      name: cloudwatch-agent
  template:
    metadata:
      labels:
        name: cloudwatch-agent
    spec:
      # aws eks update-kubeconfig --name eks-pod-metric-missing --region us-west-2
      containers:
        - name: cloudwatch-agent
          image: public.ecr.aws/p5m3p1a7/cwagent-k8s-containerd-pod:0.1
          imagePullPolicy: Always
          #ports:
          #  - containerPort: 8125
          #    hostPort: 8125
          #    protocol: UDP
          resources:
            limits:
              cpu: 200m
              memory: 200Mi
            requests:
              cpu: 200m
              memory: 200Mi
          # Please don't change below envs
          env:
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
            - name: HOST_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: K8S_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            - name: CI_VERSION
              value: "k8s/1.3.0"
          # Please don't change the mountPath
          volumeMounts:
            - name: cwagentconfig
              mountPath: /etc/cwagentconfig
            - name: rootfs
              mountPath: /rootfs
              readOnly: true
            - name: dockersock
              mountPath: /var/run/docker.sock
              readOnly: true
            - name: varlibdocker
              mountPath: /var/lib/docker
              readOnly: true
            - name: containerdsock
              mountPath: /run/containerd/containerd.sock
              readOnly: true
            - name: sys
              mountPath: /sys
              readOnly: true
            - name: devdisk
              mountPath: /dev/disk
              readOnly: true
      volumes:
        - name: cwagentconfig
          configMap:
            name: cwagentconfig
        - name: rootfs
          hostPath:
            path: /
        - name: dockersock
          hostPath:
            path: /var/run/docker.sock
        - name: varlibdocker
          hostPath:
            path: /var/lib/docker
        - name: containerdsock
          hostPath:
            # path: /run/containerd/containerd.sock
            # Bottlerocket does not mount the containerd sock at the normal place
            # https://github.com/bottlerocket-os/bottlerocket/commit/91810c85b83ff4c3660b496e243ef8b55df0973b
            path: /run/dockershim.sock
        - name: sys
          hostPath:
            path: /sys
        - name: devdisk
          hostPath:
            path: /dev/disk/
      terminationGracePeriodSeconds: 60
      serviceAccountName: cloudwatch-agent
Another known issue: because we are using cadvisor, pod-level filesystem usage is ignored.
func (h *containerdContainerHandler) GetSpec() (info.ContainerSpec, error) {
	// TODO: Since we dont collect disk usage stats for containerd, we set hasFilesystem
	// to false. Revisit when we support disk usage stats for containerd
	hasFilesystem := false
	spec, err := common.GetSpec(h.cgroupPaths, h.machineInfoFactory, h.needNet(), hasFilesystem)
	spec.Labels = h.labels
	spec.Envs = h.envs
	spec.Image = h.image
	return spec, err
}
NOTE: Container filesystem usage is not provided after switching to containerd (google/cadvisor#2785). Created another issue to track the container filesystem metrics: #192.
Reopening this issue since we are still in the release process, and the official Container Insights public doc and sample manifest are not updated yet.
This needs to be fixed in the official Helm charts for EKS.
@pingleig I have tried applying the fix listed above exactly as is on EKS with the containerd runtime enabled. However, I'm still getting the same error messages:
2021-08-21T00:08:59Z I! [processors.ec2tagger] ec2tagger: Initial retrieval of tags succeded
Support for the containerd runtime on EKS was added in July when EKS 1.21 was released.
@fitchtech The containerd socket on the host is at a different path (same as Bottlerocket). This is the PR for the EKS AMI: https://github.com/awslabs/amazon-eks-ami/pull/698/files and the config file: https://github.com/awslabs/amazon-eks-ami/blob/8450297eb2ef87fe5cbbce52a86ddcdc8b2e6716/files/containerd-config.toml#L1-L6
You can follow the non-default path in #188 (comment):
hostPath:
  # path: /run/containerd/containerd.sock
  # Bottlerocket does not mount the containerd sock at the normal place
  # https://github.com/bottlerocket-os/bottlerocket/commit/91810c85b83ff4c3660b496e243ef8b55df0973b
  path: /run/dockershim.sock
cc @sethAmazon since both EKS EC2 and Bottlerocket are using
@pingleig that worked, thank you. One additional change I had to make was to enable hostNetwork, because the EC2 instances in my EKS 1.21 node group have the Instance Metadata Service (IMDS) restricted per the EKS security best practices. You have to set hostNetwork: true for the agent to be able to start up. Once I did, everything loaded in the Container Insights console. With hostNetwork: false I get this
With hostNetwork: true
ec2tagger doesn't like not being able to access the instance metadata service, and the containers will restart. Once I set hostNetwork to true, I started seeing metrics flow into Container Insights. This happened even though the DaemonSet uses a service account with IAM Roles for Service Accounts (IRSA) and a policy that gives it ec2:DescribeVolumes and ec2:DescribeTags. Can an update be made that allows this to work without host network enabled on the DaemonSet?
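The hostNetwork change described in this comment can be sketched as a fragment of the DaemonSet pod spec. Only `hostNetwork: true` comes from the comment itself. The `dnsPolicy` line is an assumption added here: Kubernetes documents `ClusterFirstWithHostNet` as the policy for host-network pods that still need cluster DNS, and it may not be required for the agent.

```yaml
spec:
  template:
    spec:
      # Run the agent in the node's network namespace so ec2tagger
      # can reach IMDS when access from the pod network is restricted.
      hostNetwork: true
      # Assumption (not from this thread): keep cluster DNS resolution
      # working for a pod that uses the host network.
      dnsPolicy: ClusterFirstWithHostNet
      containers:
        - name: cloudwatch-agent
          # ... rest of the container spec unchanged ...
```

Note that a host-network pod shares the node's ports and interfaces, which is the trade-off the question above is asking to avoid.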
Also, the IAM policy document attached to the IRSA role needs to allow sts:AssumeRoleWithWebIdentity and sts:AssumeRole, resource-restricted to the IRSA role ARN, or it will throw access denied errors on the assume role API call.
The official EKS Helm charts for CloudWatch Metrics should be updated to do this instead of applying manifests, so that you can use Helm templates to conditionally set those based on the values provided.
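For context, the IRSA role is tied to the agent through an annotation on its service account. The fragment below is a minimal sketch, not taken from this thread: the `eks.amazonaws.com/role-arn` annotation is the standard EKS mechanism, the name and namespace match the manifests above, and the account ID and role name are placeholders.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: cloudwatch-agent
  namespace: amazon-cloudwatch
  annotations:
    # Placeholder ARN: substitute your own account ID and role name.
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/example-cloudwatch-agent-role
```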
This is exported from an internal ticket.
TL;DR
The latest image is released. If you were using the temp image from this comment #188 (comment), please update to the latest tag.
If the error message
W! No pod metric collected, metrics count is still 7 is containerd socket mounted? https://github.com/aws/amazon-cloudwatch-agent/issues/188
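leads you to this issue, the containerd socket on your host is likely at /run/dockershim.sock instead of /run/containerd/containerd.sock (see the comments in this issue for EKS and Bottlerocket).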
Background
We were relying on the pause container having the name POD to detect a pod, which is the case for Docker but not for containerd (containerd/cri#922 (comment)).
Users will not see pod metrics in the Container Insights dashboard, and they will find the following log, which was introduced in #171:
amazon-cloudwatch-agent/plugins/inputs/cadvisor/container_info_processor.go
Line 72 in fbdd619
The root cause is that we are expecting containerName == 'POD' to mark a path as a pod:
amazon-cloudwatch-agent/plugins/inputs/cadvisor/container_info_processor.go
Lines 119 to 126 in fbdd619
Fix
Release
The fix will be included in the next release; the release date is not determined (yet).