Monitor Type: kubelet-stats
(Source)
Accepts Endpoints: No
Multiple Instances Allowed: Yes
This monitor pulls cAdvisor metrics through a Kubernetes kubelet instance via the /stats/container endpoint.
To activate this monitor in the Smart Agent, add the following to your agent config:

    monitors:  # All monitor config goes under this key
      - type: kubelet-stats
        ...  # Additional config

For a list of monitor options that are common to all monitors, see Common Configuration.
Config option | Required | Type | Description |
---|---|---|---|
kubeletAPI | no | object (see below) | Kubelet client configuration |
The nested kubeletAPI config object has the following fields:
Config option | Required | Type | Description |
---|---|---|---|
url | no | string | URL of the Kubelet instance. This will default to https://<current node hostname>:10250 if not provided. |
authType | no | string | Can be none for no auth, tls for TLS client cert auth, or serviceAccount to use the pod's default service account token to authenticate. (default: none) |
skipVerify | no | bool | Whether to skip verification of the Kubelet's TLS cert. (default: true) |
caCertPath | no | string | Path to the CA cert that has signed the Kubelet's TLS cert. Unnecessary if skipVerify is set to true, since the cert is then not verified. |
clientCertPath | no | string | Path to the client TLS cert to use if authType is set to tls |
clientKeyPath | no | string | Path to the client TLS key to use if authType is set to tls |
logResponses | no | bool | Whether to log the raw cAdvisor response at the debug level for debugging purposes. (default: false) |
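
As an illustration of how the kubeletAPI fields fit together, the following is a minimal sketch of a monitor that authenticates with a TLS client cert and verifies the Kubelet's cert. The node address and all three file paths are hypothetical placeholders, not values taken from this page; substitute whatever your cluster actually uses.

```yaml
monitors:
  - type: kubelet-stats
    kubeletAPI:
      # Hypothetical node address. Omit `url` to fall back to the default
      # https://<current node hostname>:10250
      url: https://node-1.example.internal:10250
      authType: tls
      # Verify the Kubelet's TLS cert against the CA given in caCertPath
      skipVerify: false
      caCertPath: /etc/kubelet-certs/ca.crt          # hypothetical path
      clientCertPath: /etc/kubelet-certs/client.crt  # hypothetical path
      clientKeyPath: /etc/kubelet-certs/client.key   # hypothetical path
```

If the agent runs as a pod and should use its service account token instead, setting authType to serviceAccount and dropping the two client cert/key paths would presumably be enough.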
These are the metrics available for this monitor. Metrics that are categorized as container/host (default) are in bold and italics in the list below.
- container_cpu_cfs_periods (cumulative): Total number of elapsed CFS enforcement intervals
- container_cpu_cfs_throttled_periods (cumulative): Total number of times tasks in the cgroup have been throttled
- container_cpu_cfs_throttled_time (cumulative): Total time duration, in nanoseconds, for which tasks in the cgroup have been throttled
- container_cpu_percent (cumulative): Cumulative cpu utilization as a percentage of the total host CPU available. This metric is equivalent to container_cpu_utilization / <# of CPUs/cores on host>.
- container_cpu_system_seconds_total (cumulative): Cumulative system cpu time consumed in nanoseconds
- container_cpu_usage_seconds_total (cumulative): Cumulative cpu time consumed per cpu in nanoseconds
- container_cpu_user_seconds_total (cumulative): Cumulative user cpu time consumed in nanoseconds
- container_cpu_utilization (cumulative): Cumulative cpu utilization in percentages. This is equivalent to "centicores", or hundredths of CPU cores consumed. This metric is NOT normalized by the total # of cores on the system.
- container_cpu_utilization_per_core (cumulative): Cumulative cpu utilization in percentages per core
- container_fs_io_current (gauge): Number of I/Os currently in progress
- container_fs_io_time_seconds_total (cumulative): Cumulative count of seconds spent doing I/Os
- container_fs_io_time_weighted_seconds_total (cumulative): Cumulative weighted I/O time in seconds
- container_fs_limit_bytes (gauge): Number of bytes that the container may occupy on this filesystem
- container_fs_read_seconds_total (cumulative): Cumulative count of seconds spent reading
- container_fs_reads_merged_total (cumulative): Cumulative count of reads merged
- container_fs_reads_total (cumulative): Cumulative count of reads completed
- container_fs_sector_reads_total (cumulative): Cumulative count of sector reads completed
- container_fs_sector_writes_total (cumulative): Cumulative count of sector writes completed
- container_fs_usage_bytes (gauge): Number of bytes that are consumed by the container on this filesystem
- container_fs_write_seconds_total (cumulative): Cumulative count of seconds spent writing
- container_fs_writes_merged_total (cumulative): Cumulative count of writes merged
- container_fs_writes_total (cumulative): Cumulative count of writes completed
- container_last_seen (gauge): Last time a container was seen by the exporter
- container_memory_failcnt (cumulative): Number of times memory usage hit the limit
- container_memory_failures_total (cumulative): Cumulative count of memory allocation failures
- container_memory_usage_bytes (gauge): Current memory usage in bytes
- container_memory_working_set_bytes (gauge): Current working set in bytes
- container_spec_cpu_period (gauge): The number of microseconds that the CFS scheduler uses as a window when limiting container processes
- container_spec_cpu_quota (gauge): CPU quota for the CFS process scheduler. In K8s this is equal to the container's CPU limit as a fraction of 1 core, multiplied by container_spec_cpu_period. So if the CPU limit is 500m (500 millicores) for a container and container_spec_cpu_period is set to 100,000, this value will be 50,000.
- container_spec_cpu_shares (gauge): CPU share of the container
- container_spec_memory_limit_bytes (gauge): Memory limit for the container.
- container_spec_memory_swap_limit_bytes (gauge): Memory swap limit for the container.
- container_start_time_seconds (gauge): Start time of the container since unix epoch in seconds.
- container_tasks_state (gauge): Number of tasks in given state
- machine_cpu_cores (gauge): Number of CPU cores on the node.
- machine_cpu_frequency_khz (gauge): Node's CPU frequency.
- machine_memory_bytes (gauge): Amount of memory installed on the node.
- pod_network_receive_bytes_total (cumulative): Cumulative count of bytes received
- pod_network_receive_errors_total (cumulative): Cumulative count of errors encountered while receiving
- pod_network_receive_packets_dropped_total (cumulative): Cumulative count of packets dropped while receiving
- pod_network_receive_packets_total (cumulative): Cumulative count of packets received
- pod_network_transmit_bytes_total (cumulative): Cumulative count of bytes transmitted
- pod_network_transmit_errors_total (cumulative): Cumulative count of errors encountered while transmitting
- pod_network_transmit_packets_dropped_total (cumulative): Cumulative count of packets dropped while transmitting
- pod_network_transmit_packets_total (cumulative): Cumulative count of packets transmitted
The following information applies to agent version 4.7.0 and later with enableBuiltInFiltering: true set at the top level of the agent config.
To emit metrics that are not emitted by default, add them to the generic monitor-level extraMetrics config option. Metrics that are derived from specific configuration options and do not appear in the above list of metrics do not need to be added to extraMetrics.
To see a list of metrics that will be emitted, run agent-status monitors after configuring this monitor in a running agent instance.
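
For example, a configuration along these lines would request two additional metrics through extraMetrics. This is a sketch only: it assumes that container_memory_working_set_bytes and container_cpu_cfs_throttled_time are not already in the default set for your agent version, which this page does not confirm.

```yaml
# Top level of the agent config (agent 4.7.0+)
enableBuiltInFiltering: true

monitors:
  - type: kubelet-stats
    # Metric names below come from the metric list for this monitor;
    # whether they are non-default is an assumption of this example.
    extraMetrics:
      - container_memory_working_set_bytes
      - container_cpu_cfs_throttled_time
```

Running agent-status monitors afterwards should show whether the extra metrics are actually being emitted.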
The following information applies only to agent versions older than 4.7.0. If you have a newer agent and have set enableBuiltInFiltering: true at the top level of your agent config, see the section above. See upgrade instructions in Old-style whitelist filtering.
If you have a reference to whitelist.json in your agent's top-level metricsToExclude config option, and you want to emit metrics that are not in that whitelist, add an item to the top-level metricsToInclude config option to override the whitelist (see Inclusion filtering). Alternatively, you can copy whitelist.json, modify it, and reference the copy in metricsToExclude.
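
A rough sketch of the override approach on an older agent might look like the following. The whitelist path and the exact shape of the metricsToExclude reference are assumptions based on a typical packaged agent config, so compare them with what your own config already contains; the metric chosen is only an example.

```yaml
# Existing whitelist reference (path and syntax assumed; keep whatever
# your agent config already has here)
metricsToExclude:
  - {"#from": "/lib/whitelist.json", flatten: true}

# Override the whitelist for one metric from this monitor
metricsToInclude:
  - metricNames:
      - container_memory_working_set_bytes
    monitorType: kubelet-stats
```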
The following dimensions may occur on metrics emitted by this monitor. Some dimensions may be specific to certain metrics.
Name | Description |
---|---|
container_id | The ID of the running container |
container_image | The container image name |
container_name | The container's name as it appears in the pod spec, the same as container_spec_name but retained for backwards compatibility. |
container_spec_name | The container's name as it appears in the pod spec |
kubernetes_namespace | The K8s namespace the container is part of |
kubernetes_pod_name | The pod instance under which this container runs |
kubernetes_pod_uid | The UID of the pod instance under which this container runs |