cadvisor

Monitor Type: cadvisor (Source)

Accepts Endpoints: No

Multiple Instances Allowed: Yes

Overview

This monitor pulls metrics directly from cAdvisor, which conventionally runs on port 4194 but can be configured to listen on any port. If you are running on Kubernetes, consider the kubelet-stats monitor instead, because many K8s nodes do not expose cAdvisor on a network port even though they run it within the Kubelet.

If you are running containers with Docker, the metrics sent overlap substantially with those of the collectd/docker monitor (under distinct metric names), so consider either not enabling the Docker monitor in a K8s environment or using filtering to whitelist only certain metrics. Note that disabling the Docker monitor will cause the built-in Docker dashboards to be blank, but container metrics will be available on the Kubernetes dashboards instead.

Configuration

To activate this monitor in the Smart Agent, add the following to your agent config:

monitors:  # All monitor config goes under this key
 - type: cadvisor
   ...  # Additional config

For a list of monitor options that are common to all monitors, see Common Configuration.

| Config option | Required | Type | Description |
| --- | --- | --- | --- |
| cadvisorURL | no | string | Where to find cAdvisor (default: http://localhost:4194) |
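
As a minimal sketch of overriding the default, the fragment below points the monitor at a cAdvisor instance on a different port. The address shown is a hypothetical placeholder, not a recommended value; substitute wherever your cAdvisor actually listens.

```yaml
monitors:
  - type: cadvisor
    # Hypothetical address for illustration only; the documented
    # default is http://localhost:4194.
    cadvisorURL: http://localhost:8080
```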

Metrics

These are the metrics available for this monitor. Metrics that are categorized as container/host (default) are in bold and italics in the list below.

  • container_cpu_cfs_periods (cumulative)
    Total number of elapsed CFS enforcement intervals
  • container_cpu_cfs_throttled_periods (cumulative)
    Total number of times tasks in the cgroup have been throttled
  • container_cpu_cfs_throttled_time (cumulative)
    Total time duration, in nanoseconds, for which tasks in the cgroup have been throttled
  • container_cpu_percent (cumulative)
    Cumulative cpu utilization as a percentage of the total host CPU available. This metric is equivalent to container_cpu_utilization / <# of CPUs/cores on host>.
  • container_cpu_system_seconds_total (cumulative)
    Cumulative system cpu time consumed in nanoseconds
  • container_cpu_usage_seconds_total (cumulative)
    Cumulative cpu time consumed per cpu in nanoseconds
  • container_cpu_user_seconds_total (cumulative)
    Cumulative user cpu time consumed in nanoseconds
  • container_cpu_utilization (cumulative)
    Cumulative cpu utilization in percentages. This is equivalent to "centicores", or hundredths of CPU cores consumed. This metric is NOT normalized by the total # of cores on the system.
  • container_cpu_utilization_per_core (cumulative)
    Cumulative cpu utilization in percentages per core
  • container_fs_io_current (gauge)
    Number of I/Os currently in progress
  • container_fs_io_time_seconds_total (cumulative)
    Cumulative count of seconds spent doing I/Os
  • container_fs_io_time_weighted_seconds_total (cumulative)
    Cumulative weighted I/O time in seconds
  • container_fs_limit_bytes (gauge)
    Number of bytes that the container may occupy on this filesystem
  • container_fs_read_seconds_total (cumulative)
    Cumulative count of seconds spent reading
  • container_fs_reads_merged_total (cumulative)
    Cumulative count of reads merged
  • container_fs_reads_total (cumulative)
    Cumulative count of reads completed
  • container_fs_sector_reads_total (cumulative)
    Cumulative count of sector reads completed
  • container_fs_sector_writes_total (cumulative)
    Cumulative count of sector writes completed
  • container_fs_usage_bytes (gauge)
    Number of bytes that are consumed by the container on this filesystem
  • container_fs_write_seconds_total (cumulative)
    Cumulative count of seconds spent writing
  • container_fs_writes_merged_total (cumulative)
    Cumulative count of writes merged
  • container_fs_writes_total (cumulative)
    Cumulative count of writes completed
  • container_last_seen (gauge)
    Last time a container was seen by the exporter
  • container_memory_failcnt (cumulative)
    Number of times memory usage hit the limit
  • container_memory_failures_total (cumulative)
    Cumulative count of memory allocation failures
  • container_memory_usage_bytes (gauge)
    Current memory usage in bytes
  • container_memory_working_set_bytes (gauge)
    Current working set in bytes
  • container_spec_cpu_period (gauge)
    The number of microseconds that the CFS scheduler uses as a window when limiting container processes
  • container_spec_cpu_quota (gauge)
    The CPU quota for the CFS process scheduler. In K8s this is equal to the container's CPU limit as a fraction of 1 core, multiplied by the container_spec_cpu_period. So if the CPU limit is 500m (500 millicores) for a container and the container_spec_cpu_period is set to 100,000, this value will be 50,000.
  • container_spec_cpu_shares (gauge)
    CPU share of the container
  • container_spec_memory_limit_bytes (gauge)
    Memory limit for the container.
  • container_spec_memory_swap_limit_bytes (gauge)
    Memory swap limit for the container.
  • container_start_time_seconds (gauge)
    Start time of the container since unix epoch in seconds.
  • container_tasks_state (gauge)
    Number of tasks in given state
  • machine_cpu_cores (gauge)
    Number of CPU cores on the node.
  • machine_cpu_frequency_khz (gauge)
    Node's CPU frequency.
  • machine_memory_bytes (gauge)
    Amount of memory installed on the node.
  • pod_network_receive_bytes_total (cumulative)
    Cumulative count of bytes received
  • pod_network_receive_errors_total (cumulative)
    Cumulative count of errors encountered while receiving
  • pod_network_receive_packets_dropped_total (cumulative)
    Cumulative count of packets dropped while receiving
  • pod_network_receive_packets_total (cumulative)
    Cumulative count of packets received
  • pod_network_transmit_bytes_total (cumulative)
    Cumulative count of bytes transmitted
  • pod_network_transmit_errors_total (cumulative)
    Cumulative count of errors encountered while transmitting
  • pod_network_transmit_packets_dropped_total (cumulative)
    Cumulative count of packets dropped while transmitting
  • pod_network_transmit_packets_total (cumulative)
    Cumulative count of packets transmitted
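
To make the container_spec_cpu_quota arithmetic described above concrete, here is a sketch of a Kubernetes pod spec with a 500m CPU limit, annotated with the values the monitor would be expected to report. The pod name, container name, and image are hypothetical, and the period of 100,000 is the value assumed in the description above rather than something this manifest sets.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: quota-example        # hypothetical name
spec:
  containers:
    - name: app              # hypothetical name
      image: nginx           # any image; chosen only for illustration
      resources:
        limits:
          cpu: 500m          # 500 millicores = 0.5 of one core
# Assuming container_spec_cpu_period = 100000 (microseconds):
#   container_spec_cpu_quota = 0.5 * 100000 = 50000
```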

Non-default metrics (version 4.7.0+)

The following information applies to agent versions 4.7.0+ that have enableBuiltInFiltering: true set at the top level of the agent config.

To emit metrics that are not default, you can add those metrics in the generic monitor-level extraMetrics config option. Metrics that are derived from specific configuration options that do not appear in the above list of metrics do not need to be added to extraMetrics.

To see a list of metrics that will be emitted you can run agent-status monitors after configuring this monitor in a running agent instance.
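
A minimal sketch of the extraMetrics option described above follows. The two metric names are taken from the list in this document, but whether they are actually non-default is an assumption here; check the list for your agent version before relying on it.

```yaml
enableBuiltInFiltering: true
monitors:
  - type: cadvisor
    extraMetrics:
      # Assumed to be non-default for illustration purposes.
      - container_cpu_cfs_throttled_time
      - container_fs_io_current
```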

Legacy non-default metrics (version < 4.7.0)

The following information only applies to agent versions older than 4.7.0. If you have a newer agent and have set enableBuiltInFiltering: true at the top level of your agent config, see the section above. See upgrade instructions in Old-style whitelist filtering.

If you have a reference to the whitelist.json in your agent's top-level metricsToExclude config option, and you want to emit metrics that are not in that whitelist, then you need to add an item to the top-level metricsToInclude config option to override that whitelist (see Inclusion filtering). Alternatively, you can copy the whitelist.json, modify it, and reference the modified copy in metricsToExclude.
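
As a rough sketch of the override described above, the fragment below adds a metricsToInclude item for one metric from this monitor. It assumes your config already references whitelist.json under metricsToExclude, and that filter items accept metricNames and monitorType fields; confirm the exact field names against the Inclusion filtering documentation for your agent version.

```yaml
metricsToInclude:
  # Field names are assumed from the agent's filtering docs; verify
  # them for your version before use.
  - metricNames:
      - container_cpu_cfs_throttled_time
    monitorType: cadvisor
```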

Dimensions

The following dimensions may occur on metrics emitted by this monitor. Some dimensions may be specific to certain metrics.

| Name | Description |
| --- | --- |
| container_id | The ID of the running container |
| container_image | The container image name |
| container_name | The container's name as it appears in the pod spec, the same as container_spec_name but retained for backwards compatibility. |
| container_spec_name | The container's name as it appears in the pod spec |
| kubernetes_namespace | The K8s namespace the container is part of |
| kubernetes_pod_name | The pod instance under which this container runs |
| kubernetes_pod_uid | The UID of the pod instance under which this container runs |