ecs-metadata

Monitor Type: ecs-metadata (Source)

Accepts Endpoints: No

Multiple Instances Allowed: Yes

Overview

This monitor reads container stats from an ECS Task Metadata Endpoint version 2.

It does not currently support CPU share/quota metrics.

Configuration

To activate this monitor in the Smart Agent, add the following to your agent config:

monitors:  # All monitor config goes under this key
 - type: ecs-metadata
   ...  # Additional config

For a list of monitor options that are common to all monitors, see Common Configuration.

| Config option | Required | Type | Description |
| --- | --- | --- | --- |
| `enableExtraBlockIOMetrics` | no | `bool` | Whether it will send all extra block IO metrics as well. (default: `false`) |
| `enableExtraCPUMetrics` | no | `bool` | Whether it will send all extra CPU metrics as well. (default: `false`) |
| `enableExtraMemoryMetrics` | no | `bool` | Whether it will send all extra memory metrics as well. (default: `false`) |
| `enableExtraNetworkMetrics` | no | `bool` | Whether it will send all extra network metrics as well. (default: `false`) |
| `metadataEndpoint` | no | `string` | The URL of the ECS task metadata. Default is http://169.254.170.2/v2/metadata, which is hardcoded by AWS for version 2. (default: `http://169.254.170.2/v2/metadata`) |
| `statsEndpoint` | no | `string` | The URL of the ECS container stats. Default is http://169.254.170.2/v2/stats, which is hardcoded by AWS for version 2. (default: `http://169.254.170.2/v2/stats`) |
| `timeoutSeconds` | no | `integer` | The maximum amount of time to wait for API requests (default: `5`) |
| `labelsToDimensions` | no | `map of strings` | A mapping of container label names to dimension names. The corresponding label values will become the dimension value for the mapped name. E.g. `io.kubernetes.container.name: container_spec_name` would result in a dimension called `container_spec_name` that has the value of the `io.kubernetes.container.name` container label. |
| `excludedImages` | no | `list of strings` | A list of filters of images to exclude. Supports literals, globs, and regex. |
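
As an illustration of how these options fit together, the following is a minimal sketch of a monitor entry. The option names come from the table above; the label name, dimension name, timeout value, and image filters are placeholder values chosen for the example, not recommendations.

```yaml
monitors:
  - type: ecs-metadata
    # Wait up to 10 seconds for API requests (the default is 5).
    timeoutSeconds: 10
    # Also send the extra CPU and memory metrics.
    enableExtraCPUMetrics: true
    enableExtraMemoryMetrics: true
    # Map a container label to a dimension. The label name here is only an
    # example; use a label that your containers actually carry.
    labelsToDimensions:
      com.amazonaws.ecs.container-name: container_spec_name
    # Exclude containers by image. Both filters are hypothetical:
    # the first is a literal image name, the second is a glob.
    excludedImages:
      - "amazon/amazon-ecs-agent:latest"
      - "*sidecar*"
```

The `metadataEndpoint` and `statsEndpoint` options are left out here because their defaults already point at the version 2 endpoints.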

Metrics

These are the metrics available for this monitor. Metrics that are categorized as container/host (default) are in bold and italics in the list below.

Group blkio

All of the following metrics are part of the blkio metric group. All of the non-default metrics below can be turned on by adding blkio to the monitor config option extraGroups:

  • blkio.io_service_bytes_recursive.async (cumulative)
    Volume, in bytes, of asynchronous block I/O
  • blkio.io_service_bytes_recursive.read (cumulative)
    Volume, in bytes, of reads from block devices
  • blkio.io_service_bytes_recursive.sync (cumulative)
    Volume, in bytes, of synchronous block I/O
  • blkio.io_service_bytes_recursive.total (cumulative)
    Total volume, in bytes, of all block I/O
  • blkio.io_service_bytes_recursive.write (cumulative)
    Volume, in bytes, of writes to block devices
  • blkio.io_serviced_recursive.async (cumulative)
    Number of asynchronous block I/O requests
  • blkio.io_serviced_recursive.read (cumulative)
    Number of read requests from block devices
  • blkio.io_serviced_recursive.sync (cumulative)
    Number of synchronous block I/O requests
  • blkio.io_serviced_recursive.total (cumulative)
    Total number of block I/O requests
  • blkio.io_serviced_recursive.write (cumulative)
    Number of write requests to block devices

Group cpu

All of the following metrics are part of the cpu metric group. All of the non-default metrics below can be turned on by adding cpu to the monitor config option extraGroups:

  • cpu.limit (gauge)
    CPU usage limit of the container, in ECS vCPU units
  • cpu.percent (gauge)
    Percentage of host CPU resources used by the container
  • cpu.percpu.usage (cumulative)
    Jiffies of CPU time spent by the container, per CPU core
  • cpu.throttling_data.periods (cumulative)
    Number of enforcement periods that have elapsed
  • cpu.throttling_data.throttled_periods (cumulative)
    Number of periods throttled
  • cpu.throttling_data.throttled_time (cumulative)
    Throttling time in nanoseconds
  • cpu.usage.kernelmode (cumulative)
    Jiffies of CPU time spent in kernel mode by the container
  • cpu.usage.system (cumulative)
    Jiffies of CPU time used by the system
  • cpu.usage.total (cumulative)
    Jiffies of CPU time used by the container
  • cpu.usage.usermode (cumulative)
    Jiffies of CPU time spent in user mode by the container

Group memory

All of the following metrics are part of the memory metric group. All of the non-default metrics below can be turned on by adding memory to the monitor config option extraGroups:

  • memory.percent (gauge)
    Percent of memory (0-100) used by the container relative to its limit (excludes page cache usage)
  • memory.stats.swap (gauge)
    Bytes of swap memory used by container
  • memory.usage.limit (gauge)
    Memory usage limit of the container, in bytes
  • memory.usage.max (gauge)
    Maximum measured memory usage of the container, in bytes
  • memory.usage.total (gauge)
    Bytes of memory used by the container

Group network

All of the following metrics are part of the network metric group. All of the non-default metrics below can be turned on by adding network to the monitor config option extraGroups:

  • network.usage.rx_bytes (cumulative)
    Bytes received by the container via its network interface
  • network.usage.rx_dropped (cumulative)
    Number of inbound network packets dropped by the container
  • network.usage.rx_errors (cumulative)
    Errors receiving network packets
  • network.usage.rx_packets (cumulative)
    Network packets received by the container via its network interface
  • network.usage.tx_bytes (cumulative)
    Bytes sent by the container via its network interface
  • network.usage.tx_dropped (cumulative)
    Number of outbound network packets dropped by the container
  • network.usage.tx_errors (cumulative)
    Errors sending network packets
  • network.usage.tx_packets (cumulative)
    Network packets sent by the container via its network interface
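
Each group section above refers to the `extraGroups` monitor option. A minimal sketch of enabling whole groups might look like the following; the choice of `blkio` and `network` is arbitrary and only for illustration.

```yaml
monitors:
  - type: ecs-metadata
    # Turn on every non-default metric in the listed groups.
    extraGroups:
      - blkio
      - network
```

Listing a group is a shortcut for enabling all of its non-default metrics at once, rather than naming them one by one.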

Non-default metrics (version 4.7.0+)

The following information applies to agent versions 4.7.0 and later that have enableBuiltInFiltering: true set at the top level of the agent config.

To emit metrics that are not default, add them to the generic monitor-level extraMetrics config option. Metrics that are derived from specific configuration options and do not appear in the above list of metrics do not need to be added to extraMetrics.

To see the list of metrics that will be emitted, run agent-status monitors after configuring this monitor in a running agent instance.
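
A minimal sketch of this setup is shown below. It assumes that `cpu.percpu.usage` and `memory.stats.swap` are non-default in your agent version; check the metric list above before relying on that.

```yaml
# Top-level agent setting described in this section.
enableBuiltInFiltering: true

monitors:
  - type: ecs-metadata
    # Individual non-default metrics to emit in addition to the defaults.
    # These two names are taken from the metric list above and are assumed
    # to be non-default.
    extraMetrics:
      - cpu.percpu.usage
      - memory.stats.swap
```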

Legacy non-default metrics (version < 4.7.0)

The following information applies only to agent versions older than 4.7.0. If you have a newer agent and have set enableBuiltInFiltering: true at the top level of your agent config, see the section above. See upgrade instructions in Old-style whitelist filtering.

If your agent's top-level metricsToExclude config option references whitelist.json and you want to emit metrics that are not in that whitelist, add an item to the top-level metricsToInclude config option to override the whitelist (see Inclusion filtering). Alternatively, copy whitelist.json, modify it, and reference the modified copy in metricsToExclude.
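
The override could be sketched roughly as follows. The path to `whitelist.json`, the `#from` reference form, and the shape of the `metricsToInclude` item are assumptions made for this example; compare them with your existing agent config and the Inclusion filtering documentation before using them.

```yaml
# Existing whitelist reference (path and reference form are assumed).
metricsToExclude:
  - {"#from": "/lib/whitelist.json", flatten: true}

# Override the whitelist for one metric from this monitor.
# The metric name is only an example taken from the list above.
metricsToInclude:
  - metricNames:
      - cpu.percpu.usage
    monitorType: ecs-metadata

monitors:
  - type: ecs-metadata
```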