traefik

Monitor Type: traefik (Source)

Accepts Endpoints: Yes

Multiple Instances Allowed: Yes

Overview

Traefik is an open-source HTTP reverse proxy and load balancer. Traefik exports Prometheus metrics that can be scraped by the SignalFx Smart Agent. These metrics fall into three categories: Traefik-related, entrypoint-related, and backend-related. The Traefik-related metrics are prefixed by go_ and process_, the entrypoint-related metrics by traefik_entrypoint_, and the backend-related metrics by traefik_backend_.

The Traefik-related metrics are for monitoring Traefik itself. For instance, the go_memstats_sys_bytes metric can be used to plot Traefik memory usage. The entrypoint-related and backend-related metrics report the number and duration of requests measured at entrypoints and backends. They can be used to compute measurements such as the average request duration.
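As a sketch of how an average request duration can be derived from these counters (the symbols below are illustrative and not part of the monitor), divide the increase in a _sum metric by the increase in the matching _count metric over the same interval:

```latex
% S(t): traefik_entrypoint_request_duration_seconds_sum at time t
% C(t): traefik_entrypoint_request_duration_seconds_count at time t
\bar{d}_{[t_1, t_2]} = \frac{S(t_2) - S(t_1)}{C(t_2) - C(t_1)},
\qquad C(t_2) > C(t_1)
```

The same ratio applies to the traefik_backend_ variants. The result is in seconds and is undefined for intervals in which no requests were measured.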

Requirements and Dependencies

| Software | Version |
| --- | --- |
| signalfx-agent | 4.7.0+ |

Traefik Configuration

Edit the Traefik configuration file, typically traefik.toml, so that Traefik exposes Prometheus metrics at an endpoint. The endpoint is served at path /metrics by default. When running the Traefik binary, the configuration file is typically passed in as a command-line argument. For example:

./traefik -c traefik.toml

When running the Traefik Docker image, the configuration file is instead mounted into the container at /etc/traefik/traefik.toml. For example:

docker run -d -p 8080:8080 -p 80:80 -v $PWD/traefik.toml:/etc/traefik/traefik.toml traefik

If a Traefik configuration file is not available, use the sample configuration file here to get started.

See here for complete Traefik docs.

Smart Agent Configuration

SignalFx Smart Agent docs can be found here. Choose the deployment-specific configuration instructions here. The SignalFx Smart Agent must have network access to Traefik.

Below is an example configuration that enables the traefik monitor. With this configuration, the monitor scrapes Prometheus metrics at the default path /metrics on port 8080, adds the dimension metric_source=traefik to the metrics, and exports them to SignalFx.

monitors:
- type: traefik
  discoveryRule: port == 8080
  extraDimensions:
    metric_source: traefik

Configuration

To activate this monitor in the Smart Agent, add the following to your agent config:

monitors:  # All monitor config goes under this key
 - type: traefik
   ...  # Additional config

For a list of monitor options that are common to all monitors, see Common Configuration.

| Config option | Required | Type | Description |
| --- | --- | --- | --- |
| host | yes | string | Host of the exporter |
| port | yes | integer | Port of the exporter |
| username | no | string | Basic Auth username to use on each request, if any. |
| password | no | string | Basic Auth password to use on each request, if any. |
| useHTTPS | no | bool | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (default: false) |
| skipVerify | no | bool | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (default: false) |
| metricPath | no | string | Path to the metrics endpoint on the exporter server, usually /metrics (the default). (default: /metrics) |
| sendAllMetrics | no | bool | Send all the metrics that come out of the Prometheus exporter without any filtering. This option has no effect when using the prometheus exporter monitor directly since there is no built-in filtering, only when embedding it in other monitors. (default: false) |
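
As a sketch of how the options above fit together, the following configures the monitor with a static host and port instead of a discovery rule, connecting over HTTPS with Basic Auth. The hostname, port, and credentials are placeholders chosen for illustration and do not come from any real deployment.

```yaml
monitors:
  - type: traefik
    # Placeholder address of the Traefik instance that exposes Prometheus metrics
    host: traefik.example.internal
    port: 8080
    # Connect over HTTPS and keep certificate verification on
    useHTTPS: true
    skipVerify: false
    # Placeholder Basic Auth credentials
    username: metrics-reader
    password: changeme
    # The default path, written out explicitly
    metricPath: /metrics
```

Setting skipVerify to true would only be reasonable for testing against a self-signed certificate, since it disables verification of the exporter's TLS cert.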

Metrics

These are the metrics available for this monitor. Metrics that are categorized as container/host (default) are in bold and italics in the list below.

  • go_gc_duration_seconds (cumulative)
    A summary of the GC invocation durations.
  • go_gc_duration_seconds_count (cumulative)
    A count of the GC invocation durations.
  • go_gc_duration_seconds_sum (cumulative)
    The sum of the GC invocation durations, in seconds.
  • go_goroutines (gauge)
    Number of goroutines that currently exist.
  • go_memstats_alloc_bytes (gauge)
    Number of bytes allocated and still in use.
  • go_memstats_alloc_bytes_total (cumulative)
    Total number of bytes allocated, even if freed.
  • go_memstats_buck_hash_sys_bytes (gauge)
    Number of bytes used by the profiling bucket hash table.
  • go_memstats_frees_total (cumulative)
    Total number of frees.
  • go_memstats_gc_cpu_fraction (gauge)
    The fraction of this program's available CPU time used by the GC since the program started.
  • go_memstats_gc_sys_bytes (gauge)
    Number of bytes used for garbage collection system metadata.
  • go_memstats_heap_alloc_bytes (gauge)
    Number of heap bytes allocated and still in use.
  • go_memstats_heap_idle_bytes (gauge)
    Number of heap bytes waiting to be used.
  • go_memstats_heap_inuse_bytes (gauge)
    Number of heap bytes that are in use.
  • go_memstats_heap_objects (gauge)
    Number of allocated objects.
  • go_memstats_heap_released_bytes (gauge)
    Number of heap bytes released to OS.
  • go_memstats_heap_sys_bytes (gauge)
    Number of heap bytes obtained from system.
  • go_memstats_last_gc_time_seconds (gauge)
    Time of the last garbage collection, in seconds since the Unix epoch.
  • go_memstats_lookups_total (cumulative)
    Total number of pointer lookups.
  • go_memstats_mallocs_total (cumulative)
    Total number of mallocs.
  • go_memstats_mcache_inuse_bytes (gauge)
    Number of bytes in use by mcache structures.
  • go_memstats_mcache_sys_bytes (gauge)
    Number of bytes used for mcache structures obtained from system.
  • go_memstats_mspan_inuse_bytes (gauge)
    Number of bytes in use by mspan structures.
  • go_memstats_mspan_sys_bytes (gauge)
    Number of bytes used for mspan structures obtained from system.
  • go_memstats_next_gc_bytes (gauge)
    Number of heap bytes when next garbage collection will take place.
  • go_memstats_other_sys_bytes (gauge)
    Number of bytes used for other system allocations.
  • go_memstats_stack_inuse_bytes (gauge)
    Number of bytes in use by the stack allocator.
  • go_memstats_stack_sys_bytes (gauge)
    Number of bytes obtained from system for stack allocator.
  • go_memstats_sys_bytes (gauge)
    Number of bytes obtained from system.
  • go_threads (gauge)
    Number of OS threads created.
  • process_cpu_seconds_total (cumulative)
    Total user and system CPU time spent, in seconds.
  • process_max_fds (gauge)
    Maximum number of open file descriptors.
  • process_open_fds (gauge)
    Number of open file descriptors.
  • process_resident_memory_bytes (gauge)
    Resident memory size in bytes.
  • process_start_time_seconds (gauge)
    Start time of the process, in seconds since the Unix epoch.
  • process_virtual_memory_bytes (gauge)
    Virtual memory size in bytes.
  • traefik_backend_open_connections (gauge)
    How many open connections exist on a backend, partitioned by method and protocol.
  • traefik_backend_request_duration_seconds_bucket (cumulative)
    The number of requests whose duration fell at or below a configured bucket boundary. The request durations are measured at a backend in seconds. This value is partitioned by status code, protocol, and method.
  • traefik_backend_request_duration_seconds_count (cumulative)
    The number of request durations that were measured on a backend. The values are partitioned by status code, protocol, and method.
  • traefik_backend_request_duration_seconds_sum (cumulative)
    The sum of the request durations in seconds, measured on a backend, partitioned by status code, protocol, and method.
  • traefik_backend_requests_total (cumulative)
    How many HTTP requests were processed on a backend, partitioned by status code, protocol, and method.
  • traefik_backend_server_up (gauge)
    Backend server is up, described by gauge value of 0 (down) or 1 (up).
  • traefik_config_last_reload_failure (gauge)
    Unix timestamp of the last failed config reload.
  • traefik_config_last_reload_success (gauge)
    Unix timestamp of the last successful config reload.
  • traefik_config_reloads_failure_total (cumulative)
    Total number of config reloads that failed.
  • traefik_config_reloads_total (cumulative)
    Total number of config reloads.
  • traefik_entrypoint_open_connections (gauge)
    How many open connections exist on an entrypoint, partitioned by method and protocol.
  • traefik_entrypoint_request_duration_seconds_bucket (cumulative)
    The number of requests whose duration fell at or below a configured bucket boundary. The request durations are measured at an entrypoint in seconds. This value is partitioned by status code, protocol, and method.
  • traefik_entrypoint_request_duration_seconds_count (cumulative)
    The number of request durations that were measured on an entrypoint. The values are partitioned by status code, protocol, and method.
  • traefik_entrypoint_request_duration_seconds_sum (cumulative)
    The sum of the request durations in seconds measured on an entrypoint, partitioned by status code, protocol, and method.
  • traefik_entrypoint_requests_total (cumulative)
    How many HTTP requests were processed on an entrypoint, partitioned by status code, protocol, and method.

Non-default metrics (version 4.7.0+)

The following information applies to agent version 4.7.0+ with enableBuiltInFiltering: true set at the top level of the agent config.

To emit metrics that are not default, add them to the generic monitor-level extraMetrics config option. Metrics that are derived from specific configuration options and do not appear in the above list of metrics do not need to be added to extraMetrics.

To see a list of the metrics that will be emitted, run agent-status monitors after configuring this monitor in a running agent instance.
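
A minimal sketch of the extraMetrics option described above, assuming built-in filtering is enabled. The two metric names are taken from the list in the Metrics section and are chosen only for illustration; whether they are already emitted by default depends on your agent version.

```yaml
# Top-level agent setting that turns on built-in filtering
enableBuiltInFiltering: true

monitors:
  - type: traefik
    discoveryRule: port == 8080
    # Metrics to emit in addition to the default set (illustrative choice)
    extraMetrics:
      - go_memstats_heap_alloc_bytes
      - traefik_backend_open_connections
```

Running agent-status monitors afterwards is a way to check that the added metrics are actually being emitted.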

Legacy non-default metrics (version < 4.7.0)

The following information applies only to agent versions older than 4.7.0. If you have a newer agent and have set enableBuiltInFiltering: true at the top level of your agent config, see the section above. See upgrade instructions in Old-style whitelist filtering.

If your agent's top-level metricsToExclude config option references whitelist.json, and you want to emit metrics that are not in that whitelist, add an item to the top-level metricsToInclude config option to override the whitelist (see Inclusion filtering). Alternatively, copy whitelist.json, modify it, and reference the copy in metricsToExclude.
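
A sketch of a metricsToInclude override for the legacy whitelist case above. It assumes the filter item fields metricNames and monitorType from the agent's filtering syntax, so check them against the Inclusion filtering documentation for your agent version. The metric name is an illustrative pick from the Metrics list.

```yaml
# Top-level option: items here are emitted even if the whitelist excludes them
metricsToInclude:
  - monitorType: traefik   # assumed field: limits the override to this monitor
    metricNames:           # assumed field: metrics to let through
      - go_memstats_heap_alloc_bytes
```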