diff --git a/activemq/SMART_AGENT_MONITOR.md b/activemq/SMART_AGENT_MONITOR.md index 44de036db..a4bc3fe43 100644 --- a/activemq/SMART_AGENT_MONITOR.md +++ b/activemq/SMART_AGENT_MONITOR.md @@ -12,8 +12,8 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitor `collectd/activemq`. -Below is an overview of that monitor.** +This integration primarily consists of the Smart Agent monitor `collectd/activemq`. +Below is an overview of that monitor. ### Smart Agent Monitor @@ -55,7 +55,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | @@ -166,15 +166,15 @@ monitors` after configuring this monitor in a running agent instance. ### Legacy non-default metrics (version < 4.7.0) -**The following information only applies to agent version older than 4.7.0. If +**The following information only applies to agent versions prior to 4.7.0. If you have a newer agent and have set `enableBuiltInFiltering: true` at the top level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).** +[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).** If you have a reference to the `whitelist.json` in your agent's top-level `metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just +that allow list, then you need to add an item to the top-level +`metricsToInclude` config option to override that allow list (see [Inclusion +filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just copy the whitelist.json, modify it, and reference that in `metricsToExclude`. diff --git a/cassandra/SMART_AGENT_MONITOR.md b/cassandra/SMART_AGENT_MONITOR.md index 4a4460ef2..149bf3041 100644 --- a/cassandra/SMART_AGENT_MONITOR.md +++ b/cassandra/SMART_AGENT_MONITOR.md @@ -12,15 +12,15 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitor `collectd/cassandra`. -Below is an overview of that monitor.** +This integration primarily consists of the Smart Agent monitor `collectd/cassandra`. +Below is an overview of that monitor. ### Smart Agent Monitor Monitors Cassandra using the Collectd GenericJMX plugin. 
This is essentially a wrapper around the -[collectd-genericjmx](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/./collectd-genericjmx.md)[](sfx_link:java) monitor that comes with a +[collectd-genericjmx](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/./collectd-genericjmx.md)[](sfx_link:java) monitor that comes with a set of predefined MBean definitions that a standard Cassandra deployment will expose. @@ -47,7 +47,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | @@ -97,6 +97,18 @@ Metrics that are categorized as These are the metrics available for this integration. + - `counter.cassandra.ClientRequest.CASRead.Latency.Count` (*cumulative*)
Count of transactional read operations since server start. + - `counter.cassandra.ClientRequest.CASRead.TotalLatency.Count` (*cumulative*)
The total number of microseconds elapsed in servicing client transactional read requests. + + It can be divided by `counter.cassandra.ClientRequest.CASRead.Latency.Count` + to find the real time transactional read latency. + + - `counter.cassandra.ClientRequest.CASWrite.Latency.Count` (*cumulative*)
Count of transactional write operations since server start. + - `counter.cassandra.ClientRequest.CASWrite.TotalLatency.Count` (*cumulative*)
The total number of microseconds elapsed in servicing client transactional write requests. + + It can be divided by `counter.cassandra.ClientRequest.CASWrite.Latency.Count` + to find the real time transactional write latency. + - ***`counter.cassandra.ClientRequest.RangeSlice.Latency.Count`*** (*cumulative*)
Count of range slice operations since server start. This typically indicates a server overload condition. If this value is increasing across the cluster then the cluster is too small for the application range slice load. @@ -114,6 +126,7 @@ These are the metrics available for this integration. - one or more clients are directing more load to this server than the others - the server is experiencing hardware or software issues and may require maintenance. + - `counter.cassandra.ClientRequest.RangeSlice.TotalLatency.Count` (*cumulative*)
The total number of microseconds elapsed in servicing range slice requests. - ***`counter.cassandra.ClientRequest.RangeSlice.Unavailables.Count`*** (*cumulative*)
Count of range slice unavailables since server start. A non-zero value means that insufficient replicas were available to fulfil a range slice request at the requested consistency level. @@ -121,7 +134,7 @@ These are the metrics available for this integration. This typically means that one or more nodes are down. To fix this condition, any down nodes must be restarted, or removed from the cluster. - - ***`counter.cassandra.ClientRequest.Read.Latency.Count`*** (*cumulative*)
Count of read operations since server start + - ***`counter.cassandra.ClientRequest.Read.Latency.Count`*** (*cumulative*)
Count of read operations since server start. - ***`counter.cassandra.ClientRequest.Read.Timeouts.Count`*** (*cumulative*)
Count of read timeouts since server start. This typically indicates a server overload condition. If this value is increasing across the cluster then the cluster is too small for the application read load. @@ -130,6 +143,11 @@ These are the metrics available for this integration. - one or more clients are directing more load to this server than the others - the server is experiencing hardware or software issues and may require maintenance. + - `counter.cassandra.ClientRequest.Read.TotalLatency.Count` (*cumulative*)
The total number of microseconds elapsed in servicing client read requests. + + It can be divided by `counter.cassandra.ClientRequest.Read.Latency.Count` + to find the real time read latency. + - ***`counter.cassandra.ClientRequest.Read.Unavailables.Count`*** (*cumulative*)
Count of read unavailables since server start. A non-zero value means that insufficient replicas were available to fulfil a read request at the requested consistency level. This typically means that one or more @@ -145,6 +163,11 @@ These are the metrics available for this integration. - one or more clients are directing more load to this server than the others - the server is experiencing hardware or software issues and may require maintenance. + - `counter.cassandra.ClientRequest.Write.TotalLatency.Count` (*cumulative*)
The total number of microseconds elapsed in servicing client write requests. + + It can be divided by `counter.cassandra.ClientRequest.Write.Latency.Count` + to find the real time write latency. + - ***`counter.cassandra.ClientRequest.Write.Unavailables.Count`*** (*cumulative*)
Count of write unavailables since server start. A non-zero value means that insufficient replicas were available to fulfil a write request at the requested consistency level. @@ -157,6 +180,34 @@ These are the metrics available for this integration. not increase steadily over time then the node may be experiencing problems completing compaction operations. + - `counter.cassandra.Storage.Exceptions.Count` (*cumulative*)
Number of internal exceptions caught. Under normal conditions this should be zero. + + - ***`counter.cassandra.Storage.Load.Count`*** (*cumulative*)
Storage used for Cassandra data in bytes. Use this metric to see how much storage is being used for data by a Cassandra node. + + The value of this metric is influenced by: + - Total data stored into the database + - compaction behavior + + - `counter.cassandra.Storage.TotalHints.Count` (*cumulative*)
Total hints since node start. Indicates that write operations cannot be + delivered to a node, usually because a node is down. If this value is + increasing and all nodes are up then there may be some connectivity + issue between nodes in the cluster. + + - ***`counter.cassandra.Storage.TotalHintsInProgress.Count`*** (*cumulative*)
Total pending hints. Indicates that write operations cannot be + delivered to a node, usually because a node is down. If this value is + increasing and all nodes are up then there may be some connectivity + issue between nodes in the cluster. + + - `gauge.cassandra.ClientRequest.CASRead.Latency.50thPercentile` (*gauge*)
50th percentile (median) of Cassandra transactional read latency. + + - `gauge.cassandra.ClientRequest.CASRead.Latency.99thPercentile` (*gauge*)
99th percentile of Cassandra transactional read latency. + + - `gauge.cassandra.ClientRequest.CASRead.Latency.Max` (*gauge*)
Maximum Cassandra transactional read latency. + - `gauge.cassandra.ClientRequest.CASWrite.Latency.50thPercentile` (*gauge*)
50th percentile (median) of Cassandra transactional write latency. + + - `gauge.cassandra.ClientRequest.CASWrite.Latency.99thPercentile` (*gauge*)
99th percentile of Cassandra transactional write latency. + + - `gauge.cassandra.ClientRequest.CASWrite.Latency.Max` (*gauge*)
Maximum Cassandra transactional write latency. - `gauge.cassandra.ClientRequest.RangeSlice.Latency.50thPercentile` (*gauge*)
50th percentile (median) of Cassandra range slice latency. This value should be similar across all nodes in the cluster. If some nodes have higher values than the rest of the cluster then they may have more connected clients @@ -167,7 +218,7 @@ These are the metrics available for this integration. the rest of the cluster then they may have more connected clients or may be experiencing heavier than usual compaction load. - - `gauge.cassandra.ClientRequest.RangeSlice.Latency.Max` (*gauge*)
Maximum Cassandra range slice latency + - `gauge.cassandra.ClientRequest.RangeSlice.Latency.Max` (*gauge*)
Maximum Cassandra range slice latency. - ***`gauge.cassandra.ClientRequest.Read.Latency.50thPercentile`*** (*gauge*)
50th percentile (median) of Cassandra read latency. This value should be similar across all nodes in the cluster. If some nodes have higher values than the rest of the cluster then they may have more connected @@ -178,7 +229,7 @@ These are the metrics available for this integration. the rest of the cluster then they may have more connected clients or may be experiencing heavier than usual compaction load. - - ***`gauge.cassandra.ClientRequest.Read.Latency.Max`*** (*gauge*)
Maximum Cassandra read latency + - ***`gauge.cassandra.ClientRequest.Read.Latency.Max`*** (*gauge*)
Maximum Cassandra read latency. - ***`gauge.cassandra.ClientRequest.Write.Latency.50thPercentile`*** (*gauge*)
50th percentile (median) of Cassandra write latency. This value should be similar across all nodes in the cluster. If some nodes have higher values than the rest of the cluster then they may have more connected @@ -194,22 +245,6 @@ These are the metrics available for this integration. continually increasing then the node may be experiencing problems completing compaction operations. - - ***`gauge.cassandra.Storage.Load.Count`*** (*gauge*)
Storage used for Cassandra data in bytes. Use this metric to see how much storage is being used for data by a Cassandra node. - - The value of this metric is influenced by: - - Total data stored into the database - - compaction behavior - - - `gauge.cassandra.Storage.TotalHints.Count` (*gauge*)
Total hints since node start. Indicates that write operations cannot be - delivered to a node, usually because a node is down. If this value is - increasing and all nodes are up then there may be some connectivity - issue between nodes in the cluster. - - - ***`gauge.cassandra.Storage.TotalHintsInProgress.Count`*** (*gauge*)
Total pending hints. Indicates that write operations cannot be - delivered to a node, usually because a node is down. If this value is - increasing and all nodes are up then there may be some connectivity - issue between nodes in the cluster. - #### Group jvm All of the following metrics are part of the `jvm` metric group. All of @@ -239,15 +274,15 @@ monitors` after configuring this monitor in a running agent instance. ### Legacy non-default metrics (version < 4.7.0) -**The following information only applies to agent version older than 4.7.0. If +**The following information only applies to agent versions prior to 4.7.0. If you have a newer agent and have set `enableBuiltInFiltering: true` at the top level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).** +[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).** If you have a reference to the `whitelist.json` in your agent's top-level `metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just +that allow list, then you need to add an item to the top-level +`metricsToInclude` config option to override that allow list (see [Inclusion +filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
diff --git a/cassandra/metrics.yaml b/cassandra/metrics.yaml index 4f155691b..503041818 100644 --- a/cassandra/metrics.yaml +++ b/cassandra/metrics.yaml @@ -1,5 +1,51 @@ # This file was generated in the Smart Agent repo and copied here, DO NOT EDIT HERE. +counter.cassandra.ClientRequest.CASRead.Latency.Count: + brief: Count of transactional read operations since server start + custom: true + description: Count of transactional read operations since server start. + metric_type: cumulative + monitor: collectd/cassandra + title: counter.cassandra.ClientRequest.CASRead.Latency.Count + +counter.cassandra.ClientRequest.CASRead.TotalLatency.Count: + brief: The total number of microseconds elapsed in servicing client transactional + read requests + custom: true + description: 'The total number of microseconds elapsed in servicing client transactional + read requests. + + + It can be divided by `counter.cassandra.ClientRequest.CASRead.Latency.Count` + + to find the real time transactional read latency.' + metric_type: cumulative + monitor: collectd/cassandra + title: counter.cassandra.ClientRequest.CASRead.TotalLatency.Count + +counter.cassandra.ClientRequest.CASWrite.Latency.Count: + brief: Count of transactional write operations since server start + custom: true + description: Count of transactional write operations since server start. + metric_type: cumulative + monitor: collectd/cassandra + title: counter.cassandra.ClientRequest.CASWrite.Latency.Count + +counter.cassandra.ClientRequest.CASWrite.TotalLatency.Count: + brief: The total number of microseconds elapsed in servicing client transactional + write requests + custom: true + description: 'The total number of microseconds elapsed in servicing client transactional + write requests. + + + It can be divided by `counter.cassandra.ClientRequest.CASWrite.Latency.Count` + + to find the real time transactional write latency.' 
+ metric_type: cumulative + monitor: collectd/cassandra + title: counter.cassandra.ClientRequest.CASWrite.TotalLatency.Count + counter.cassandra.ClientRequest.RangeSlice.Latency.Count: brief: Count of range slice operations since server start custom: false @@ -43,6 +89,14 @@ counter.cassandra.ClientRequest.RangeSlice.Timeouts.Count: monitor: collectd/cassandra title: counter.cassandra.ClientRequest.RangeSlice.Timeouts.Count +counter.cassandra.ClientRequest.RangeSlice.TotalLatency.Count: + brief: The total number of microseconds elapsed in servicing range slice requests + custom: true + description: The total number of microseconds elapsed in servicing range slice requests. + metric_type: cumulative + monitor: collectd/cassandra + title: counter.cassandra.ClientRequest.RangeSlice.TotalLatency.Count + counter.cassandra.ClientRequest.RangeSlice.Unavailables.Count: brief: Count of range slice unavailables since server start custom: false @@ -63,7 +117,7 @@ counter.cassandra.ClientRequest.RangeSlice.Unavailables.Count: counter.cassandra.ClientRequest.Read.Latency.Count: brief: Count of read operations since server start custom: false - description: Count of read operations since server start + description: Count of read operations since server start. metric_type: cumulative monitor: collectd/cassandra title: counter.cassandra.ClientRequest.Read.Latency.Count @@ -89,6 +143,20 @@ counter.cassandra.ClientRequest.Read.Timeouts.Count: monitor: collectd/cassandra title: counter.cassandra.ClientRequest.Read.Timeouts.Count +counter.cassandra.ClientRequest.Read.TotalLatency.Count: + brief: The total number of microseconds elapsed in servicing client read requests + custom: true + description: 'The total number of microseconds elapsed in servicing client read + requests. + + + It can be divided by `counter.cassandra.ClientRequest.Read.Latency.Count` + + to find the real time read latency.' 
+ metric_type: cumulative + monitor: collectd/cassandra + title: counter.cassandra.ClientRequest.Read.TotalLatency.Count + counter.cassandra.ClientRequest.Read.Unavailables.Count: brief: Count of read unavailables since server start custom: false @@ -134,6 +202,20 @@ counter.cassandra.ClientRequest.Write.Timeouts.Count: monitor: collectd/cassandra title: counter.cassandra.ClientRequest.Write.Timeouts.Count +counter.cassandra.ClientRequest.Write.TotalLatency.Count: + brief: The total number of microseconds elapsed in servicing client write requests + custom: true + description: 'The total number of microseconds elapsed in servicing client write + requests. + + + It can be divided by `counter.cassandra.ClientRequest.Write.Latency.Count` + + to find the real time write latency.' + metric_type: cumulative + monitor: collectd/cassandra + title: counter.cassandra.ClientRequest.Write.TotalLatency.Count + counter.cassandra.ClientRequest.Write.Unavailables.Count: brief: Count of write unavailables since server start custom: false @@ -165,6 +247,108 @@ counter.cassandra.Compaction.TotalCompactionsCompleted.Count: monitor: collectd/cassandra title: counter.cassandra.Compaction.TotalCompactionsCompleted.Count +counter.cassandra.Storage.Exceptions.Count: + brief: Number of internal exceptions caught + custom: true + description: Number of internal exceptions caught. Under normal conditions this + should be zero. + metric_type: cumulative + monitor: collectd/cassandra + title: counter.cassandra.Storage.Exceptions.Count + +counter.cassandra.Storage.Load.Count: + brief: Storage used for Cassandra data in bytes + custom: false + description: 'Storage used for Cassandra data in bytes. Use this metric to see how + much storage is being used for data by a Cassandra node. 
+ + + The value of this metric is influenced by: + + - Total data stored into the database + + - compaction behavior' + metric_type: cumulative + monitor: collectd/cassandra + title: counter.cassandra.Storage.Load.Count + +counter.cassandra.Storage.TotalHints.Count: + brief: Total hints since node start + custom: true + description: 'Total hints since node start. Indicates that write operations cannot + be + + delivered to a node, usually because a node is down. If this value is + + increasing and all nodes are up then there may be some connectivity + + issue between nodes in the cluster.' + metric_type: cumulative + monitor: collectd/cassandra + title: counter.cassandra.Storage.TotalHints.Count + +counter.cassandra.Storage.TotalHintsInProgress.Count: + brief: Total pending hints + custom: false + description: 'Total pending hints. Indicates that write operations cannot be + + delivered to a node, usually because a node is down. If this value is + + increasing and all nodes are up then there may be some connectivity + + issue between nodes in the cluster.' + metric_type: cumulative + monitor: collectd/cassandra + title: counter.cassandra.Storage.TotalHintsInProgress.Count + +gauge.cassandra.ClientRequest.CASRead.Latency.50thPercentile: + brief: 50th percentile (median) of Cassandra transactional read latency + custom: true + description: 50th percentile (median) of Cassandra transactional read latency. + metric_type: gauge + monitor: collectd/cassandra + title: gauge.cassandra.ClientRequest.CASRead.Latency.50thPercentile + +gauge.cassandra.ClientRequest.CASRead.Latency.99thPercentile: + brief: 99th percentile of Cassandra transactional read latency + custom: true + description: 99th percentile of Cassandra transactional read latency. 
+ metric_type: gauge + monitor: collectd/cassandra + title: gauge.cassandra.ClientRequest.CASRead.Latency.99thPercentile + +gauge.cassandra.ClientRequest.CASRead.Latency.Max: + brief: Maximum Cassandra transactional read latency + custom: true + description: Maximum Cassandra transactional read latency. + metric_type: gauge + monitor: collectd/cassandra + title: gauge.cassandra.ClientRequest.CASRead.Latency.Max + +gauge.cassandra.ClientRequest.CASWrite.Latency.50thPercentile: + brief: 50th percentile (median) of Cassandra transactional write latency + custom: true + description: 50th percentile (median) of Cassandra transactional write latency. + metric_type: gauge + monitor: collectd/cassandra + title: gauge.cassandra.ClientRequest.CASWrite.Latency.50thPercentile + +gauge.cassandra.ClientRequest.CASWrite.Latency.99thPercentile: + brief: 99th percentile of Cassandra transactional write latency + custom: true + description: 99th percentile of Cassandra transactional write latency. + metric_type: gauge + monitor: collectd/cassandra + title: gauge.cassandra.ClientRequest.CASWrite.Latency.99thPercentile + +gauge.cassandra.ClientRequest.CASWrite.Latency.Max: + brief: Maximum Cassandra transactional write latency + custom: true + description: Maximum Cassandra transactional write latency. + metric_type: gauge + monitor: collectd/cassandra + title: gauge.cassandra.ClientRequest.CASWrite.Latency.Max + gauge.cassandra.ClientRequest.RangeSlice.Latency.50thPercentile: brief: 50th percentile (median) of Cassandra range slice latency custom: true @@ -197,7 +381,7 @@ gauge.cassandra.ClientRequest.RangeSlice.Latency.99thPercentile: gauge.cassandra.ClientRequest.RangeSlice.Latency.Max: brief: Maximum Cassandra range slice latency custom: true - description: Maximum Cassandra range slice latency + description: Maximum Cassandra range slice latency. 
metric_type: gauge monitor: collectd/cassandra title: gauge.cassandra.ClientRequest.RangeSlice.Latency.Max @@ -233,7 +417,7 @@ gauge.cassandra.ClientRequest.Read.Latency.99thPercentile: gauge.cassandra.ClientRequest.Read.Latency.Max: brief: Maximum Cassandra read latency custom: false - description: Maximum Cassandra read latency + description: Maximum Cassandra read latency. metric_type: gauge monitor: collectd/cassandra title: gauge.cassandra.ClientRequest.Read.Latency.Max @@ -286,51 +470,6 @@ gauge.cassandra.Compaction.PendingTasks.Value: monitor: collectd/cassandra title: gauge.cassandra.Compaction.PendingTasks.Value -gauge.cassandra.Storage.Load.Count: - brief: Storage used for Cassandra data in bytes - custom: false - description: 'Storage used for Cassandra data in bytes. Use this metric to see how - much storage is being used for data by a Cassandra node. - - - The value of this metric is influenced by: - - - Total data stored into the database - - - compaction behavior' - metric_type: gauge - monitor: collectd/cassandra - title: gauge.cassandra.Storage.Load.Count - -gauge.cassandra.Storage.TotalHints.Count: - brief: Total hints since node start - custom: true - description: 'Total hints since node start. Indicates that write operations cannot - be - - delivered to a node, usually because a node is down. If this value is - - increasing and all nodes are up then there may be some connectivity - - issue between nodes in the cluster.' - metric_type: gauge - monitor: collectd/cassandra - title: gauge.cassandra.Storage.TotalHints.Count - -gauge.cassandra.Storage.TotalHintsInProgress.Count: - brief: Total pending hints - custom: false - description: 'Total pending hints. Indicates that write operations cannot be - - delivered to a node, usually because a node is down. If this value is - - increasing and all nodes are up then there may be some connectivity - - issue between nodes in the cluster.' 
- metric_type: gauge - monitor: collectd/cassandra - title: gauge.cassandra.Storage.TotalHintsInProgress.Count - gauge.jvm.threads.count: brief: Number of JVM threads custom: false diff --git a/collectd-cpu/SMART_AGENT_MONITOR.md b/collectd-cpu/SMART_AGENT_MONITOR.md index ac87ee76b..9d100b576 100644 --- a/collectd-cpu/SMART_AGENT_MONITOR.md +++ b/collectd-cpu/SMART_AGENT_MONITOR.md @@ -12,12 +12,16 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitor `collectd/cpu`. -Below is an overview of that monitor.** +This integration primarily consists of the Smart Agent monitor `collectd/cpu`. +Below is an overview of that monitor. ### Smart Agent Monitor +**This monitor is deprecated in favor of the `cpu` monitor. Please switch +to that monitor, as this monitor will be removed in a future agent +release.** + This monitor collects cpu usage data using the collectd `cpu` plugin. It aggregates the per-core CPU data into a single metric and sends it to the SignalFx Metadata plugin in collectd, where the @@ -39,7 +43,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** This monitor has no configuration options. @@ -85,15 +89,15 @@ monitors` after configuring this monitor in a running agent instance. ### Legacy non-default metrics (version < 4.7.0) -**The following information only applies to agent version older than 4.7.0. If +**The following information only applies to agent versions prior to 4.7.0. If you have a newer agent and have set `enableBuiltInFiltering: true` at the top level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).** +[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).** If you have a reference to the `whitelist.json` in your agent's top-level `metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just +that allow list, then you need to add an item to the top-level +`metricsToInclude` config option to override that allow list (see [Inclusion +filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just copy the whitelist.json, modify it, and reference that in `metricsToExclude`. diff --git a/consul/SMART_AGENT_MONITOR.md b/consul/SMART_AGENT_MONITOR.md index f8134a04f..1bee0767b 100644 --- a/consul/SMART_AGENT_MONITOR.md +++ b/consul/SMART_AGENT_MONITOR.md @@ -12,8 +12,8 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitor `collectd/consul`. -Below is an overview of that monitor.** +This integration primarily consists of the Smart Agent monitor `collectd/consul`. +Below is an overview of that monitor. 
### Smart Agent Monitor @@ -62,7 +62,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | @@ -164,16 +164,16 @@ monitors` after configuring this monitor in a running agent instance. ### Legacy non-default metrics (version < 4.7.0) -**The following information only applies to agent version older than 4.7.0. If +**The following information only applies to agent versions prior to 4.7.0. If you have a newer agent and have set `enableBuiltInFiltering: true` at the top level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).** +[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).** If you have a reference to the `whitelist.json` in your agent's top-level `metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just +that allow list, then you need to add an item to the top-level +`metricsToInclude` config option to override that allow list (see [Inclusion +filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). 
Or you can just copy the whitelist.json, modify it, and reference that in `metricsToExclude`. ## Dimensions diff --git a/couchbase/metrics.yaml b/couchbase/metrics.yaml index a22e1bc83..5d258cb26 100644 --- a/couchbase/metrics.yaml +++ b/couchbase/metrics.yaml @@ -60,153 +60,1415 @@ gauge.bucket.basic.quotaPercentUsed: monitor: collectd/couchbase title: gauge.bucket.basic.quotaPercentUsed +gauge.bucket.hot_keys.0: + brief: One of the most used keys in this bucket + custom: true + description: One of the most used keys in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.hot_keys.0 + +gauge.bucket.hot_keys.1: + brief: One of the most used keys in this bucket + custom: true + description: One of the most used keys in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.hot_keys.1 + +gauge.bucket.hot_keys.10: + brief: One of the most used keys in this bucket + custom: true + description: One of the most used keys in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.hot_keys.10 + +gauge.bucket.hot_keys.2: + brief: One of the most used keys in this bucket + custom: true + description: One of the most used keys in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.hot_keys.2 + +gauge.bucket.hot_keys.3: + brief: One of the most used keys in this bucket + custom: true + description: One of the most used keys in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.hot_keys.3 + +gauge.bucket.hot_keys.4: + brief: One of the most used keys in this bucket + custom: true + description: One of the most used keys in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.hot_keys.4 + +gauge.bucket.hot_keys.5: + brief: One of the most used keys in this bucket + custom: true + description: One of the most used keys in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: 
gauge.bucket.hot_keys.5 + +gauge.bucket.hot_keys.6: + brief: One of the most used keys in this bucket + custom: true + description: One of the most used keys in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.hot_keys.6 + +gauge.bucket.hot_keys.7: + brief: One of the most used keys in this bucket + custom: true + description: One of the most used keys in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.hot_keys.7 + +gauge.bucket.hot_keys.8: + brief: One of the most used keys in this bucket + custom: true + description: One of the most used keys in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.hot_keys.8 + +gauge.bucket.hot_keys.9: + brief: One of the most used keys in this bucket + custom: true + description: One of the most used keys in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.hot_keys.9 + +gauge.bucket.op.avg_bg_wait_time: + brief: Average background wait time + custom: true + description: Average background wait time + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.avg_bg_wait_time + +gauge.bucket.op.avg_disk_commit_time: + brief: Average disk commit time + custom: true + description: Average disk commit time + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.avg_disk_commit_time + +gauge.bucket.op.avg_disk_update_time: + brief: Average disk update time + custom: true + description: Average disk update time + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.avg_disk_update_time + +gauge.bucket.op.bg_wait_count: + brief: '' + custom: true + description: '' + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.bg_wait_count + +gauge.bucket.op.bg_wait_total: + brief: The total background wait time + custom: true + description: The total background wait time + metric_type: gauge + monitor: collectd/couchbase + 
title: gauge.bucket.op.bg_wait_total + +gauge.bucket.op.bytes_read: + brief: Number of bytes read + custom: true + description: Number of bytes read + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.bytes_read + +gauge.bucket.op.bytes_written: + brief: Number of bytes written + custom: true + description: Number of bytes written + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.bytes_written + +gauge.bucket.op.cas_badval: + brief: Number of CAS operations per second using an incorrect CAS ID for data that + this bucket contains + custom: true + description: Number of CAS operations per second using an incorrect CAS ID for data + that this bucket contains + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.cas_badval + +gauge.bucket.op.cas_hits: + brief: Number of CAS operations per second for data that this bucket contains + custom: true + description: Number of CAS operations per second for data that this bucket contains + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.cas_hits + +gauge.bucket.op.cas_misses: + brief: Number of CAS operations per second for data that this bucket does not contain + custom: true + description: Number of CAS operations per second for data that this bucket does + not contain + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.cas_misses + gauge.bucket.op.cmd_get: brief: requested objects custom: false description: requested objects metric_type: gauge monitor: collectd/couchbase - title: gauge.bucket.op.cmd_get + title: gauge.bucket.op.cmd_get + +gauge.bucket.op.cmd_set: + brief: Number of writes (set operations) per second to this bucket + custom: true + description: Number of writes (set operations) per second to this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.cmd_set + +gauge.bucket.op.couch_docs_actual_disk_size: + brief: The size of the couchbase docs on disk + 
custom: true + description: The size of the couchbase docs on disk + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.couch_docs_actual_disk_size + +gauge.bucket.op.couch_docs_data_size: + brief: The size of active data in this bucket + custom: true + description: The size of active data in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.couch_docs_data_size + +gauge.bucket.op.couch_docs_disk_size: + brief: Couch docs total size in bytes + custom: true + description: Couch docs total size in bytes + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.couch_docs_disk_size + +gauge.bucket.op.couch_docs_fragmentation: + brief: Percent fragmentation of documents in this bucket + custom: false + description: Percent fragmentation of documents in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.couch_docs_fragmentation + +gauge.bucket.op.couch_spatial_data_size: + brief: The size of object data for spatial views + custom: true + description: The size of object data for spatial views + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.couch_spatial_data_size + +gauge.bucket.op.couch_spatial_disk_size: + brief: The amount of disk space occupied by spatial views + custom: true + description: The amount of disk space occupied by spatial views + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.couch_spatial_disk_size + +gauge.bucket.op.couch_spatial_ops: + brief: Number of spatial operations + custom: true + description: Number of spatial operations + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.couch_spatial_ops + +gauge.bucket.op.couch_total_disk_size: + brief: The total size on disk of all data and view files for this bucket + custom: true + description: The total size on disk of all data and view files for this bucket + metric_type: gauge + monitor: collectd/couchbase + 
title: gauge.bucket.op.couch_total_disk_size + +gauge.bucket.op.couch_views_actual_disk_size: + brief: The size of all active items in all the indexes for this bucket on disk + custom: true + description: The size of all active items in all the indexes for this bucket on + disk + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.couch_views_actual_disk_size + +gauge.bucket.op.couch_views_data_size: + brief: The size of object data for views + custom: true + description: The size of object data for views + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.couch_views_data_size + +gauge.bucket.op.couch_views_disk_size: + brief: The amount of disk space occupied by views + custom: true + description: The amount of disk space occupied by views + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.couch_views_disk_size + +gauge.bucket.op.couch_views_fragmentation: + brief: How much fragmented data there is to be compacted compared to real data for + the view index files in this bucket + custom: true + description: How much fragmented data there is to be compacted compared to real + data for the view index files in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.couch_views_fragmentation + +gauge.bucket.op.couch_views_ops: + brief: view operations per second + custom: false + description: view operations per second + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.couch_views_ops + +gauge.bucket.op.cpu_idle_ms: + brief: CPU Idle milliseconds + custom: true + description: CPU Idle milliseconds + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.cpu_idle_ms + +gauge.bucket.op.cpu_utilization_rate: + brief: Percentage of CPU in use across all available cores on this server + custom: true + description: Percentage of CPU in use across all available cores on this server + metric_type: gauge + monitor: 
collectd/couchbase + title: gauge.bucket.op.cpu_utilization_rate + +gauge.bucket.op.curr_connections: + brief: open connection per bucket + custom: false + description: open connection per bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.curr_connections + +gauge.bucket.op.curr_items: + brief: total number of stored items per bucket + custom: true + description: total number of stored items per bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.curr_items + +gauge.bucket.op.curr_items_tot: + brief: Total number of items + custom: true + description: Total number of items + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.curr_items_tot + +gauge.bucket.op.decr_hits: + brief: Number of decrement hits + custom: true + description: Number of decrement hits + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.decr_hits + +gauge.bucket.op.decr_misses: + brief: Number of decrement misses + custom: true + description: Number of decrement misses + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.decr_misses + +gauge.bucket.op.delete_hits: + brief: Number of delete hits + custom: true + description: Number of delete hits + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.delete_hits + +gauge.bucket.op.delete_misses: + brief: Number of delete misses + custom: true + description: Number of delete misses + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.delete_misses + +gauge.bucket.op.disk_commit_count: + brief: Number of disk commits + custom: true + description: Number of disk commits + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.disk_commit_count + +gauge.bucket.op.disk_commit_total: + brief: Total number of disk commits + custom: true + description: Total number of disk commits + metric_type: gauge + monitor: collectd/couchbase + title: 
gauge.bucket.op.disk_commit_total + +gauge.bucket.op.disk_update_count: + brief: Number of disk updates + custom: true + description: Number of disk updates + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.disk_update_count + +gauge.bucket.op.disk_update_total: + brief: Total number of disk updates + custom: true + description: Total number of disk updates + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.disk_update_total + +gauge.bucket.op.disk_write_queue: + brief: number of items waiting to be written to disk + custom: true + description: number of items waiting to be written to disk + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.disk_write_queue + +gauge.bucket.op.ep_bg_fetched: + brief: number of items fetched from disk + custom: false + description: number of items fetched from disk + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_bg_fetched + +gauge.bucket.op.ep_cache_miss_rate: + brief: ratio of requested objects found in cache vs retrieved from disk + custom: false + description: ratio of requested objects found in cache vs retrieved from disk + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_cache_miss_rate + +gauge.bucket.op.ep_dcp_2i_backoff: + brief: Number of backoffs for indexes DCP connections + custom: true + description: Number of backoffs for indexes DCP connections + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_2i_backoff + +gauge.bucket.op.ep_dcp_2i_count: + brief: Number of indexes DCP connections + custom: true + description: Number of indexes DCP connections + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_2i_count + +gauge.bucket.op.ep_dcp_2i_items_remaining: + brief: Number of indexes items remaining to be sent + custom: true + description: Number of indexes items remaining to be sent + metric_type: gauge + monitor: 
collectd/couchbase + title: gauge.bucket.op.ep_dcp_2i_items_remaining + +gauge.bucket.op.ep_dcp_2i_items_sent: + brief: Number of indexes items sent + custom: true + description: Number of indexes items sent + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_2i_items_sent + +gauge.bucket.op.ep_dcp_2i_producer_count: + brief: Number of indexes producers + custom: true + description: Number of indexes producers + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_2i_producer_count + +gauge.bucket.op.ep_dcp_2i_total_backlog_size: + brief: Number of items in dcp backlog + custom: true + description: Number of items in dcp backlog + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_2i_total_backlog_size + +gauge.bucket.op.ep_dcp_2i_total_bytes: + brief: Number bytes per second being sent for indexes DCP connections + custom: true + description: Number bytes per second being sent for indexes DCP connections + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_2i_total_bytes + +gauge.bucket.op.ep_dcp_other_backoff: + brief: Number of backoffs for other DCP connections + custom: true + description: Number of backoffs for other DCP connections + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_other_backoff + +gauge.bucket.op.ep_dcp_other_count: + brief: Number of other DCP connections + custom: true + description: Number of other DCP connections + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_other_count + +gauge.bucket.op.ep_dcp_other_items_remaining: + brief: Number of other items remaining to be sent + custom: true + description: Number of other items remaining to be sent + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_other_items_remaining + +gauge.bucket.op.ep_dcp_other_items_sent: + brief: Number of other items sent + custom: true + description: 
Number of other items sent + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_other_items_sent + +gauge.bucket.op.ep_dcp_other_producer_count: + brief: Number of other producers + custom: true + description: Number of other producers + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_other_producer_count + +gauge.bucket.op.ep_dcp_other_total_backlog_size: + brief: Number of remaining items for replication + custom: true + description: Number of remaining items for replication + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_other_total_backlog_size + +gauge.bucket.op.ep_dcp_other_total_bytes: + brief: Number bytes per second being sent for other DCP connections + custom: true + description: Number bytes per second being sent for other DCP connections + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_other_total_bytes + +gauge.bucket.op.ep_dcp_replica_backoff: + brief: Number of backoffs for replica DCP connections + custom: true + description: Number of backoffs for replica DCP connections + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_replica_backoff + +gauge.bucket.op.ep_dcp_replica_count: + brief: Number of replica DCP connections + custom: true + description: Number of replica DCP connections + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_replica_count + +gauge.bucket.op.ep_dcp_replica_items_remaining: + brief: Number of replica items remaining to be sent + custom: true + description: Number of replica items remaining to be sent + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_replica_items_remaining + +gauge.bucket.op.ep_dcp_replica_items_sent: + brief: Number of replica items sent + custom: true + description: Number of replica items sent + metric_type: gauge + monitor: collectd/couchbase + title: 
gauge.bucket.op.ep_dcp_replica_items_sent + +gauge.bucket.op.ep_dcp_replica_producer_count: + brief: Number of replica producers + custom: true + description: Number of replica producers + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_replica_producer_count + +gauge.bucket.op.ep_dcp_replica_total_backlog_size: + brief: Number of remaining items for replication + custom: true + description: Number of remaining items for replication + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_replica_total_backlog_size + +gauge.bucket.op.ep_dcp_replica_total_bytes: + brief: Number bytes per second being sent for replica DCP connections + custom: true + description: Number bytes per second being sent for replica DCP connections + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_replica_total_bytes + +gauge.bucket.op.ep_dcp_views_backoff: + brief: Number of backoffs for views DCP connections + custom: true + description: Number of backoffs for views DCP connections + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_views_backoff + +gauge.bucket.op.ep_dcp_views_count: + brief: Number of views DCP connections + custom: true + description: Number of views DCP connections + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_views_count + +gauge.bucket.op.ep_dcp_views_items_remaining: + brief: Number of views items remaining to be sent + custom: true + description: Number of views items remaining to be sent + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_views_items_remaining + +gauge.bucket.op.ep_dcp_views_items_sent: + brief: Number of view items sent + custom: true + description: Number of view items sent + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_views_items_sent + +gauge.bucket.op.ep_dcp_views_producer_count: + brief: Number of views producers + 
custom: true + description: Number of views producers + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_views_producer_count + +gauge.bucket.op.ep_dcp_views_total_bytes: + brief: Number bytes per second being sent for views DCP connections + custom: true + description: Number bytes per second being sent for views DCP connections + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_views_total_bytes + +gauge.bucket.op.ep_dcp_xdcr_backoff: + brief: Number of backoffs for xdcr DCP connections + custom: true + description: Number of backoffs for xdcr DCP connections + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_xdcr_backoff + +gauge.bucket.op.ep_dcp_xdcr_count: + brief: Number of xdcr DCP connections + custom: true + description: Number of xdcr DCP connections + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_xdcr_count + +gauge.bucket.op.ep_dcp_xdcr_items_remaining: + brief: Number of xdcr items remaining to be sent + custom: true + description: Number of xdcr items remaining to be sent + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_xdcr_items_remaining + +gauge.bucket.op.ep_dcp_xdcr_items_sent: + brief: Number of xdcr items sent + custom: true + description: Number of xdcr items sent + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_xdcr_items_sent + +gauge.bucket.op.ep_dcp_xdcr_producer_count: + brief: Number of xdcr producers + custom: true + description: Number of xdcr producers + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_xdcr_producer_count + +gauge.bucket.op.ep_dcp_xdcr_total_backlog_size: + brief: Number of items waiting replication + custom: true + description: Number of items waiting replication + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_xdcr_total_backlog_size + 
+gauge.bucket.op.ep_dcp_xdcr_total_bytes: + brief: Number bytes per second being sent for xdcr DCP connections + custom: true + description: Number bytes per second being sent for xdcr DCP connections + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_dcp_xdcr_total_bytes + +gauge.bucket.op.ep_diskqueue_drain: + brief: items removed from disk queue + custom: false + description: items removed from disk queue + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_diskqueue_drain + +gauge.bucket.op.ep_diskqueue_fill: + brief: enqueued items on disk queue + custom: false + description: enqueued items on disk queue + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_diskqueue_fill + +gauge.bucket.op.ep_diskqueue_items: + brief: The number of items waiting to be written to disk for this bucket for this + state + custom: true + description: The number of items waiting to be written to disk for this bucket for + this state + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_diskqueue_items + +gauge.bucket.op.ep_flusher_todo: + brief: Number of items currently being written + custom: true + description: Number of items currently being written + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_flusher_todo + +gauge.bucket.op.ep_item_commit_failed: + brief: Number of times a transaction failed to commit due to storage errors + custom: true + description: Number of times a transaction failed to commit due to storage errors + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_item_commit_failed + +gauge.bucket.op.ep_kv_size: + brief: Total amount of user data cached in RAM in this bucket + custom: true + description: Total amount of user data cached in RAM in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_kv_size + +gauge.bucket.op.ep_max_size: + brief: The maximum amount of 
memory this bucket can use + custom: true + description: The maximum amount of memory this bucket can use + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_max_size + +gauge.bucket.op.ep_mem_high_wat: + brief: memory high water mark - point at which active objects begin to be ejected + from bucket + custom: false + description: memory high water mark - point at which active objects begin to be + ejected from bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_mem_high_wat + +gauge.bucket.op.ep_mem_low_wat: + brief: memory low water mark + custom: true + description: memory low water mark + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_mem_low_wat + +gauge.bucket.op.ep_meta_data_memory: + brief: Total amount of item metadata consuming RAM in this bucket + custom: true + description: Total amount of item metadata consuming RAM in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_meta_data_memory + +gauge.bucket.op.ep_num_non_resident: + brief: Number of non-resident items + custom: true + description: Number of non-resident items + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_num_non_resident + +gauge.bucket.op.ep_num_ops_del_meta: + brief: Number of delete operations per second for this bucket as the target for + XDCR + custom: true + description: Number of delete operations per second for this bucket as the target + for XDCR + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_num_ops_del_meta + +gauge.bucket.op.ep_num_ops_del_ret_meta: + brief: Number of delRetMeta operations per second for this bucket as the target + for XDCR + custom: true + description: Number of delRetMeta operations per second for this bucket as the target + for XDCR + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_num_ops_del_ret_meta + 
+gauge.bucket.op.ep_num_ops_get_meta: + brief: Number of read operations per second for this bucket as the target for XDCR + custom: true + description: Number of read operations per second for this bucket as the target + for XDCR + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_num_ops_get_meta + +gauge.bucket.op.ep_num_ops_set_meta: + brief: Number of set operations per second for this bucket as the target for XDCR + custom: true + description: Number of set operations per second for this bucket as the target for + XDCR + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_num_ops_set_meta + +gauge.bucket.op.ep_num_ops_set_ret_meta: + brief: Number of setRetMeta operations per second for this bucket as the target + for XDCR + custom: true + description: Number of setRetMeta operations per second for this bucket as the target + for XDCR + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_num_ops_set_ret_meta + +gauge.bucket.op.ep_num_value_ejects: + brief: number of objects ejected out of the bucket + custom: false + description: number of objects ejected out of the bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_num_value_ejects + +gauge.bucket.op.ep_oom_errors: + brief: request rejected - bucket is at quota, panic + custom: false + description: request rejected - bucket is at quota, panic + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_oom_errors + +gauge.bucket.op.ep_ops_create: + brief: Total number of new items being inserted into this bucket + custom: true + description: Total number of new items being inserted into this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_ops_create + +gauge.bucket.op.ep_ops_update: + brief: Number of update operations + custom: true + description: Number of update operations + metric_type: gauge + monitor: collectd/couchbase + title: 
gauge.bucket.op.ep_ops_update + +gauge.bucket.op.ep_overhead: + brief: Extra memory used by transient data like persistence queues or checkpoints + custom: true + description: Extra memory used by transient data like persistence queues or checkpoints + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_overhead + +gauge.bucket.op.ep_queue_size: + brief: number of items queued for storage + custom: false + description: number of items queued for storage + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_queue_size + +gauge.bucket.op.ep_resident_items_rate: + brief: Number of resident items + custom: true + description: Number of resident items + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_resident_items_rate + +gauge.bucket.op.ep_tap_rebalance_count: + brief: Number of internal rebalancing TAP queues in this bucket + custom: true + description: Number of internal rebalancing TAP queues in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_rebalance_count + +gauge.bucket.op.ep_tap_rebalance_qlen: + brief: Number of items in the rebalance TAP queues in this bucket + custom: true + description: Number of items in the rebalance TAP queues in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_rebalance_qlen + +gauge.bucket.op.ep_tap_rebalance_queue_backfillremaining: + brief: Number of items in the backfill queues of rebalancing TAP connections to + this bucket + custom: true + description: Number of items in the backfill queues of rebalancing TAP connections + to this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_rebalance_queue_backfillremaining + +gauge.bucket.op.ep_tap_rebalance_queue_backoff: + brief: Number of back-offs received per second while sending data over rebalancing + TAP connections to this bucket + custom: true + description: Number of 
back-offs received per second while sending data over rebalancing + TAP connections to this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_rebalance_queue_backoff + +gauge.bucket.op.ep_tap_rebalance_queue_drain: + brief: Number of items per second being sent over rebalancing TAP connections to + this bucket, i.e + custom: true + description: Number of items per second being sent over rebalancing TAP connections + to this bucket, i.e. removed from queue. + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_rebalance_queue_drain + +gauge.bucket.op.ep_tap_rebalance_queue_itemondisk: + brief: Number of items still on disk to be loaded for rebalancing TAP connections + to this bucket + custom: true + description: Number of items still on disk to be loaded for rebalancing TAP connections + to this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_rebalance_queue_itemondisk + +gauge.bucket.op.ep_tap_rebalance_total_backlog_size: + brief: Number of items remaining for replication + custom: true + description: Number of items remaining for replication + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_rebalance_total_backlog_size + +gauge.bucket.op.ep_tap_replica_count: + brief: Number of internal replication TAP queues in this bucket + custom: true + description: Number of internal replication TAP queues in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_replica_count + +gauge.bucket.op.ep_tap_replica_qlen: + brief: Number of items in the replication TAP queues in this bucket + custom: true + description: Number of items in the replication TAP queues in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_replica_qlen + +gauge.bucket.op.ep_tap_replica_queue_backoff: + brief: Number of back-offs received per second while sending data over 
replication + TAP connections to this bucket + custom: true + description: Number of back-offs received per second while sending data over replication + TAP connections to this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_replica_queue_backoff + +gauge.bucket.op.ep_tap_replica_queue_drain: + brief: Number of items per second being sent over replication TAP connections to + this bucket, i.e + custom: true + description: Number of items per second being sent over replication TAP connections + to this bucket, i.e. removed from queue + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_replica_queue_drain + +gauge.bucket.op.ep_tap_replica_queue_itemondisk: + brief: Number of items still on disk to be loaded for replication TAP connections + to this bucket + custom: true + description: Number of items still on disk to be loaded for replication TAP connections + to this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_replica_queue_itemondisk + +gauge.bucket.op.ep_tap_replica_total_backlog_size: + brief: Number of remaining items for replication + custom: true + description: Number of remaining items for replication + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_replica_total_backlog_size + +gauge.bucket.op.ep_tap_total_count: + brief: Total number of internal TAP queues in this bucket + custom: true + description: Total number of internal TAP queues in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_total_count + +gauge.bucket.op.ep_tap_total_qlen: + brief: Total number of items in TAP queues in this bucket + custom: true + description: Total number of items in TAP queues in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_total_qlen + +gauge.bucket.op.ep_tap_total_queue_backfillremaining: + brief: Total number of items in the 
backfill queues of TAP connections to this bucket + custom: true + description: Total number of items in the backfill queues of TAP connections to + this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_total_queue_backfillremaining + +gauge.bucket.op.ep_tap_total_queue_backoff: + brief: Total number of back-offs received per second while sending data over TAP + connections to this bucket + custom: true + description: Total number of back-offs received per second while sending data over + TAP connections to this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_total_queue_backoff + +gauge.bucket.op.ep_tap_total_queue_drain: + brief: Total number of items per second being sent over TAP connections to this + bucket + custom: true + description: Total number of items per second being sent over TAP connections to + this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_total_queue_drain + +gauge.bucket.op.ep_tap_total_queue_fill: + brief: Total enqueued items in the queue + custom: true + description: Total enqueued items in the queue. + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_total_queue_fill + +gauge.bucket.op.ep_tap_total_queue_itemondisk: + brief: The number of items waiting to be written to disk for this bucket for this + state + custom: true + description: The number of items waiting to be written to disk for this bucket for + this state. 
+ metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_total_queue_itemondisk + +gauge.bucket.op.ep_tap_total_total_backlog_size: + brief: Number of remaining items for replication + custom: true + description: Number of remaining items for replication + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_total_total_backlog_size + +gauge.bucket.op.ep_tap_user_count: + brief: Number of internal user TAP queues in this bucket + custom: true + description: Number of internal user TAP queues in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_user_count + +gauge.bucket.op.ep_tap_user_qlen: + brief: Number of items in user TAP queues in this bucket + custom: true + description: Number of items in user TAP queues in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_user_qlen + +gauge.bucket.op.ep_tap_user_queue_backfillremaining: + brief: Number of items in the backfill queues of user TAP connections to this bucket + custom: true + description: Number of items in the backfill queues of user TAP connections to this + bucket. + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_user_queue_backfillremaining + +gauge.bucket.op.ep_tap_user_queue_backoff: + brief: Number of back-offs received per second while sending data over user TAP + connections to this bucket + custom: true + description: Number of back-offs received per second while sending data over user + TAP connections to this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_user_queue_backoff + +gauge.bucket.op.ep_tap_user_queue_drain: + brief: Number of items per second being sent over user TAP connections to this bucket, + i.e + custom: true + description: Number of items per second being sent over user TAP connections to + this bucket, i.e. 
removed from queue + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_user_queue_drain + +gauge.bucket.op.ep_tap_user_queue_fill: + brief: Number of items per second being put on the user TAP queues + custom: true + description: Number of items per second being put on the user TAP queues + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_user_queue_fill + +gauge.bucket.op.ep_tap_user_queue_itemondisk: + brief: Number of items still on disk to be loaded for client TAP connections to + this bucket + custom: true + description: Number of items still on disk to be loaded for client TAP connections + to this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_user_queue_itemondisk + +gauge.bucket.op.ep_tap_user_total_backlog_size: + brief: Number of remaining items for replication + custom: true + description: Number of remaining items for replication + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tap_user_total_backlog_size + +gauge.bucket.op.ep_tmp_oom_errors: + brief: request rejected - couchbase is making room by ejecting objects, try again + later + custom: false + description: request rejected - couchbase is making room by ejecting objects, try + again later + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_tmp_oom_errors + +gauge.bucket.op.ep_vb_total: + brief: Total number of vBuckets for this bucket + custom: true + description: Total number of vBuckets for this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.ep_vb_total + +gauge.bucket.op.evictions: + brief: Number of evictions + custom: true + description: Number of evictions + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.evictions + +gauge.bucket.op.get_hits: + brief: Number of get hits + custom: true + description: Number of get hits + metric_type: gauge + monitor: collectd/couchbase 
+ title: gauge.bucket.op.get_hits + +gauge.bucket.op.get_misses: + brief: Number of get misses + custom: true + description: Number of get misses + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.get_misses + +gauge.bucket.op.hibernated_requests: + brief: Number of streaming requests now idle + custom: true + description: Number of streaming requests now idle + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.hibernated_requests + +gauge.bucket.op.hibernated_waked: + brief: Rate of streaming request wakeups + custom: true + description: Rate of streaming request wakeups + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.hibernated_waked + +gauge.bucket.op.hit_ratio: + brief: Hit ratio + custom: true + description: Hit ratio. + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.hit_ratio + +gauge.bucket.op.incr_hits: + brief: Number of increment hits + custom: true + description: Number of increment hits + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.incr_hits + +gauge.bucket.op.incr_misses: + brief: Number of increment misses + custom: true + description: Number of increment misses + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.incr_misses -gauge.bucket.op.couch_docs_fragmentation: - brief: Percent fragmentation of documents in this bucket - custom: false - description: Percent fragmentation of documents in this bucket. 
+gauge.bucket.op.mem_actual_free: + brief: Amount of RAM available + custom: true + description: Amount of RAM available metric_type: gauge monitor: collectd/couchbase - title: gauge.bucket.op.couch_docs_fragmentation + title: gauge.bucket.op.mem_actual_free -gauge.bucket.op.couch_views_ops: - brief: view operations per second - custom: false - description: view operations per second +gauge.bucket.op.mem_actual_used: + brief: Used memory + custom: true + description: Used memory metric_type: gauge monitor: collectd/couchbase - title: gauge.bucket.op.couch_views_ops + title: gauge.bucket.op.mem_actual_used -gauge.bucket.op.curr_connections: - brief: open connection per bucket +gauge.bucket.op.mem_free: + brief: Free memory + custom: true + description: Free memory + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.mem_free + +gauge.bucket.op.mem_total: + brief: Total available memory + custom: true + description: Total available memory + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.mem_total + +gauge.bucket.op.mem_used: + brief: memory used custom: false - description: open connection per bucket + description: memory used metric_type: gauge monitor: collectd/couchbase - title: gauge.bucket.op.curr_connections + title: gauge.bucket.op.mem_used -gauge.bucket.op.curr_items: - brief: total number of stored items per bucket +gauge.bucket.op.mem_used_sys: + brief: System memory usage custom: true - description: total number of stored items per bucket + description: System memory usage metric_type: gauge monitor: collectd/couchbase - title: gauge.bucket.op.curr_items + title: gauge.bucket.op.mem_used_sys -gauge.bucket.op.disk_write_queue: - brief: number of items waiting to be written to disk +gauge.bucket.op.misses: + brief: Total number of misses custom: true - description: number of items waiting to be written to disk + description: Total number of misses metric_type: gauge monitor: collectd/couchbase - title: 
gauge.bucket.op.disk_write_queue + title: gauge.bucket.op.misses -gauge.bucket.op.ep_bg_fetched: - brief: number of items fetched from disk - custom: false - description: number of items fetched from disk +gauge.bucket.op.ops: + brief: total of gets, sets, increment and decrement + custom: true + description: total of gets, sets, increment and decrement metric_type: gauge monitor: collectd/couchbase - title: gauge.bucket.op.ep_bg_fetched + title: gauge.bucket.op.ops -gauge.bucket.op.ep_cache_miss_rate: - brief: ratio of requested objects found in cache vs retrieved from disk - custom: false - description: ratio of requested objects found in cache vs retrieved from disk +gauge.bucket.op.rest_requests: + brief: Number of HTTP requests + custom: true + description: Number of HTTP requests metric_type: gauge monitor: collectd/couchbase - title: gauge.bucket.op.ep_cache_miss_rate + title: gauge.bucket.op.rest_requests -gauge.bucket.op.ep_diskqueue_drain: - brief: items removed from disk queue - custom: false - description: items removed from disk queue +gauge.bucket.op.swap_total: + brief: Total amount of swap available + custom: true + description: Total amount of swap available metric_type: gauge monitor: collectd/couchbase - title: gauge.bucket.op.ep_diskqueue_drain + title: gauge.bucket.op.swap_total -gauge.bucket.op.ep_diskqueue_fill: - brief: enqueued items on disk queue - custom: false - description: enqueued items on disk queue +gauge.bucket.op.swap_used: + brief: Amount of swap used + custom: true + description: Amount of swap used metric_type: gauge monitor: collectd/couchbase - title: gauge.bucket.op.ep_diskqueue_fill + title: gauge.bucket.op.swap_used -gauge.bucket.op.ep_mem_high_wat: - brief: memory high water mark - point at which active objects begin to be ejected - from bucket - custom: false - description: memory high water mark - point at which active objects begin to be - ejected from bucket +gauge.bucket.op.vb_active_eject: + brief: Number of items 
per second being ejected to disk from active vBuckets + custom: true + description: Number of items per second being ejected to disk from active vBuckets metric_type: gauge monitor: collectd/couchbase - title: gauge.bucket.op.ep_mem_high_wat + title: gauge.bucket.op.vb_active_eject -gauge.bucket.op.ep_mem_low_wat: - brief: memory low water mark +gauge.bucket.op.vb_active_itm_memory: + brief: Amount of active user data cached in RAM in this bucket custom: true - description: memory low water mark + description: Amount of active user data cached in RAM in this bucket metric_type: gauge monitor: collectd/couchbase - title: gauge.bucket.op.ep_mem_low_wat + title: gauge.bucket.op.vb_active_itm_memory -gauge.bucket.op.ep_num_value_ejects: - brief: number of objects ejected out of the bucket - custom: false - description: number of objects ejected out of the bucket +gauge.bucket.op.vb_active_meta_data_memory: + brief: Amount of active item metadata consuming RAM in this bucket + custom: true + description: Amount of active item metadata consuming RAM in this bucket metric_type: gauge monitor: collectd/couchbase - title: gauge.bucket.op.ep_num_value_ejects + title: gauge.bucket.op.vb_active_meta_data_memory -gauge.bucket.op.ep_oom_errors: - brief: request rejected - bucket is at quota, panic - custom: false - description: request rejected - bucket is at quota, panic +gauge.bucket.op.vb_active_num: + brief: Number of vBuckets in the active state for this bucket + custom: true + description: Number of vBuckets in the active state for this bucket metric_type: gauge monitor: collectd/couchbase - title: gauge.bucket.op.ep_oom_errors + title: gauge.bucket.op.vb_active_num -gauge.bucket.op.ep_queue_size: - brief: number of items queued for storage - custom: false - description: number of items queued for storage +gauge.bucket.op.vb_active_num_non_resident: + brief: Number of non resident vBuckets in the active state for this bucket + custom: true + description: Number of non 
resident vBuckets in the active state for this bucket metric_type: gauge monitor: collectd/couchbase - title: gauge.bucket.op.ep_queue_size + title: gauge.bucket.op.vb_active_num_non_resident -gauge.bucket.op.ep_tmp_oom_errors: - brief: request rejected - couchbase is making room by ejecting objects, try again - later - custom: false - description: request rejected - couchbase is making room by ejecting objects, try - again later +gauge.bucket.op.vb_active_ops_create: + brief: New items per second being inserted into active vBuckets in this bucket + custom: true + description: New items per second being inserted into active vBuckets in this bucket metric_type: gauge monitor: collectd/couchbase - title: gauge.bucket.op.ep_tmp_oom_errors + title: gauge.bucket.op.vb_active_ops_create -gauge.bucket.op.mem_used: - brief: memory used - custom: false - description: memory used +gauge.bucket.op.vb_active_ops_update: + brief: Number of items updated on active vBucket per second for this bucket + custom: true + description: Number of items updated on active vBucket per second for this bucket metric_type: gauge monitor: collectd/couchbase - title: gauge.bucket.op.mem_used + title: gauge.bucket.op.vb_active_ops_update -gauge.bucket.op.ops: - brief: total of gets, sets, increment and decrement +gauge.bucket.op.vb_active_queue_age: + brief: Sum of disk queue item age in milliseconds custom: true - description: total of gets, sets, increment and decrement + description: Sum of disk queue item age in milliseconds metric_type: gauge monitor: collectd/couchbase - title: gauge.bucket.op.ops + title: gauge.bucket.op.vb_active_queue_age + +gauge.bucket.op.vb_active_queue_drain: + brief: Number of active items per second being written to disk in this bucket + custom: true + description: Number of active items per second being written to disk in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_active_queue_drain + 
+gauge.bucket.op.vb_active_queue_fill: + brief: Number of active items per second being put on the active item disk queue + in this bucket + custom: true + description: Number of active items per second being put on the active item disk + queue in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_active_queue_fill + +gauge.bucket.op.vb_active_queue_size: + brief: Number of active items waiting to be written to disk in this bucket + custom: true + description: Number of active items waiting to be written to disk in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_active_queue_size gauge.bucket.op.vb_active_resident_items_ratio: brief: ratio of items kept in memory vs stored on disk @@ -216,10 +1478,286 @@ gauge.bucket.op.vb_active_resident_items_ratio: monitor: collectd/couchbase title: gauge.bucket.op.vb_active_resident_items_ratio +gauge.bucket.op.vb_avg_active_queue_age: + brief: Average age in seconds of active items in the active item queue for this + bucket + custom: true + description: Average age in seconds of active items in the active item queue for + this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_avg_active_queue_age + +gauge.bucket.op.vb_avg_pending_queue_age: + brief: Average age in seconds of pending items in the pending item queue for this + bucket and should be transient during rebalancing + custom: true + description: Average age in seconds of pending items in the pending item queue for + this bucket and should be transient during rebalancing + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_avg_pending_queue_age + +gauge.bucket.op.vb_avg_replica_queue_age: + brief: Average age in seconds of replica items in the replica item queue for this + bucket + custom: true + description: Average age in seconds of replica items in the replica item queue for + this bucket + metric_type: gauge + monitor: 
collectd/couchbase + title: gauge.bucket.op.vb_avg_replica_queue_age + +gauge.bucket.op.vb_avg_total_queue_age: + brief: Average age of items in the queue + custom: true + description: Average age of items in the queue + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_avg_total_queue_age + +gauge.bucket.op.vb_pending_curr_items: + brief: Number of items in pending vBuckets in this bucket and should be transient + during rebalancing + custom: true + description: Number of items in pending vBuckets in this bucket and should be transient + during rebalancing + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_pending_curr_items + +gauge.bucket.op.vb_pending_eject: + brief: Number of items per second being ejected to disk from pending vBuckets + custom: true + description: Number of items per second being ejected to disk from pending vBuckets + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_pending_eject + +gauge.bucket.op.vb_pending_itm_memory: + brief: Amount of pending user data cached in RAM in this bucket and should be transient + during rebalancing + custom: true + description: Amount of pending user data cached in RAM in this bucket and should + be transient during rebalancing + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_pending_itm_memory + +gauge.bucket.op.vb_pending_meta_data_memory: + brief: Amount of pending item metadata consuming RAM in this bucket and should be + transient during rebalancing + custom: true + description: Amount of pending item metadata consuming RAM in this bucket and should + be transient during rebalancing + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_pending_meta_data_memory + +gauge.bucket.op.vb_pending_num: + brief: Number of pending items + custom: true + description: Number of pending items + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_pending_num 
+ +gauge.bucket.op.vb_pending_num_non_resident: + brief: Number of non resident vBuckets in the pending state for this bucket + custom: true + description: Number of non resident vBuckets in the pending state for this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_pending_num_non_resident + +gauge.bucket.op.vb_pending_ops_create: + brief: New items per second being inserted into pending vBuckets in this bucket and + should be transient during rebalancing + custom: true + description: New items per second being inserted into pending vBuckets in this bucket + and should be transient during rebalancing + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_pending_ops_create + +gauge.bucket.op.vb_pending_ops_update: + brief: Number of items updated on pending vBucket per second for this bucket + custom: true + description: Number of items updated on pending vBucket per second for this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_pending_ops_update + +gauge.bucket.op.vb_pending_queue_age: + brief: Sum of disk pending queue item age in milliseconds + custom: true + description: Sum of disk pending queue item age in milliseconds + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_pending_queue_age + +gauge.bucket.op.vb_pending_queue_drain: + brief: Number of pending items per second being written to disk in this bucket and + should be transient during rebalancing + custom: true + description: Number of pending items per second being written to disk in this bucket + and should be transient during rebalancing + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_pending_queue_drain + +gauge.bucket.op.vb_pending_queue_fill: + brief: Total enqueued pending items on disk queue + custom: true + description: Total enqueued pending items on disk queue + metric_type: gauge + monitor: collectd/couchbase + title: 
gauge.bucket.op.vb_pending_queue_fill + +gauge.bucket.op.vb_pending_queue_size: + brief: Number of pending items waiting to be written to disk in this bucket and + should be transient during rebalancing + custom: true + description: Number of pending items waiting to be written to disk in this bucket + and should be transient during rebalancing + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_pending_queue_size + +gauge.bucket.op.vb_pending_resident_items_ratio: + brief: Number of resident pending items + custom: true + description: Number of resident pending items + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_pending_resident_items_ratio + +gauge.bucket.op.vb_replica_curr_items: + brief: Number of in memory items + custom: true + description: Number of in memory items + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_replica_curr_items + +gauge.bucket.op.vb_replica_eject: + brief: Number of items per second being ejected to disk from replica vBuckets + custom: true + description: Number of items per second being ejected to disk from replica vBuckets + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_replica_eject + +gauge.bucket.op.vb_replica_itm_memory: + brief: Amount of replica user data cached in RAM in this bucket + custom: true + description: Amount of replica user data cached in RAM in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_replica_itm_memory + +gauge.bucket.op.vb_replica_meta_data_memory: + brief: Amount of replica item metadata consuming in RAM in this bucket + custom: true + description: Amount of replica item metadata consuming in RAM in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_replica_meta_data_memory + +gauge.bucket.op.vb_replica_num: + brief: Number of replica vBuckets + custom: true + description: Number of replica vBuckets + 
metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_replica_num + +gauge.bucket.op.vb_replica_num_non_resident: + brief: Number of non resident vBuckets in the replica state for this bucket + custom: true + description: Number of non resident vBuckets in the replica state for this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_replica_num_non_resident + +gauge.bucket.op.vb_replica_ops_create: + brief: Number of replica create operations + custom: true + description: Number of replica create operations + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_replica_ops_create + +gauge.bucket.op.vb_replica_ops_update: + brief: Number of items updated on replica vBucket per second for this bucket + custom: true + description: Number of items updated on replica vBucket per second for this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_replica_ops_update + +gauge.bucket.op.vb_replica_queue_age: + brief: Sum of disk replica queue item age in milliseconds + custom: true + description: Sum of disk replica queue item age in milliseconds + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_replica_queue_age + +gauge.bucket.op.vb_replica_queue_drain: + brief: Total drained replica items in the queue + custom: true + description: Total drained replica items in the queue + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_replica_queue_drain + +gauge.bucket.op.vb_replica_queue_fill: + brief: Number of replica items per second being put on the replica item disk queue + in this bucket + custom: true + description: Number of replica items per second being put on the replica item disk + queue in this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_replica_queue_fill + +gauge.bucket.op.vb_replica_queue_size: + brief: Number of replica items in disk queue + custom: true 
+ description: Number of replica items in disk queue + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_replica_queue_size + +gauge.bucket.op.vb_replica_resident_items_ratio: + brief: Percentage of replica items cached in RAM in this bucket + custom: true + description: Percentage of replica items cached in RAM in this bucket. + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_replica_resident_items_ratio + +gauge.bucket.op.vb_total_queue_age: + brief: Sum of disk queue item age in milliseconds + custom: true + description: Sum of disk queue item age in milliseconds + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.vb_total_queue_age + +gauge.bucket.op.xdc_ops: + brief: Cross datacenter replication operations per second for this bucket + custom: true + description: Cross datacenter replication operations per second for this bucket + metric_type: gauge + monitor: collectd/couchbase + title: gauge.bucket.op.xdc_ops + gauge.bucket.quota.ram: brief: Amount of RAM used by the bucket (bytes) custom: true - description: Amount of RAM used by the bucket (bytes). + description: Amount of RAM used by the bucket (bytes) metric_type: gauge monitor: collectd/couchbase title: gauge.bucket.quota.ram @@ -227,7 +1765,7 @@ gauge.bucket.quota.ram: gauge.bucket.quota.rawRAM: brief: Amount of raw RAM used by the bucket (bytes) custom: true - description: Amount of raw RAM used by the bucket (bytes). 
+ description: Amount of raw RAM used by the bucket (bytes) metric_type: gauge monitor: collectd/couchbase title: gauge.bucket.quota.rawRAM @@ -241,9 +1779,9 @@ gauge.nodes.cmd_get: title: gauge.nodes.cmd_get gauge.nodes.couch_docs_actual_disk_size: - brief: Amount of disk space used by Couch docs + brief: Amount of disk space used by Couch docs (bytes) custom: false - description: Amount of disk space used by Couch docs.(bytes) + description: Amount of disk space used by Couch docs (bytes) metric_type: gauge monitor: collectd/couchbase title: gauge.nodes.couch_docs_actual_disk_size @@ -267,7 +1805,7 @@ gauge.nodes.couch_spatial_data_size: gauge.nodes.couch_spatial_disk_size: brief: Amount of disk space occupied by spatial views, in bytes custom: true - description: Amount of disk space occupied by spatial views, in bytes. + description: Amount of disk space occupied by spatial views, in bytes metric_type: gauge monitor: collectd/couchbase title: gauge.nodes.couch_spatial_disk_size @@ -275,7 +1813,7 @@ gauge.nodes.couch_spatial_disk_size: gauge.nodes.couch_views_actual_disk_size: brief: Amount of disk space occupied by Couch views (bytes) custom: true - description: Amount of disk space occupied by Couch views (bytes). + description: Amount of disk space occupied by Couch views (bytes) metric_type: gauge monitor: collectd/couchbase title: gauge.nodes.couch_views_actual_disk_size @@ -283,7 +1821,7 @@ gauge.nodes.couch_views_actual_disk_size: gauge.nodes.couch_views_data_size: brief: Size of object data for Couch views (bytes) custom: true - description: Size of object data for Couch views (bytes). + description: Size of object data for Couch views (bytes) metric_type: gauge monitor: collectd/couchbase title: gauge.nodes.couch_views_data_size @@ -323,7 +1861,7 @@ gauge.nodes.get_hits: gauge.nodes.mcdMemoryAllocated: brief: Amount of memcached memory allocated (bytes) custom: true - description: Amount of memcached memory allocated (bytes). 
+ description: Amount of memcached memory allocated (bytes) metric_type: gauge monitor: collectd/couchbase title: gauge.nodes.mcdMemoryAllocated @@ -331,7 +1869,7 @@ gauge.nodes.mcdMemoryAllocated: gauge.nodes.mcdMemoryReserved: brief: Amount of memcached memory reserved (bytes) custom: true - description: Amount of memcached memory reserved (bytes). + description: Amount of memcached memory reserved (bytes) metric_type: gauge monitor: collectd/couchbase title: gauge.nodes.mcdMemoryReserved @@ -347,7 +1885,7 @@ gauge.nodes.mem_used: gauge.nodes.memoryFree: brief: Amount of memory free for the node (bytes) custom: true - description: Amount of memory free for the node (bytes). + description: Amount of memory free for the node (bytes) metric_type: gauge monitor: collectd/couchbase title: gauge.nodes.memoryFree @@ -355,7 +1893,7 @@ gauge.nodes.memoryFree: gauge.nodes.memoryTotal: brief: Total memory available to the node (bytes) custom: true - description: Total memory available to the node (bytes). + description: Total memory available to the node (bytes) metric_type: gauge monitor: collectd/couchbase title: gauge.nodes.memoryTotal @@ -497,9 +2035,9 @@ gauge.storage.ram.total: title: gauge.storage.ram.total gauge.storage.ram.used: - brief: Ram used by the cluster (bytes) + brief: RAM used by the cluster (bytes) custom: true - description: Ram used by the cluster (bytes) + description: RAM used by the cluster (bytes) metric_type: gauge monitor: collectd/couchbase title: gauge.storage.ram.used diff --git a/docker/SMART_AGENT_MONITOR.md b/docker/SMART_AGENT_MONITOR.md index 66988cd46..d47962241 100644 --- a/docker/SMART_AGENT_MONITOR.md +++ b/docker/SMART_AGENT_MONITOR.md @@ -12,8 +12,8 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitor `docker-container-stats`. -Below is an overview of that monitor.** +This integration primarily consists of the Smart Agent monitor `docker-container-stats`. 
+Below is an overview of that monitor. ### Smart Agent Monitor @@ -49,7 +49,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | @@ -60,6 +60,7 @@ Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monit | `enableExtraNetworkMetrics` | no | `bool` | Whether it will send all extra network metrics as well. (**default:** `false`) | | `dockerURL` | no | `string` | The URL of the docker server (**default:** `unix:///var/run/docker.sock`) | | `timeoutSeconds` | no | `integer` | The maximum amount of time to wait for docker API requests (**default:** `5`) | +| `cacheSyncInterval` | no | `integer` | The time to wait before resyncing the list of containers the monitor maintains through the docker event listener example: cacheSyncInterval: "20m" (**default:** `60m`) | | `labelsToDimensions` | no | `map of strings` | A mapping of container label names to dimension names. The corresponding label values will become the dimension value for the mapped name. E.g. `io.kubernetes.container.name: container_spec_name` would result in a dimension called `container_spec_name` that has the value of the `io.kubernetes.container.name` container label. | | `envToDimensions` | no | `map of strings` | A mapping of container environment variable names to dimension names. The corresponding env var values become the dimension values on the emitted metrics. E.g. `APP_VERSION: version` would result in datapoints having a dimension called `version` whose value is the value of the `APP_VERSION` envvar configured for that particular container, if present. 
| | `excludedImages` | no | `list of strings` | A list of filters of images to exclude. Supports literals, globs, and regex. | @@ -170,11 +171,11 @@ monitor config option `extraGroups`: - `memory.stats.writeback` (*gauge*)
The amount of memory from file/anon cache that are queued for syncing to the disk - ***`memory.usage.limit`*** (*gauge*)
Memory usage limit of the container, in bytes - `memory.usage.max` (*gauge*)
Maximum measured memory usage of the container, in bytes - - ***`memory.usage.total`*** (*gauge*)
Bytes of memory used by the container. Note that this **includes the - buffer cache** attributed to the process by the kernel from files that - have been read by processes in the container. If you don't want to - count that when monitoring containers, enable the metric - `memory.stats.total_cache` and subtract that metric from this one. + - ***`memory.usage.total`*** (*gauge*)
Bytes of memory used by the container. Note that this **excludes** the + buffer cache accounted to the process by the kernel from files that + have been read by processes in the container, as well as tmpfs usage. + If you want to count that when monitoring containers, enable the metric + `memory.stats.total_cache` and add it to this metric in SignalFlow. #### Group network @@ -205,15 +206,15 @@ monitors` after configuring this monitor in a running agent instance. ### Legacy non-default metrics (version < 4.7.0) -**The following information only applies to agent version older than 4.7.0. If +**The following information only applies to agent versions prior to 4.7.0. If you have a newer agent and have set `enableBuiltInFiltering: true` at the top level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).** +[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).** If you have a reference to the `whitelist.json` in your agent's top-level `metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just +that allow list, then you need to add an item to the top-level +`metricsToInclude` config option to override that allow list (see [Inclusion +filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
diff --git a/docker/metrics.yaml b/docker/metrics.yaml
index e8a63590a..3837e6f23 100644
--- a/docker/metrics.yaml
+++ b/docker/metrics.yaml
@@ -708,15 +708,16 @@ memory.usage.max:
 memory.usage.total:
   brief: Bytes of memory used by the container
   custom: false
-  description: 'Bytes of memory used by the container. Note that this **includes the
+  description: 'Bytes of memory used by the container. Note that this **excludes**
+    the
 
-    buffer cache** attributed to the process by the kernel from files that
+    buffer cache accounted to the process by the kernel from files that
 
-    have been read by processes in the container. If you don''t want to
+    have been read by processes in the container, as well as tmpfs usage.
 
-    count that when monitoring containers, enable the metric
+    If you want to count that when monitoring containers, enable the metric
 
-    `memory.stats.total_cache` and subtract that metric from this one.'
+    `memory.stats.total_cache` and add it to this metric in SignalFlow.'
   metric_type: gauge
   monitor: docker-container-stats
   title: memory.usage.total
diff --git a/etcd/SMART_AGENT_MONITOR.md b/etcd/SMART_AGENT_MONITOR.md
index 2aca13ecf..0beca17da 100644
--- a/etcd/SMART_AGENT_MONITOR.md
+++ b/etcd/SMART_AGENT_MONITOR.md
@@ -12,8 +12,8 @@ configuration instructions below.
 
 ## Description
 
-**This integration primarily consists of the Smart Agent monitor `collectd/etcd`.
-Below is an overview of that monitor.**
+This integration primarily consists of the Smart Agent monitor `collectd/etcd`.
+Below is an overview of that monitor.
 
 ### Smart Agent Monitor
 
@@ -35,7 +35,7 @@ monitors: # All monitor config goes under this key
 ```
 
 **For a list of monitor options that are common to all monitors, see [Common
-Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).**
+Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).**
 
 
 | Config option | Required | Type | Description |
@@ -104,15 +104,15 @@ monitors` after configuring this monitor in a running agent instance.
 
 ### Legacy non-default metrics (version < 4.7.0)
 
-**The following information only applies to agent version older than 4.7.0. If
+**The following information only applies to agent versions prior to 4.7.0. If
 you have a newer agent and have set `enableBuiltInFiltering: true` at the top
 level of your agent config, see the section above. See upgrade instructions in
-[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).**
+[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).**
 
 If you have a reference to the `whitelist.json` in your agent's top-level
 `metricsToExclude` config option, and you want to emit metrics that are not in
-that whitelist, then you need to add an item to the top-level
-`metricsToInclude` config option to override that whitelist (see [Inclusion
-filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
+that allow list, then you need to add an item to the top-level
+`metricsToInclude` config option to override that allow list (see [Inclusion
+filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
 copy the whitelist.json, modify it, and reference that in `metricsToExclude`.
 
diff --git a/gitlab/SMART_AGENT_MONITOR.md b/gitlab/SMART_AGENT_MONITOR.md
index 2bc1a5d56..ac34a12e3 100644
--- a/gitlab/SMART_AGENT_MONITOR.md
+++ b/gitlab/SMART_AGENT_MONITOR.md
@@ -12,8 +12,8 @@ configuration instructions below.
 
 ## Description
 
-**This integration primarily consists of the Smart Agent monitor `gitlab`.
-Below is an overview of that monitor.**
+This integration primarily consists of the Smart Agent monitor `gitlab`.
+Below is an overview of that monitor.
 
 ### Smart Agent Monitor
 
@@ -41,7 +41,7 @@ Follow the instructions
 [here](https://docs.gitlab.com/ee/administration/monitoring/prometheus/index.html)
 to configure the GitLab's Prometheus exporters to expose metric endpoint
 targets. For GitLab Runner monitoring configuration go
-[here](https://docs.gitlab.com/runner/monitoring/README.html).
+[here](https://docs.gitlab.com/runner/monitoring/index.html).
 
 Note that configuring GitLab by editing `/etc/gitlab/gitlab.rb` should be
 accompanied by running the command `gitlab-ctl reconfigure` in order for
@@ -63,16 +63,16 @@ metrics are just targets `gitlab_monitor_database`,
 | Agent Monitor Type | Gitlab Doc | Standard Port | Standard Path |
 |-----------------------|------------------------------------------|---------------|---------------|
 | gitlab | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/gitlab_exporter.html) | 9168 | /metrics |
-| [gitlab-gitaly](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/./gitlab-gitaly.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/gitaly/#doc-nav) | 9236 | /metrics |
-| [gitlab-sidekiq](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/./gitlab-sidekiq.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/index.html) | 8082 | /metrics |
-| [gitlab-unicorn](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/./gitlab-unicorn.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/gitlab_metrics.html#unicorn-metrics-available) | 8080 | /-/metrics |
-| [gitlab-workhorse](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/./gitlab-workhorse.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/index.html) | 9229 | /metrics |
-| [prometheus/nginx-vts](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/./prometheus-nginx-vts.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/index.html) | 8060 | /metrics |
-| [prometheus/node](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/./prometheus-node.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/node_exporter.html) | 9100 | /metrics |
-| [promteheus/postgres](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/./prometheus-postgres.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/postgres_exporter.html) | 9187 | /metrics |
-| [prometheus/prometheus](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/./prometheus-prometheus.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/index.html) | 9090 | /metrics |
-| [prometheus/redis](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/./prometheus-redis.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/redis_exporter.html) | 9121 | /metrics |
-| [gitlab-runner](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/./gitlab-runner.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/index.html) | 9252 | /metrics |
+| [gitlab-gitaly](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/./gitlab-gitaly.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/gitaly/#doc-nav) | 9236 | /metrics |
+| [gitlab-sidekiq](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/./gitlab-sidekiq.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/index.html) | 8082 | /metrics |
+| [gitlab-unicorn](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/./gitlab-unicorn.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/gitlab_metrics.html#unicorn-metrics-available) | 8080 | /-/metrics |
+| [gitlab-workhorse](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/./gitlab-workhorse.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/index.html) | 9229 | /metrics |
+| [prometheus/nginx-vts](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/./prometheus-nginx-vts.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/index.html) | 8060 | /metrics |
+| [prometheus/node](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/./prometheus-node.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/node_exporter.html) | 9100 | /metrics |
+| [promteheus/postgres](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/./prometheus-postgres.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/postgres_exporter.html) | 9187 | /metrics |
+| [prometheus/prometheus](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/./prometheus-prometheus.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/index.html) | 9090 | /metrics |
+| [prometheus/redis](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/./prometheus-redis.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/redis_exporter.html) | 9121 | /metrics |
+| [gitlab-runner](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/./gitlab-runner.md) | [Gitlab doc](https://docs.gitlab.com/ee/administration/monitoring/prometheus/index.html) | 9252 | /metrics |
 
 GitLab Prometheus exporters, Nginx and GitLab Runner must be configured to
 listen to IP address(es) that include the IP address of the host or docker
@@ -171,7 +171,7 @@ monitors: # All monitor config goes under this key
 ```
 
 **For a list of monitor options that are common to all monitors, see [Common
-Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).**
+Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).**
 
 
 | Config option | Required | Type | Description |
@@ -179,9 +179,10 @@ Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monit
 | `httpTimeout` | no | `integer` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) |
 | `username` | no | `string` | Basic Auth username to use on each request, if any. |
 | `password` | no | `string` | Basic Auth password to use on each request, if any. |
-| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) |
-| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. |
+| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) |
+| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. |
 | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) |
+| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. |
 | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. |
 | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections |
 | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections |
@@ -268,15 +269,15 @@ monitors` after configuring this monitor in a running agent instance.
 
 ### Legacy non-default metrics (version < 4.7.0)
 
-**The following information only applies to agent version older than 4.7.0. If
+**The following information only applies to agent versions prior to 4.7.0. If
 you have a newer agent and have set `enableBuiltInFiltering: true` at the top
 level of your agent config, see the section above. See upgrade instructions in
-[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).**
+[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).**
 
 If you have a reference to the `whitelist.json` in your agent's top-level
 `metricsToExclude` config option, and you want to emit metrics that are not in
-that whitelist, then you need to add an item to the top-level
-`metricsToInclude` config option to override that whitelist (see [Inclusion
-filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
+that allow list, then you need to add an item to the top-level
+`metricsToInclude` config option to override that allow list (see [Inclusion
+filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
 copy the whitelist.json, modify it, and reference that in `metricsToExclude`.
 
diff --git a/hadoop/SMART_AGENT_MONITOR.md b/hadoop/SMART_AGENT_MONITOR.md
index 5480cbedb..d5745698a 100644
--- a/hadoop/SMART_AGENT_MONITOR.md
+++ b/hadoop/SMART_AGENT_MONITOR.md
@@ -12,8 +12,8 @@ configuration instructions below.
 
 ## Description
 
-**This integration primarily consists of the Smart Agent monitor `collectd/hadoop`.
-Below is an overview of that monitor.**
+This integration primarily consists of the Smart Agent monitor `collectd/hadoop`.
+Below is an overview of that monitor.
 
 ### Smart Agent Monitor
 
@@ -21,7 +21,7 @@ Below is an overview of that monitor.**
 Collects metrics about a Hadoop 2.0+ cluster using the [collectd Hadoop Python
 plugin](https://github.com/signalfx/collectd-hadoop). If a remote JMX port
 is exposed in the hadoop cluster, then you may also configure the
-[collectd/hadoopjmx](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/./collectd-hadoopjmx.md) monitor to collect additional
+[collectd/hadoopjmx](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/./collectd-hadoopjmx.md) monitor to collect additional
 metrics about the hadoop cluster.
 
 The `collectd/hadoop` monitor will collect metrics from the Resource Manager
@@ -67,7 +67,7 @@ monitors: # All monitor config goes under this key
 ```
 
 **For a list of monitor options that are common to all monitors, see [Common
-Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).**
+Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).**
 
 
 | Config option | Required | Type | Description |
@@ -314,15 +314,15 @@ monitors` after configuring this monitor in a running agent instance.
 
 ### Legacy non-default metrics (version < 4.7.0)
 
-**The following information only applies to agent version older than 4.7.0. If
+**The following information only applies to agent versions prior to 4.7.0. If
 you have a newer agent and have set `enableBuiltInFiltering: true` at the top
 level of your agent config, see the section above. See upgrade instructions in
-[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).**
+[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).**
 
 If you have a reference to the `whitelist.json` in your agent's top-level
 `metricsToExclude` config option, and you want to emit metrics that are not in
-that whitelist, then you need to add an item to the top-level
-`metricsToInclude` config option to override that whitelist (see [Inclusion
-filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
+that allow list, then you need to add an item to the top-level
+`metricsToInclude` config option to override that allow list (see [Inclusion
+filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
 copy the whitelist.json, modify it, and reference that in `metricsToExclude`.
 
diff --git a/java/SMART_AGENT_MONITOR.md b/java/SMART_AGENT_MONITOR.md
index bbaa8ff57..bdf550335 100644
--- a/java/SMART_AGENT_MONITOR.md
+++ b/java/SMART_AGENT_MONITOR.md
@@ -12,8 +12,8 @@ configuration instructions below.
 
 ## Description
 
-**This integration primarily consists of the Smart Agent monitor `collectd/genericjmx`.
-Below is an overview of that monitor.**
+This integration primarily consists of the Smart Agent monitor `collectd/genericjmx`.
+Below is an overview of that monitor.
 
 ### Smart Agent Monitor
 
@@ -129,7 +129,7 @@ monitors: # All monitor config goes under this key
 ```
 
 **For a list of monitor options that are common to all monitors, see [Common
-Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).**
+Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).**
 
 
 | Config option | Required | Type | Description |
@@ -207,15 +207,15 @@ monitors` after configuring this monitor in a running agent instance.
 
 ### Legacy non-default metrics (version < 4.7.0)
 
-**The following information only applies to agent version older than 4.7.0. If
+**The following information only applies to agent versions prior to 4.7.0. If
 you have a newer agent and have set `enableBuiltInFiltering: true` at the top
 level of your agent config, see the section above. See upgrade instructions in
-[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).**
+[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).**
 
 If you have a reference to the `whitelist.json` in your agent's top-level
 `metricsToExclude` config option, and you want to emit metrics that are not in
-that whitelist, then you need to add an item to the top-level
-`metricsToInclude` config option to override that whitelist (see [Inclusion
-filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
+that allow list, then you need to add an item to the top-level
+`metricsToInclude` config option to override that allow list (see [Inclusion
+filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
 copy the whitelist.json, modify it, and reference that in `metricsToExclude`.
 
diff --git a/jenkins/SMART_AGENT_MONITOR.md b/jenkins/SMART_AGENT_MONITOR.md
index e675c5420..6e8baaaaf 100644
--- a/jenkins/SMART_AGENT_MONITOR.md
+++ b/jenkins/SMART_AGENT_MONITOR.md
@@ -12,8 +12,8 @@ configuration instructions below.
 
 ## Description
 
-**This integration primarily consists of the Smart Agent monitor `collectd/jenkins`.
-Below is an overview of that monitor.**
+This integration primarily consists of the Smart Agent monitor `collectd/jenkins`.
+Below is an overview of that monitor.
 
 ### Smart Agent Monitor
 
@@ -22,9 +22,9 @@ Monitors jenkins by using the [jenkins collectd Python
 plugin](https://github.com/signalfx/collectd-jenkins), which collects
 metrics from Jenkins instances by hitting these endpoints:
-[../api/json](https://wiki.jenkins.io/display/jenkins/remote+access+api)
+[../api/json](https://www.jenkins.io/doc/book/using/remote-access-api/)
 (job metrics) and
-[metrics/<MetricsKey>/..](https://wiki.jenkins.io/display/JENKINS/Metrics+Plugin)
+[metrics/<MetricsKey>/..](https://plugins.jenkins.io/metrics/)
 (default and optional Codahale/Dropwizard JVM metrics).
 
 Requires Jenkins 1.580.3 or later, as well as the Jenkins Metrics Plugin (see Setup).
@@ -85,7 +85,7 @@ monitors: # All monitor config goes under this key
 ```
 
 **For a list of monitor options that are common to all monitors, see [Common
-Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).**
+Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).**
 
 
 | Config option | Required | Type | Description |
@@ -93,6 +93,7 @@ Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monit
 | `pythonBinary` | no | `string` | Path to a python binary that should be used to execute the Python code. If not set, a built-in runtime will be used. Can include arguments to the binary as well. |
 | `host` | **yes** | `string` | |
 | `port` | **yes** | `integer` | |
+| `path` | no | `string` | |
 | `metricsKey` | **yes** | `string` | Key required for collecting metrics. The access key located at `Manage Jenkins > Configure System > Metrics > ADD.` If empty, click `Generate`. |
 | `enhancedMetrics` | no | `bool` | Whether to enable enhanced metrics (**default:** `false`) |
 | `excludeJobMetrics` | no | `bool` | Set to *true* to exclude job metrics retrieved from `/api/json` endpoint (**default:** `false`) |
@@ -129,30 +130,5 @@ These are the metrics available for this integration.
  - ***`gauge.jenkins.node.vm.memory.non-heap.used`*** (*gauge*)
    Total amount of non-heap memory used
  - ***`gauge.jenkins.node.vm.memory.total.used`*** (*gauge*)
    Total Memory used by instance
 
-### Non-default metrics (version 4.7.0+)
-
-**The following information applies to the agent version 4.7.0+ that has
-`enableBuiltInFiltering: true` set on the top level of the agent config.**
-
-To emit metrics that are not _default_, you can add those metrics in the
-generic monitor-level `extraMetrics` config option. Metrics that are derived
-from specific configuration options that do not appear in the above list of
-metrics do not need to be added to `extraMetrics`.
-
-To see a list of metrics that will be emitted you can run `agent-status
-monitors` after configuring this monitor in a running agent instance.
-
-### Legacy non-default metrics (version < 4.7.0)
-
-**The following information only applies to agent version older than 4.7.0. If
-you have a newer agent and have set `enableBuiltInFiltering: true` at the top
-level of your agent config, see the section above. See upgrade instructions in
-[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).**
-
-If you have a reference to the `whitelist.json` in your agent's top-level
-`metricsToExclude` config option, and you want to emit metrics that are not in
-that whitelist, then you need to add an item to the top-level
-`metricsToInclude` config option to override that whitelist (see [Inclusion
-filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
-copy the whitelist.json, modify it, and reference that in `metricsToExclude`.
-
+The agent does not do any built-in filtering of metrics coming out of this
+monitor.
diff --git a/jenkins/metrics.yaml b/jenkins/metrics.yaml
index 24e1efb46..c686758b7 100644
--- a/jenkins/metrics.yaml
+++ b/jenkins/metrics.yaml
@@ -2,7 +2,7 @@
 
 gauge.jenkins.job.duration:
   brief: Time taken to complete the job in ms
-  custom: false
+  custom: true
   description: Time taken to complete the job in ms.
   metric_type: gauge
   monitor: collectd/jenkins
@@ -10,7 +10,7 @@ gauge.jenkins.job.duration:
 
 gauge.jenkins.node.executor.count.value:
   brief: Total Number of executors in an instance
-  custom: false
+  custom: true
   description: Total Number of executors in an instance
   metric_type: gauge
   monitor: collectd/jenkins
@@ -18,7 +18,7 @@ gauge.jenkins.node.executor.count.value:
 
 gauge.jenkins.node.executor.in-use.value:
   brief: Total number of executors being used in an instance
-  custom: false
+  custom: true
   description: Total number of executors being used in an instance
   metric_type: gauge
   monitor: collectd/jenkins
@@ -26,7 +26,7 @@ gauge.jenkins.node.executor.in-use.value:
 
 gauge.jenkins.node.health-check.score:
   brief: Mean health score of an instance
-  custom: false
+  custom: true
   description: Mean health score of an instance
   metric_type: gauge
   monitor: collectd/jenkins
@@ -34,7 +34,7 @@ gauge.jenkins.node.health-check.score:
 
 gauge.jenkins.node.health.disk.space:
   brief: Binary value of disk space health
-  custom: false
+  custom: true
   description: Binary value of disk space health
   metric_type: gauge
   monitor: collectd/jenkins
@@ -42,7 +42,7 @@ gauge.jenkins.node.health.disk.space:
 
 gauge.jenkins.node.health.plugins:
   brief: Boolean value indicating state of plugins
-  custom: false
+  custom: true
   description: Boolean value indicating state of plugins
   metric_type: gauge
   monitor: collectd/jenkins
@@ -50,7 +50,7 @@ gauge.jenkins.node.health.plugins:
 
 gauge.jenkins.node.health.temporary.space:
   brief: Binary value of temporary space health
-  custom: false
+  custom: true
   description: Binary value of temporary space health
   metric_type: gauge
   monitor: collectd/jenkins
@@ -58,7 +58,7 @@ gauge.jenkins.node.health.temporary.space:
 
 gauge.jenkins.node.health.thread-deadlock:
   brief: Boolean value indicating a deadlock
-  custom: false
+  custom: true
   description: Boolean value indicating a deadlock
   metric_type: gauge
   monitor: collectd/jenkins
@@ -66,7 +66,7 @@ gauge.jenkins.node.health.thread-deadlock:
 
 gauge.jenkins.node.online.status:
   brief: Boolean value of instance is reachable or not
-  custom: false
+  custom: true
   description: Boolean value of instance is reachable or not
   metric_type: gauge
   monitor: collectd/jenkins
@@ -74,7 +74,7 @@ gauge.jenkins.node.online.status:
 
 gauge.jenkins.node.queue.size.value:
   brief: Total number pending jobs in queue
-  custom: false
+  custom: true
   description: Total number pending jobs in queue
   metric_type: gauge
   monitor: collectd/jenkins
@@ -82,7 +82,7 @@ gauge.jenkins.node.queue.size.value:
 
 gauge.jenkins.node.slave.online.status:
   brief: Boolean value for slave is reachable or not
-  custom: false
+  custom: true
   description: Boolean value for slave is reachable or not
   metric_type: gauge
   monitor: collectd/jenkins
@@ -90,7 +90,7 @@ gauge.jenkins.node.slave.online.status:
 
 gauge.jenkins.node.vm.memory.heap.usage:
   brief: Percent utilization of the heap memory
-  custom: false
+  custom: true
   description: Percent utilization of the heap memory
   metric_type: gauge
   monitor: collectd/jenkins
@@ -98,7 +98,7 @@ gauge.jenkins.node.vm.memory.heap.usage:
 
 gauge.jenkins.node.vm.memory.non-heap.used:
   brief: Total amount of non-heap memory used
-  custom: false
+  custom: true
   description: Total amount of non-heap memory used
   metric_type: gauge
   monitor: collectd/jenkins
@@ -106,7 +106,7 @@ gauge.jenkins.node.vm.memory.non-heap.used:
 
 gauge.jenkins.node.vm.memory.total.used:
   brief: Total Memory used by instance
-  custom: false
+  custom: true
   description: Total Memory used by instance
   metric_type: gauge
   monitor: collectd/jenkins
diff --git a/kafka/SMART_AGENT_MONITOR.md b/kafka/SMART_AGENT_MONITOR.md
index 74c6dab66..c249ec464 100644
--- a/kafka/SMART_AGENT_MONITOR.md
+++ b/kafka/SMART_AGENT_MONITOR.md
@@ -12,20 +12,20 @@ configuration instructions below.
 
 ## Description
 
-**This integration primarily consists of the Smart Agent monitor `collectd/kafka`.
-Below is an overview of that monitor.**
+This integration primarily consists of the Smart Agent monitor `collectd/kafka`.
+Below is an overview of that monitor.
 
 ### Smart Agent Monitor
 
 Monitors a Kafka instance using collectd's GenericJMX plugin. See the
 [collectd/genericjmx
-monitor](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/./collectd-genericjmx.md)[](sfx_link:java) for more information on
+monitor](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/./collectd-genericjmx.md)[](sfx_link:java) for more information on
 how to configure custom MBeans, as well as information on troubleshooting JMX
 setup.
 
 This monitor has a set of [built in MBeans
-configured](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/kafka/mbeans.go)
+configured](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/kafka/mbeans.go)
 for which it pulls metrics from Kafka's JMX endpoint.
 
 Note that this monitor supports Kafka v0.8.2.x and above. For Kafka v1.x.x and above,
@@ -50,7 +50,7 @@ monitors: # All monitor config goes under this key
 ```
 
 **For a list of monitor options that are common to all monitors, see [Common
-Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).**
+Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).**
 
 
 | Config option | Required | Type | Description |
@@ -154,15 +154,15 @@ monitors` after configuring this monitor in a running agent instance.
 
 ### Legacy non-default metrics (version < 4.7.0)
 
-**The following information only applies to agent version older than 4.7.0. If
+**The following information only applies to agent versions prior to 4.7.0. If
 you have a newer agent and have set `enableBuiltInFiltering: true` at the top
 level of your agent config, see the section above. See upgrade instructions in
-[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).**
+[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).**
 
 If you have a reference to the `whitelist.json` in your agent's top-level
 `metricsToExclude` config option, and you want to emit metrics that are not in
-that whitelist, then you need to add an item to the top-level
-`metricsToInclude` config option to override that whitelist (see [Inclusion
-filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
+that allow list, then you need to add an item to the top-level
+`metricsToInclude` config option to override that allow list (see [Inclusion
+filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
 copy the whitelist.json, modify it, and reference that in `metricsToExclude`.
 
diff --git a/kong/SMART_AGENT_MONITOR.md b/kong/SMART_AGENT_MONITOR.md
index e161c82d5..d4d063a69 100644
--- a/kong/SMART_AGENT_MONITOR.md
+++ b/kong/SMART_AGENT_MONITOR.md
@@ -12,8 +12,8 @@ configuration instructions below.
 
 ## Description
 
-**This integration primarily consists of the Smart Agent monitor `collectd/kong`.
-Below is an overview of that monitor.**
+This integration primarily consists of the Smart Agent monitor `collectd/kong`.
+Below is an overview of that monitor.
 
 ### Smart Agent Monitor
 
@@ -62,7 +62,7 @@ This plugin requires:
 
 | Software | Version |
 |-------------------|----------------|
-| Kong | 0.11.2+ |
+| Kong Community Edition (CE) | 0.11.2+ |
 | Configured [kong-plugin-signalfx](https://github.com/signalfx/kong-plugin-signalfx) | 0.0.1+ |
 
 
@@ -83,7 +83,7 @@ monitors:
         report: false
 ```
 
-Sample YAML configuration with custom /signalfx route and white and blacklists
+Sample YAML configuration with custom /signalfx route and filter lists
 
 ```yaml
 monitors:
@@ -121,7 +121,7 @@ monitors: # All monitor config goes under this key
 ```
 
 **For a list of monitor options that are common to all monitors, see [Common
-Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).**
+Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).**
 
 
 | Config option | Required | Type | Description |
@@ -208,15 +208,15 @@ monitors` after configuring this monitor in a running agent instance.
 
 ### Legacy non-default metrics (version < 4.7.0)
 
-**The following information only applies to agent version older than 4.7.0. If
+**The following information only applies to agent versions prior to 4.7.0. If
 you have a newer agent and have set `enableBuiltInFiltering: true` at the top
 level of your agent config, see the section above. See upgrade instructions in
-[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).**
+[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).**
 
 If you have a reference to the `whitelist.json` in your agent's top-level
 `metricsToExclude` config option, and you want to emit metrics that are not in
-that whitelist, then you need to add an item to the top-level
-`metricsToInclude` config option to override that whitelist (see [Inclusion
-filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
+that allow list, then you need to add an item to the top-level
+`metricsToInclude` config option to override that allow list (see [Inclusion
+filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
 copy the whitelist.json, modify it, and reference that in `metricsToExclude`.
 
diff --git a/lib-java/README.md b/lib-java/README.md
index a9a62a1b9..dc4f2d7f1 100644
--- a/lib-java/README.md
+++ b/lib-java/README.md
@@ -1,4 +1,4 @@
-# ![](./img/integrations_java.png) Java instrumentation for Splunk Observability Cloud
+# ![](./img/integrations_java.png) Java client library for SignalFx
 
 
 - [Description](#description)
@@ -7,14 +7,12 @@
 
 ### DESCRIPTION
 
-Java instrumentation is available from the
-[Splunk Distribution of OpenTelemetry Java](https://github.com/signalfx/splunk-otel-java). This instrumentation agent provides
-feature-rich auto-instrumentation with very little manual configuration.
-It is built on top of the industry standard [OpenTelemetry](https://opentelemetry.io/).
+This repository contains libraries for instrumenting Java applications and reporting metrics to Splunk Infrastructure Monitoring, using Codahale Metrics. + +You can also use the module `signalfx-java` to send metrics directly to Splunk Infrastructure Monitoring using protocol buffers, without using Codahale or Yammer metrics. + +For more information regarding installation, usage, and examples see https://github.com/signalfx/signalfx-java -Users building manual instrumentation can also leverage the -[OpenTelemetry Java SDK](https://github.com/open-telemetry/opentelemetry-java) -to send telemetry to Splunk. ### LICENSE diff --git a/logstash/SMART_AGENT_MONITOR.md b/logstash/SMART_AGENT_MONITOR.md index 51ca4201d..8d7786b63 100644 --- a/logstash/SMART_AGENT_MONITOR.md +++ b/logstash/SMART_AGENT_MONITOR.md @@ -12,8 +12,8 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitors `logstash`, `logstash-tcp`. -Below is an overview of the monitors.** +This integration primarily consists of the Smart Agent monitors `logstash`, `logstash-tcp`. +Below is an overview of the monitors. 
### Smart Agent Monitors @@ -122,7 +122,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | @@ -142,7 +142,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | @@ -298,15 +298,15 @@ monitors` after configuring this monitor in a running agent instance. ### Legacy non-default metrics (version < 4.7.0) -**The following information only applies to agent version older than 4.7.0. If +**The following information only applies to agent versions prior to 4.7.0. If you have a newer agent and have set `enableBuiltInFiltering: true` at the top level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).** +[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).** If you have a reference to the `whitelist.json` in your agent's top-level `metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just +that allow list, then you need to add an item to the top-level +`metricsToInclude` config option to override that allow list (see [Inclusion +filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just copy the whitelist.json, modify it, and reference that in `metricsToExclude`. diff --git a/marathon/SMART_AGENT_MONITOR.md b/marathon/SMART_AGENT_MONITOR.md index 4322f753b..2186b509c 100644 --- a/marathon/SMART_AGENT_MONITOR.md +++ b/marathon/SMART_AGENT_MONITOR.md @@ -12,8 +12,8 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitor `collectd/marathon`. -Below is an overview of that monitor.** +This integration primarily consists of the Smart Agent monitor `collectd/marathon`. +Below is an overview of that monitor. 
### Smart Agent Monitor @@ -59,7 +59,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | @@ -81,49 +81,24 @@ All metrics of this integration are emitted by default; however, **none are cate These are the metrics available for this integration. - - ***`gauge.marathon.app.cpu.allocated`*** (*gauge*)
Number of CPUs allocated to an application - - ***`gauge.marathon.app.cpu.allocated.per.instance`*** (*gauge*)
Configured number of CPUs allocated to each application instance - - `gauge.marathon.app.delayed` (*gauge*)
Indicates if the application is delayed or not - - `gauge.marathon.app.deployments.total` (*gauge*)
Number of application deployments - - ***`gauge.marathon.app.disk.allocated`*** (*gauge*)
Storage allocated to a Marathon application - - ***`gauge.marathon.app.disk.allocated.per.instance`*** (*gauge*)
Configured storage allocated to each application instance - - `gauge.marathon.app.gpu.allocated` (*gauge*)
GPU Allocated to a Marathon application - - `gauge.marathon.app.gpu.allocated.per.instance` (*gauge*)
Configured number of GPUs allocated to each application instance - - ***`gauge.marathon.app.instances.total`*** (*gauge*)
Number of application instances - - ***`gauge.marathon.app.memory.allocated`*** (*gauge*)
Memory Allocated to a Marathon application - - ***`gauge.marathon.app.memory.allocated.per.instance`*** (*gauge*)
Configured amount of memory allocated to each application instance - - ***`gauge.marathon.app.tasks.running`*** (*gauge*)
Number of tasks running for an application - - ***`gauge.marathon.app.tasks.staged`*** (*gauge*)
Number of tasks staged for an application - - ***`gauge.marathon.app.tasks.unhealthy`*** (*gauge*)
Number of unhealthy tasks for an application - - ***`gauge.marathon.task.healthchecks.failing.total`*** (*gauge*)
The number of failing health checks for a task - - ***`gauge.marathon.task.healthchecks.passing.total`*** (*gauge*)
The number of passing health checks for a task - - `gauge.marathon.task.staged.time.elapsed` (*gauge*)
The amount of time the task spent in staging - - `gauge.marathon.task.start.time.elapsed` (*gauge*)
Time elapsed since the task started - -### Non-default metrics (version 4.7.0+) - -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - -To emit metrics that are not _default_, you can add those metrics in the -generic monitor-level `extraMetrics` config option. Metrics that are derived -from specific configuration options that do not appear in the above list of -metrics do not need to be added to `extraMetrics`. - -To see a list of metrics that will be emitted you can run `agent-status -monitors` after configuring this monitor in a running agent instance. - -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - + - ***`gauge.service.mesosphere.marathon.app.cpu.allocated`*** (*gauge*)
Number of CPUs allocated to an application + - ***`gauge.service.mesosphere.marathon.app.cpu.allocated.per.instance`*** (*gauge*)
Configured number of CPUs allocated to each application instance + - `gauge.service.mesosphere.marathon.app.delayed` (*gauge*)
Indicates if the application is delayed or not + - `gauge.service.mesosphere.marathon.app.deployments.total` (*gauge*)
Number of application deployments + - ***`gauge.service.mesosphere.marathon.app.disk.allocated`*** (*gauge*)
Storage allocated to a Marathon application + - ***`gauge.service.mesosphere.marathon.app.disk.allocated.per.instance`*** (*gauge*)
Configured storage allocated to each application instance + - `gauge.service.mesosphere.marathon.app.gpu.allocated` (*gauge*)
GPU Allocated to a Marathon application + - `gauge.service.mesosphere.marathon.app.gpu.allocated.per.instance` (*gauge*)
Configured number of GPUs allocated to each application instance + - ***`gauge.service.mesosphere.marathon.app.instances.total`*** (*gauge*)
Number of application instances + - ***`gauge.service.mesosphere.marathon.app.memory.allocated`*** (*gauge*)
Memory Allocated to a Marathon application + - ***`gauge.service.mesosphere.marathon.app.memory.allocated.per.instance`*** (*gauge*)
Configured amount of memory allocated to each application instance + - ***`gauge.service.mesosphere.marathon.app.tasks.running`*** (*gauge*)
Number of tasks running for an application + - ***`gauge.service.mesosphere.marathon.app.tasks.staged`*** (*gauge*)
Number of tasks staged for an application + - ***`gauge.service.mesosphere.marathon.app.tasks.unhealthy`*** (*gauge*)
Number of unhealthy tasks for an application + - ***`gauge.service.mesosphere.marathon.task.healthchecks.failing.total`*** (*gauge*)
The number of failing health checks for a task + - ***`gauge.service.mesosphere.marathon.task.healthchecks.passing.total`*** (*gauge*)
The number of passing health checks for a task + - `gauge.service.mesosphere.marathon.task.staged.time.elapsed` (*gauge*)
The amount of time the task spent in staging + - `gauge.service.mesosphere.marathon.task.start.time.elapsed` (*gauge*)
Time elapsed since the task started + +The agent does not do any built-in filtering of metrics coming out of this +monitor. diff --git a/marathon/metrics.yaml b/marathon/metrics.yaml index 17a86fdb4..ca05675fa 100644 --- a/marathon/metrics.yaml +++ b/marathon/metrics.yaml @@ -1,146 +1,146 @@ # This file was generated in the Smart Agent repo and copied here, DO NOT EDIT HERE. -gauge.marathon.app.cpu.allocated: +gauge.service.mesosphere.marathon.app.cpu.allocated: brief: Number of CPUs allocated to an application - custom: false + custom: true description: Number of CPUs allocated to an application metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.app.cpu.allocated + title: gauge.service.mesosphere.marathon.app.cpu.allocated -gauge.marathon.app.cpu.allocated.per.instance: +gauge.service.mesosphere.marathon.app.cpu.allocated.per.instance: brief: Configured number of CPUs allocated to each application instance - custom: false + custom: true description: Configured number of CPUs allocated to each application instance metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.app.cpu.allocated.per.instance + title: gauge.service.mesosphere.marathon.app.cpu.allocated.per.instance -gauge.marathon.app.delayed: +gauge.service.mesosphere.marathon.app.delayed: brief: Indicates if the application is delayed or not custom: true description: Indicates if the application is delayed or not metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.app.delayed + title: gauge.service.mesosphere.marathon.app.delayed -gauge.marathon.app.deployments.total: +gauge.service.mesosphere.marathon.app.deployments.total: brief: Number of application deployments custom: true description: Number of application deployments metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.app.deployments.total + title: gauge.service.mesosphere.marathon.app.deployments.total -gauge.marathon.app.disk.allocated: 
+gauge.service.mesosphere.marathon.app.disk.allocated: brief: Storage allocated to a Marathon application - custom: false + custom: true description: Storage allocated to a Marathon application metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.app.disk.allocated + title: gauge.service.mesosphere.marathon.app.disk.allocated -gauge.marathon.app.disk.allocated.per.instance: +gauge.service.mesosphere.marathon.app.disk.allocated.per.instance: brief: Configured storage allocated each to application instance - custom: false + custom: true description: Configured storage allocated each to application instance metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.app.disk.allocated.per.instance + title: gauge.service.mesosphere.marathon.app.disk.allocated.per.instance -gauge.marathon.app.gpu.allocated: +gauge.service.mesosphere.marathon.app.gpu.allocated: brief: GPU Allocated to a Marathon application custom: true description: GPU Allocated to a Marathon application metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.app.gpu.allocated + title: gauge.service.mesosphere.marathon.app.gpu.allocated -gauge.marathon.app.gpu.allocated.per.instance: +gauge.service.mesosphere.marathon.app.gpu.allocated.per.instance: brief: Configured number of GPUs allocated to each application instance custom: true description: Configured number of GPUs allocated to each application instance metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.app.gpu.allocated.per.instance + title: gauge.service.mesosphere.marathon.app.gpu.allocated.per.instance -gauge.marathon.app.instances.total: +gauge.service.mesosphere.marathon.app.instances.total: brief: Number of application instances - custom: false + custom: true description: Number of application instances metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.app.instances.total + title: gauge.service.mesosphere.marathon.app.instances.total 
-gauge.marathon.app.memory.allocated: +gauge.service.mesosphere.marathon.app.memory.allocated: brief: Memory Allocated to a Marathon application - custom: false + custom: true description: Memory Allocated to a Marathon application metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.app.memory.allocated + title: gauge.service.mesosphere.marathon.app.memory.allocated -gauge.marathon.app.memory.allocated.per.instance: +gauge.service.mesosphere.marathon.app.memory.allocated.per.instance: brief: Configured amount of memory allocated to each application instance - custom: false + custom: true description: Configured amount of memory allocated to each application instance metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.app.memory.allocated.per.instance + title: gauge.service.mesosphere.marathon.app.memory.allocated.per.instance -gauge.marathon.app.tasks.running: +gauge.service.mesosphere.marathon.app.tasks.running: brief: Number tasks running for an application - custom: false + custom: true description: Number tasks running for an application metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.app.tasks.running + title: gauge.service.mesosphere.marathon.app.tasks.running -gauge.marathon.app.tasks.staged: +gauge.service.mesosphere.marathon.app.tasks.staged: brief: Number tasks staged for an application - custom: false + custom: true description: Number tasks staged for an application metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.app.tasks.staged + title: gauge.service.mesosphere.marathon.app.tasks.staged -gauge.marathon.app.tasks.unhealthy: +gauge.service.mesosphere.marathon.app.tasks.unhealthy: brief: Number unhealthy tasks for an application - custom: false + custom: true description: Number unhealthy tasks for an application metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.app.tasks.unhealthy + title: gauge.service.mesosphere.marathon.app.tasks.unhealthy 
-gauge.marathon.task.healthchecks.failing.total: +gauge.service.mesosphere.marathon.task.healthchecks.failing.total: brief: The number of failing health checks for a task - custom: false + custom: true description: The number of failing health checks for a task metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.task.healthchecks.failing.total + title: gauge.service.mesosphere.marathon.task.healthchecks.failing.total -gauge.marathon.task.healthchecks.passing.total: +gauge.service.mesosphere.marathon.task.healthchecks.passing.total: brief: The number of passing health checks for a task - custom: false + custom: true description: The number of passing health checks for a task metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.task.healthchecks.passing.total + title: gauge.service.mesosphere.marathon.task.healthchecks.passing.total -gauge.marathon.task.staged.time.elapsed: +gauge.service.mesosphere.marathon.task.staged.time.elapsed: brief: The amount of time the task spent in staging custom: true description: The amount of time the task spent in staging metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.task.staged.time.elapsed + title: gauge.service.mesosphere.marathon.task.staged.time.elapsed -gauge.marathon.task.start.time.elapsed: +gauge.service.mesosphere.marathon.task.start.time.elapsed: brief: Time elapsed since the task started custom: true description: Time elapsed since the task started metric_type: gauge monitor: collectd/marathon - title: gauge.marathon.task.start.time.elapsed + title: gauge.service.mesosphere.marathon.task.start.time.elapsed diff --git a/memcached/SMART_AGENT_MONITOR.md b/memcached/SMART_AGENT_MONITOR.md index 0d5913f28..a6dbe4379 100644 --- a/memcached/SMART_AGENT_MONITOR.md +++ b/memcached/SMART_AGENT_MONITOR.md @@ -12,8 +12,8 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitor `collectd/memcached`. 
-Below is an overview of that monitor.** +This integration primarily consists of the Smart Agent monitor `collectd/memcached`. +Below is an overview of that monitor. ### Smart Agent Monitor @@ -44,7 +44,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | @@ -104,15 +104,15 @@ monitors` after configuring this monitor in a running agent instance. ### Legacy non-default metrics (version < 4.7.0) -**The following information only applies to agent version older than 4.7.0. If +**The following information only applies to agent versions prior to 4.7.0. If you have a newer agent and have set `enableBuiltInFiltering: true` at the top level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).** +[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).** If you have a reference to the `whitelist.json` in your agent's top-level `metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). 
Or you can just +that allow list, then you need to add an item to the top-level +`metricsToInclude` config option to override that allow list (see [Inclusion +filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just copy the whitelist.json, modify it, and reference that in `metricsToExclude`. diff --git a/mongodb/SMART_AGENT_MONITOR.md b/mongodb/SMART_AGENT_MONITOR.md index ea9753702..bc39e89f3 100644 --- a/mongodb/SMART_AGENT_MONITOR.md +++ b/mongodb/SMART_AGENT_MONITOR.md @@ -12,8 +12,8 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitor `collectd/mongodb`. -Below is an overview of that monitor.** +This integration primarily consists of the Smart Agent monitor `collectd/mongodb`. +Below is an overview of that monitor. ### Smart Agent Monitor @@ -85,7 +85,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | @@ -115,7 +115,7 @@ These are the metrics available for this integration. - `counter.asserts.regular` (*cumulative*)
The number of regular assertions raised since the MongoDB process started. Check the log file for more information about these messages. - `counter.asserts.warning` (*cumulative*)
In MongoDB 3.x and earlier, the field returns the number of warnings raised since the MongoDB process started. In MongoDB 4, this is always 0. - - ***`counter.backgroundFlushing.flushes`*** (*gauge*)
Number of times the database has been flushed + - ***`counter.backgroundFlushing.flushes`*** (*gauge*)
Number of times the database has been flushed. Only available when MMAPv1 is enabled. (MMAPv1 is not supported in MongoDB version > 4.2) - ***`counter.extra_info.page_faults`*** (*gauge*)
Mongod page faults - `counter.lock.Database.acquireCount.intentExclusive` (*cumulative*)
- `counter.lock.Database.acquireCount.intentShared` (*cumulative*)
@@ -139,16 +139,19 @@ These are the metrics available for this integration. - `counter.opcountersRepl.insert` (*cumulative*)
Number of replicated inserts since last restart - `counter.opcountersRepl.query` (*cumulative*)
Number of replicated queries since last restart - `counter.opcountersRepl.update` (*cumulative*)
Number of replicated updates since last restart - - ***`gauge.backgroundFlushing.average_ms`*** (*gauge*)
Average time (ms) to write data to disk - - ***`gauge.backgroundFlushing.last_ms`*** (*gauge*)
Most recent time (ms) spent writing data to disk + - ***`gauge.backgroundFlushing.average_ms`*** (*gauge*)
Average time (ms) to write data to disk. Only available when MMAPv1 is enabled. (MMAPv1 is not supported in MongoDB version > 4.2) + - ***`gauge.backgroundFlushing.last_ms`*** (*gauge*)
Most recent time (ms) spent writing data to disk. Only available when MMAPv1 is enabled. (MMAPv1 is not supported in MongoDB version > 4.2) - `gauge.collection.max` (*gauge*)
Maximum number of documents in a capped collection - `gauge.collection.maxSize` (*gauge*)
Maximum disk usage of a capped collection - `gauge.collections` (*gauge*)
Number of collections - - `gauge.connections.available` (*gauge*)
Number of available incoming connections - - ***`gauge.connections.current`*** (*gauge*)
Number of current client connections + - `gauge.connections.available` (*gauge*)
The number of unused incoming connections available. Consider this value + in combination with the value of `gauge.connections.current` to + understand the connection load on the database. + + - ***`gauge.connections.current`*** (*gauge*)
The number of incoming connections from clients to the database server. - `gauge.connections.totalCreated` (*cumulative*)
Count of all incoming connections created to the server. This number includes connections that have since closed. - ***`gauge.dataSize`*** (*gauge*)
Total size of data, in bytes - - ***`gauge.extra_info.heap_usage_bytes`*** (*gauge*)
Heap size used by the mongod process, in bytes + - ***`gauge.extra_info.heap_usage_bytes`*** (*gauge*)
Heap size used by the mongod process, in bytes. Deprecated in MongoDB versions > 3.3; use `gauge.tcmalloc.generic.heap_size` instead. - ***`gauge.globalLock.activeClients.readers`*** (*gauge*)
Number of active client connections performing reads - `gauge.globalLock.activeClients.total` (*gauge*)
Total number of active client connections - ***`gauge.globalLock.activeClients.writers`*** (*gauge*)
Number of active client connections performing writes @@ -157,12 +160,19 @@ These are the metrics available for this integration. - ***`gauge.globalLock.currentQueue.writers`*** (*gauge*)
Write operations currently in queue - ***`gauge.indexSize`*** (*gauge*)
Total size of indexes, in bytes - `gauge.indexes` (*gauge*)
Number of indexes across all collections - - ***`gauge.mem.mapped`*** (*gauge*)
MongoDB mapped memory usage, in MB + - ***`gauge.mem.mapped`*** (*gauge*)
MongoDB mapped memory usage, in MB. Only available when MMAPv1 is enabled. (MMAPv1 is not supported in MongoDB version > 4.2) - ***`gauge.mem.resident`*** (*gauge*)
MongoDB resident memory usage, in MB - ***`gauge.mem.virtual`*** (*gauge*)
MongoDB virtual memory usage, in MB - `gauge.numExtents` (*gauge*)
- ***`gauge.objects`*** (*gauge*)
Number of documents across all collections + - ***`gauge.repl.active_nodes`*** (*gauge*)
Number of healthy members in a replicaset (reporting 1 for [health](https://docs.mongodb.com/manual/reference/command/replSetGetStatus/#replSetGetStatus.members[n].health)). + - ***`gauge.repl.is_primary_node`*** (*gauge*)
Reports 1 when the member [state](https://docs.mongodb.com/manual/reference/command/replSetGetStatus/#replSetGetStatus.members[n].stateStr) of the replica set is `PRIMARY` and 2 otherwise. + - ***`gauge.repl.max_lag`*** (*gauge*)
Replica lag in seconds calculated from the difference between the + timestamp of the last oplog entry of primary and secondary [see mongo + doc](https://docs.mongodb.com/manual/reference/command/replSetGetStatus/#replSetGetStatus.members[n].optimeDate). + - ***`gauge.storageSize`*** (*gauge*)
Total bytes allocated to collections for document storage + - ***`gauge.tcmalloc.generic.heap_size`*** (*gauge*)
Heap size used by the mongod process, in bytes. Same as gauge.extra_info.heap_usage_bytes but supports 64-bit values. - ***`gauge.uptime`*** (*counter*)
Uptime of this server in milliseconds #### Group collection @@ -214,16 +224,16 @@ monitors` after configuring this monitor in a running agent instance. ### Legacy non-default metrics (version < 4.7.0) -**The following information only applies to agent version older than 4.7.0. If +**The following information only applies to agent versions prior to 4.7.0. If you have a newer agent and have set `enableBuiltInFiltering: true` at the top level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).** +[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).** If you have a reference to the `whitelist.json` in your agent's top-level `metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just +that allow list, then you need to add an item to the top-level +`metricsToInclude` config option to override that allow list (see [Inclusion +filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
## Dimensions diff --git a/mongodb/metrics.yaml b/mongodb/metrics.yaml index e0d2f3e56..fc95b7bd9 100644 --- a/mongodb/metrics.yaml +++ b/mongodb/metrics.yaml @@ -22,7 +22,8 @@ counter.asserts.warning: counter.backgroundFlushing.flushes: brief: Number of times the database has been flushed custom: false - description: Number of times the database has been flushed + description: Number of times the database has been flushed. Only available when + MMAPv1 is enabled. (MMAPv1 is not supported in MongoDB version > 4.2) metric_type: gauge monitor: collectd/mongodb title: counter.backgroundFlushing.flushes @@ -366,7 +367,8 @@ counter.opcountersRepl.update: gauge.backgroundFlushing.average_ms: brief: Average time (ms) to write data to disk custom: false - description: Average time (ms) to write data to disk + description: Average time (ms) to write data to disk. Only available when MMAPv1 + is enabled. (MMAPv1 is not supported in MongoDB version > 4.2) metric_type: gauge monitor: collectd/mongodb title: gauge.backgroundFlushing.average_ms @@ -374,7 +376,8 @@ gauge.backgroundFlushing.average_ms: gauge.backgroundFlushing.last_ms: brief: Most recent time (ms) spent writing data to disk custom: false - description: Most recent time (ms) spent writing data to disk + description: Most recent time (ms) spent writing data to disk. Only available when + MMAPv1 is enabled. (MMAPv1 is not supported in MongoDB version > 4.2) metric_type: gauge monitor: collectd/mongodb title: gauge.backgroundFlushing.last_ms @@ -444,17 +447,19 @@ gauge.collections: title: gauge.collections gauge.connections.available: - brief: Number of available incoming connections + brief: The number of unused incoming connections available custom: true - description: Number of available incoming connections + description: "The number of unused incoming connections available. 
Consider this\ + \ value \nin combination with the value of `gauge.connections.current` to \nunderstand\ + \ the connection load on the database." metric_type: gauge monitor: collectd/mongodb title: gauge.connections.available gauge.connections.current: - brief: Number of current client connections + brief: The number of incoming connections from clients to the database server custom: false - description: Number of current client connections + description: The number of incoming connections from clients to the database server. metric_type: gauge monitor: collectd/mongodb title: gauge.connections.current @@ -479,7 +484,8 @@ gauge.dataSize: gauge.extra_info.heap_usage_bytes: brief: Heap size used by the mongod process, in bytes custom: false - description: Heap size used by the mongod process, in bytes + description: Heap size used by the mongod process, in bytes. Deprecated in mongo + version > 3.3, use gauge.tcmalloc.generic.heap_size instead. metric_type: gauge monitor: collectd/mongodb title: gauge.extra_info.heap_usage_bytes @@ -551,7 +557,8 @@ gauge.indexes: gauge.mem.mapped: brief: Mongodb mapped memory usage, in MB custom: false - description: Mongodb mapped memory usage, in MB + description: Mongodb mapped memory usage, in MB. Only available when MMAPv1 is enabled. + (MMAPv1 is not supported in MongoDB version > 4.2) metric_type: gauge monitor: collectd/mongodb title: gauge.mem.mapped @@ -588,6 +595,34 @@ gauge.objects: monitor: collectd/mongodb title: gauge.objects +gauge.repl.active_nodes: + brief: Number of healthy members in a replicaset (reporting 1 for [health](https://docs.mongodb.com/manual/reference/command/replSetGetStatus/#replSetGetStatus.members[n].health)) + custom: false + description: Number of healthy members in a replicaset (reporting 1 for [health](https://docs.mongodb.com/manual/reference/command/replSetGetStatus/#replSetGetStatus.members[n].health)). 
+ metric_type: gauge + monitor: collectd/mongodb + title: gauge.repl.active_nodes + +gauge.repl.is_primary_node: + brief: Reports 1 when the member [state](https://docs.mongodb.com/manual/reference/command/replSetGetStatus/#replSetGetStatus.members[n].stateStr) + of the replica set is `PRIMARY` and 2 otherwise + custom: false + description: Reports 1 when the member [state](https://docs.mongodb.com/manual/reference/command/replSetGetStatus/#replSetGetStatus.members[n].stateStr) + of the replica set is `PRIMARY` and 2 otherwise. + metric_type: gauge + monitor: collectd/mongodb + title: gauge.repl.is_primary_node + +gauge.repl.max_lag: + brief: "Replica lag in seconds calculated from the difference between the \ntimestamp\ + \ of the last oplog entry of primary and secondary [see mongo \ndoc](https://docs.mongodb.com/manual/reference/command/replSetGetStatus/#replSetGetStatus.members[n].optimeDate)" + custom: false + description: "Replica lag in seconds calculated from the difference between the\ + \ \ntimestamp of the last oplog entry of primary and secondary [see mongo \ndoc](https://docs.mongodb.com/manual/reference/command/replSetGetStatus/#replSetGetStatus.members[n].optimeDate)." + metric_type: gauge + monitor: collectd/mongodb + title: gauge.repl.max_lag + gauge.storageSize: brief: Total bytes allocated to collections for document storage custom: false @@ -596,6 +631,15 @@ gauge.storageSize: monitor: collectd/mongodb title: gauge.storageSize +gauge.tcmalloc.generic.heap_size: + brief: Heap size used by the mongod process, in bytes + custom: false + description: Heap size used by the mongod process, in bytes. Same as gauge.extra_info.heap_usage_bytes + but supports 64-bit values.
+ metric_type: gauge + monitor: collectd/mongodb + title: gauge.tcmalloc.generic.heap_size + gauge.uptime: brief: Uptime of this server in milliseconds custom: false diff --git a/mysql/metrics.yaml b/mysql/metrics.yaml index 6f7eef2df..f00955abe 100644 --- a/mysql/metrics.yaml +++ b/mysql/metrics.yaml @@ -50,6 +50,145 @@ cache_size.qcache: monitor: collectd/mysql title: cache_size.qcache +mysql_bpool_bytes.data: + brief: The total number of bytes in the InnoDB buffer pool containing data + custom: true + description: The total number of bytes in the InnoDB buffer pool containing data. + The number includes both dirty and clean pages. + metric_type: gauge + monitor: collectd/mysql + title: mysql_bpool_bytes.data + +mysql_bpool_bytes.dirty: + brief: The total current number of bytes held in dirty pages in the InnoDB buffer + pool + custom: true + description: The total current number of bytes held in dirty pages in the InnoDB + buffer pool. + metric_type: gauge + monitor: collectd/mysql + title: mysql_bpool_bytes.dirty + +mysql_bpool_counters.pages_flushed: + brief: The number of requests to flush pages from the InnoDB buffer pool + custom: true + description: The number of requests to flush pages from the InnoDB buffer pool. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_bpool_counters.pages_flushed + +mysql_bpool_counters.read_ahead: + brief: The number of pages read into the InnoDB buffer pool by the read-ahead background + thread + custom: true + description: The number of pages read into the InnoDB buffer pool by the read-ahead + background thread. 
+ metric_type: cumulative + monitor: collectd/mysql + title: mysql_bpool_counters.read_ahead + +mysql_bpool_counters.read_ahead_evicted: + brief: The number of pages read into the InnoDB buffer pool by the read-ahead background + thread that were subsequently evicted without having been accessed by queries + custom: true + description: The number of pages read into the InnoDB buffer pool by the read-ahead + background thread that were subsequently evicted without having been accessed + by queries. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_bpool_counters.read_ahead_evicted + +mysql_bpool_counters.read_ahead_rnd: + brief: "The number of \u201Crandom\u201D read-aheads initiated by InnoDB" + custom: true + description: "The number of \u201Crandom\u201D read-aheads initiated by InnoDB.\ + \ This happens when a query scans a large portion of a table but in random order." + metric_type: cumulative + monitor: collectd/mysql + title: mysql_bpool_counters.read_ahead_rnd + +mysql_bpool_counters.read_requests: + brief: The number of logical read requests + custom: true + description: The number of logical read requests. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_bpool_counters.read_requests + +mysql_bpool_counters.reads: + brief: The number of logical reads that InnoDB could not satisfy from the buffer + pool, and had to read directly from disk + custom: true + description: The number of logical reads that InnoDB could not satisfy from the + buffer pool, and had to read directly from disk. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_bpool_counters.reads + +mysql_bpool_counters.wait_free: + brief: Normally, writes to the InnoDB buffer pool happen in the background + custom: true + description: Normally, writes to the InnoDB buffer pool happen in the background. 
+ When InnoDB needs to read or create a page and no clean pages are available, InnoDB + flushes some dirty pages first and waits for that operation to finish. This counter + counts instances of these waits. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_bpool_counters.wait_free + +mysql_bpool_counters.write_requests: + brief: The number of writes done to the InnoDB buffer pool + custom: true + description: The number of writes done to the InnoDB buffer pool. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_bpool_counters.write_requests + +mysql_bpool_pages.data: + brief: The number of pages in the InnoDB buffer pool containing data + custom: true + description: The number of pages in the InnoDB buffer pool containing data. The + number includes both dirty and clean pages. + metric_type: gauge + monitor: collectd/mysql + title: mysql_bpool_pages.data + +mysql_bpool_pages.dirty: + brief: The current number of dirty pages in the InnoDB buffer pool + custom: true + description: The current number of dirty pages in the InnoDB buffer pool. + metric_type: gauge + monitor: collectd/mysql + title: mysql_bpool_pages.dirty + +mysql_bpool_pages.free: + brief: The number of free pages in the InnoDB buffer pool + custom: true + description: The number of free pages in the InnoDB buffer pool. + metric_type: gauge + monitor: collectd/mysql + title: mysql_bpool_pages.free + +mysql_bpool_pages.misc: + brief: The number of pages in the InnoDB buffer pool that are busy because they + have been allocated for administrative overhead, such as row locks or the adaptive + hash index + custom: true + description: The number of pages in the InnoDB buffer pool that are busy because + they have been allocated for administrative overhead, such as row locks or the + adaptive hash index. 
+ metric_type: gauge + monitor: collectd/mysql + title: mysql_bpool_pages.misc + +mysql_bpool_pages.total: + brief: The total size of the InnoDB buffer pool, in pages + custom: true + description: The total size of the InnoDB buffer pool, in pages. + metric_type: gauge + monitor: collectd/mysql + title: mysql_bpool_pages.total + mysql_commands.admin_commands: brief: The number of MySQL ADMIN commands executed custom: true @@ -1246,6 +1385,179 @@ mysql_handler.write: monitor: collectd/mysql title: mysql_handler.write +mysql_innodb_data.fsyncs: + brief: The number of fsync() operations so far + custom: true + description: The number of fsync() operations so far. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_data.fsyncs + +mysql_innodb_data.read: + brief: The amount of data read since the server was started (in bytes) + custom: true + description: The amount of data read since the server was started (in bytes). + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_data.read + +mysql_innodb_data.reads: + brief: The total number of data reads (OS file reads) + custom: true + description: The total number of data reads (OS file reads). + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_data.reads + +mysql_innodb_data.writes: + brief: The total number of data writes + custom: true + description: The total number of data writes. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_data.writes + +mysql_innodb_data.written: + brief: The amount of data written so far, in bytes + custom: true + description: The amount of data written so far, in bytes. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_data.written + +mysql_innodb_dblwr.writes: + brief: The number of doublewrite operations that have been performed + custom: true + description: The number of doublewrite operations that have been performed. 
+ metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_dblwr.writes + +mysql_innodb_dblwr.written: + brief: The number of pages that have been written to the doublewrite buffer + custom: true + description: The number of pages that have been written to the doublewrite buffer. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_dblwr.written + +mysql_innodb_log.fsyncs: + brief: The number of fsync() writes done to the InnoDB redo log files + custom: true + description: The number of fsync() writes done to the InnoDB redo log files. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_log.fsyncs + +mysql_innodb_log.waits: + brief: The number of times that the log buffer was too small and a wait was required + for it to be flushed before continuing + custom: true + description: The number of times that the log buffer was too small and a wait was + required for it to be flushed before continuing. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_log.waits + +mysql_innodb_log.write_requests: + brief: The number of write requests for the InnoDB redo log + custom: true + description: The number of write requests for the InnoDB redo log. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_log.write_requests + +mysql_innodb_log.writes: + brief: The number of physical writes to the InnoDB redo log file + custom: true + description: The number of physical writes to the InnoDB redo log file. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_log.writes + +mysql_innodb_log.written: + brief: The number of bytes written to the InnoDB redo log files + custom: true + description: The number of bytes written to the InnoDB redo log files. 
+ metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_log.written + +mysql_innodb_pages.created: + brief: The number of pages created by operations on InnoDB tables + custom: true + description: The number of pages created by operations on InnoDB tables. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_pages.created + +mysql_innodb_pages.read: + brief: The number of pages read from the InnoDB buffer pool by operations on InnoDB + tables + custom: true + description: The number of pages read from the InnoDB buffer pool by operations + on InnoDB tables. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_pages.read + +mysql_innodb_pages.written: + brief: The number of pages written by operations on InnoDB tables + custom: true + description: The number of pages written by operations on InnoDB tables. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_pages.written + +mysql_innodb_row_lock.time: + brief: The total time spent in acquiring row locks for InnoDB tables, in milliseconds + custom: true + description: The total time spent in acquiring row locks for InnoDB tables, in milliseconds. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_row_lock.time + +mysql_innodb_row_lock.waits: + brief: The number of times operations on InnoDB tables had to wait for a row lock + custom: true + description: The number of times operations on InnoDB tables had to wait for a row + lock. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_row_lock.waits + +mysql_innodb_rows.deleted: + brief: The number of rows deleted from InnoDB tables + custom: true + description: The number of rows deleted from InnoDB tables. 
+ metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_rows.deleted + +mysql_innodb_rows.inserted: + brief: The number of rows inserted into InnoDB tables + custom: true + description: The number of rows inserted into InnoDB tables. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_rows.inserted + +mysql_innodb_rows.read: + brief: The number of rows read from InnoDB tables + custom: true + description: The number of rows read from InnoDB tables. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_rows.read + +mysql_innodb_rows.updated: + brief: The number of rows updated in InnoDB tables + custom: true + description: The number of rows updated in InnoDB tables. + metric_type: cumulative + monitor: collectd/mysql + title: mysql_innodb_rows.updated + mysql_locks.immediate: brief: The number of MySQL table locks which were granted immediately custom: false diff --git a/nginx/SMART_AGENT_MONITOR.md b/nginx/SMART_AGENT_MONITOR.md index 9a46fcb4d..26f6b073e 100644 --- a/nginx/SMART_AGENT_MONITOR.md +++ b/nginx/SMART_AGENT_MONITOR.md @@ -12,8 +12,8 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitor `collectd/nginx`. -Below is an overview of that monitor.** +This integration primarily consists of the Smart Agent monitor `collectd/nginx`. +Below is an overview of that monitor. ### Smart Agent Monitor @@ -46,7 +46,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | @@ -92,15 +92,15 @@ monitors` after configuring this monitor in a running agent instance. 
### Legacy non-default metrics (version < 4.7.0) -**The following information only applies to agent version older than 4.7.0. If +**The following information only applies to agent versions prior to 4.7.0. If you have a newer agent and have set `enableBuiltInFiltering: true` at the top level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).** +[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).** If you have a reference to the `whitelist.json` in your agent's top-level `metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just +that allow list, then you need to add an item to the top-level +`metricsToInclude` config option to override that allow list (see [Inclusion +filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just copy the whitelist.json, modify it, and reference that in `metricsToExclude`. diff --git a/openstack/SMART_AGENT_MONITOR.md b/openstack/SMART_AGENT_MONITOR.md index 47dd23c17..a3314b843 100644 --- a/openstack/SMART_AGENT_MONITOR.md +++ b/openstack/SMART_AGENT_MONITOR.md @@ -12,8 +12,8 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitor `collectd/openstack`. -Below is an overview of that monitor.** +This integration primarily consists of the Smart Agent monitor `collectd/openstack`. +Below is an overview of that monitor. 
### Smart Agent Monitor @@ -51,6 +51,21 @@ monitors: authURL: "http://192.168.11.111/identity/v3" username: "admin" password: "secret" + requestBatchSize: 10 + novaListServersSearchOpts: + all_tenants: "TRUE" + status: "ACTIVE" +``` +### Example config using skipVerify and disabling querying server metrics +```yaml +monitors: +- type: collectd/openstack + authURL: "https://192.168.11.111/identity/v3" + username: "admin" + password: "secret" + skipVerify: true + queryServerMetrics: false + queryHypervisorMetrics: false ``` @@ -66,7 +81,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | @@ -77,7 +92,14 @@ Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monit | `password` | **yes** | `string` | Password to authenticate with keystone identity | | `projectName` | no | `string` | Specify the name of Project to be monitored (**default**:"demo") | | `projectDomainID` | no | `string` | The project domain (**default**:"default") | +| `regionName` | no | `string` | The region name for URL discovery, defaults to the first region if multiple regions are available. 
| | `userDomainID` | no | `string` | The user domain id (**default**:"default") | +| `skipVerify` | no | `bool` | Skip SSL certificate validation (**default:** `false`) | +| `httpTimeout` | no | `float64` | The HTTP client timeout in seconds for all requests (**default:** `0`) | +| `requestBatchSize` | no | `integer` | The maximum number of concurrent requests for each metric class (**default:** `5`) | +| `queryServerMetrics` | no | `bool` | Whether to query server metrics (useful to disable for TripleO Undercloud) (**default:** `true`) | +| `queryHypervisorMetrics` | no | `bool` | Whether to query hypervisor metrics (useful to disable for TripleO Undercloud) (**default:** `true`) | +| `novaListServersSearchOpts` | no | `map of strings` | Optional search_opts mapping for collectd-openstack Nova client servers.list(search_opts=novaListServersSearchOpts). For more information see https://docs.openstack.org/api-ref/compute/#list-servers. | ## Metrics @@ -159,16 +181,16 @@ monitors` after configuring this monitor in a running agent instance. ### Legacy non-default metrics (version < 4.7.0) -**The following information only applies to agent version older than 4.7.0. If +**The following information only applies to agent versions prior to 4.7.0. If you have a newer agent and have set `enableBuiltInFiltering: true` at the top level of your agent config, see the section above.
See upgrade instructions in -[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).** +[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).** If you have a reference to the `whitelist.json` in your agent's top-level `metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just +that allow list, then you need to add an item to the top-level +`metricsToInclude` config option to override that allow list (see [Inclusion +filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just copy the whitelist.json, modify it, and reference that in `metricsToExclude`. ## Dimensions diff --git a/postgresql/SMART_AGENT_MONITOR.md b/postgresql/SMART_AGENT_MONITOR.md index 574fb06bb..f2d394570 100644 --- a/postgresql/SMART_AGENT_MONITOR.md +++ b/postgresql/SMART_AGENT_MONITOR.md @@ -12,8 +12,8 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitor `postgresql`. -Below is an overview of that monitor.** +This integration primarily consists of the Smart Agent monitor `postgresql`. +Below is an overview of that monitor. ### Smart Agent Monitor @@ -41,13 +41,23 @@ Here is a [sample configuration of Postgres to enable statement tracking](https: Tested with PostgreSQL `9.2+`. -If you want to collect additional metrics about PostgreSQL, use the [sql monitor](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/./sql.md). 
+If you want to collect additional metrics about PostgreSQL, use the [sql monitor](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/./sql.md). + +## Metrics about Replication + +Replication metrics may not be available on some PostgreSQL servers. For now, this monitor +automatically disables the `replication` metrics group if it detects Aurora, to avoid the following error: + +> Function pg_last_xlog_receive_location() is currently not supported for Aurora + +The metric `postgres_replication_state` is only reported for the `master` role and +`postgres_replication_lag` only for the `standby` role (replica). ## Example Configuration This example uses the [Vault remote config -source](https://github.com/signalfx/signalfx-agent/blob/master/docs/remote-config.md#nested-values-vault-only) +source](https://github.com/signalfx/signalfx-agent/blob/main/docs/remote-config.md#nested-values-vault-only) to connect to PostgreSQL using the `params` map that allows you to pull out the username and password individually from Vault and interpolate them into the `connectionString` config option. @@ -89,7 +99,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | @@ -101,6 +111,7 @@ Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monit | `params` | no | `map of strings` | Parameters to the connection string that can be templated into the connection string with the syntax `{{.key}}`. | | `databases` | no | `list of strings` | List of databases to send database-specific metrics about. If omitted, metrics about all databases will be sent.
This is an [overridable set](https://docs.splunk.com/observability/gdi/smart-agent/smart-agent-resources.html#filtering-data-using-the-smart-agent). (**default:** `[*]`) | | `databasePollIntervalSeconds` | no | `integer` | How frequently to poll for new/deleted databases in the DB server. Defaults to the same as `intervalSeconds` if not set. (**default:** `0`) | +| `logQueries` | no | `bool` | If true, queries will be logged at the info level. (**default:** `false`) | | `topQueryLimit` | no | `integer` | The number of top queries to consider when publishing query-related metrics (**default:** `10`) | @@ -113,10 +124,13 @@ Metrics that are categorized as These are the metrics available for this integration. - ***`postgres_block_hit_ratio`*** (*gauge*)
The proportion (between 0 and 1, inclusive) of block reads that used the cache and did not have to go to the disk. Is sent for `table`, `index`, and the `database` as a whole. + - `postgres_conflicts` (*cumulative*)
The number of conflicts. - ***`postgres_database_size`*** (*gauge*)
Size in bytes of the database on disk - ***`postgres_deadlocks`*** (*cumulative*)
Total number of deadlocks detected by the system - ***`postgres_index_scans`*** (*cumulative*)
Total number of index scans on the `table`. - ***`postgres_live_rows`*** (*gauge*)
Number of rows live (not deleted) in the `table`. + - `postgres_locks` (*gauge*)
The number of locks active. + - `postgres_pct_connections` (*gauge*)
The number of connections to this database as a fraction of the maximum number of allowed connections. - ***`postgres_query_count`*** (*cumulative*)
Total number of queries executed on the `database`, broken down by `user`. Note that the accuracy of this metric depends on the PostgreSQL [pg_stat_statements.max config option](https://www.postgresql.org/docs/9.3/pgstatstatements.html#AEN160631) being large enough to hold all queries. - ***`postgres_query_time`*** (*cumulative*)
Total time taken to execute queries on the `database`, broken down by `user`. Measured in ms unless otherwise indicated. @@ -124,9 +138,11 @@ These are the metrics available for this integration. - ***`postgres_rows_inserted`*** (*cumulative*)
Number of rows inserted into the `table`. - ***`postgres_rows_updated`*** (*cumulative*)
Number of rows updated in the `table`. - ***`postgres_sequential_scans`*** (*cumulative*)
Total number of sequential scans on the `table`. - - ***`postgres_sessions`*** (*gauge*)
Number of sessions currently on the server instance. The `state` dimension will specify which which type of session (see `state` row of [pg_stat_activity](https://www.postgresql.org/docs/9.2/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW)). + - ***`postgres_sessions`*** (*gauge*)
Number of sessions currently on the server instance. The `state` dimension will specify which type of session (see `state` row of [pg_stat_activity](https://www.postgresql.org/docs/9.2/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW)). - ***`postgres_table_size`*** (*gauge*)
The size in bytes of the `table` on disk. + - `postgres_xact_commits` (*cumulative*)
The number of transactions that have been committed in this database. + - `postgres_xact_rollbacks` (*cumulative*)
The number of transactions that have been rolled back in this database. #### Group queries All of the following metrics are part of the `queries` metric group. All of @@ -136,6 +152,13 @@ monitor config option `extraGroups`: - `postgres_queries_calls` (*cumulative*)
Top N most frequently executed queries broken down by `database` - `postgres_queries_total_time` (*cumulative*)
Top N queries based on the total execution time broken down by `database` +#### Group replication +All of the following metrics are part of the `replication` metric group. All of +the non-default metrics below can be turned on by adding `replication` to the +monitor config option `extraGroups`: + - `postgres_replication_lag` (*gauge*)
The current replication delay in seconds. Always = 0 on master. + - `postgres_replication_state` (*gauge*)
The current replication state. + ### Non-default metrics (version 4.7.0+) **The following information applies to the agent version 4.7.0+ that has @@ -151,16 +174,16 @@ monitors` after configuring this monitor in a running agent instance. ### Legacy non-default metrics (version < 4.7.0) -**The following information only applies to agent version older than 4.7.0. If +**The following information only applies to agent versions prior to 4.7.0. If you have a newer agent and have set `enableBuiltInFiltering: true` at the top level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).** +[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).** If you have a reference to the `whitelist.json` in your agent's top-level `metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just +that allow list, then you need to add an item to the top-level +`metricsToInclude` config option to override that allow list (see [Inclusion +filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just copy the whitelist.json, modify it, and reference that in `metricsToExclude`. ## Dimensions @@ -174,7 +197,10 @@ dimensions may be specific to certain metrics. | --- | --- | | `database` | The name of the database within a PostgreSQL server to which the metric pertains. 
| | `index` | For index metrics, the name of the index | +| `replication_role` | For "replication_lag" metric only, could be "master" or "standby". | | `schemaname` | The name of the schema within which the object being monitored resides (e.g. `public`). | +| `slot_name` | For "replication_state" metric only, the name of replication slot. | +| `slot_type` | For "replication_state" metric only, the type of replication. | | `table` | The name of the table to which the metric pertains. | | `tablespace` | For table metrics, the tablespace in which the table belongs, if not null. | | `type` | Whether the object (table, index, function, etc.) belongs to the `system` or `user`. | diff --git a/postgresql/metrics.yaml b/postgresql/metrics.yaml index ae79134f2..8f37ceaa5 100644 --- a/postgresql/metrics.yaml +++ b/postgresql/metrics.yaml @@ -11,6 +11,14 @@ postgres_block_hit_ratio: monitor: postgresql title: postgres_block_hit_ratio +postgres_conflicts: + brief: The number of conflicts + custom: true + description: The number of conflicts. + metric_type: cumulative + monitor: postgresql + title: postgres_conflicts + postgres_database_size: brief: Size in bytes of the database on disk custom: false @@ -43,6 +51,24 @@ postgres_live_rows: monitor: postgresql title: postgres_live_rows +postgres_locks: + brief: The number of locks active + custom: true + description: The number of locks active. + metric_type: gauge + monitor: postgresql + title: postgres_locks + +postgres_pct_connections: + brief: The number of connections to this database as a fraction of the maximum number + of allowed connections + custom: true + description: The number of connections to this database as a fraction of the maximum + number of allowed connections. 
+ metric_type: gauge + monitor: postgresql + title: postgres_pct_connections + postgres_queries_average_time: brief: Top N queries based on the average execution time broken down by `database` custom: true @@ -87,6 +113,22 @@ postgres_query_time: monitor: postgresql title: postgres_query_time +postgres_replication_lag: + brief: The current replication delay in seconds + custom: true + description: The current replication delay in seconds. Always = 0 on master. + metric_type: gauge + monitor: postgresql + title: postgres_replication_lag + +postgres_replication_state: + brief: The current replication state + custom: true + description: The current replication state. + metric_type: gauge + monitor: postgresql + title: postgres_replication_state + postgres_rows_deleted: brief: Number of rows deleted from the `table` custom: false @@ -123,7 +165,7 @@ postgres_sessions: brief: Number of sessions currently on the server instance custom: false description: Number of sessions currently on the server instance. The `state` dimension - will specify which which type of session (see `state` row of [pg_stat_activity](https://www.postgresql.org/docs/9.2/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW)). + will specify which type of session (see `state` row of [pg_stat_activity](https://www.postgresql.org/docs/9.2/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW)). metric_type: gauge monitor: postgresql title: postgres_sessions @@ -136,3 +178,19 @@ postgres_table_size: monitor: postgresql title: postgres_table_size +postgres_xact_commits: + brief: The number of transactions that have been committed in this database + custom: true + description: The number of transactions that have been committed in this database. 
+ metric_type: cumulative + monitor: postgresql + title: postgres_xact_commits + +postgres_xact_rollbacks: + brief: The number of transactions that have been rolled back in this database + custom: true + description: The number of transactions that have been rolled back in this database. + metric_type: cumulative + monitor: postgresql + title: postgres_xact_rollbacks + diff --git a/rabbitmq/SMART_AGENT_MONITOR.md b/rabbitmq/SMART_AGENT_MONITOR.md index df01f513f..aecad061d 100644 --- a/rabbitmq/SMART_AGENT_MONITOR.md +++ b/rabbitmq/SMART_AGENT_MONITOR.md @@ -12,8 +12,8 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitor `collectd/rabbitmq`. -Below is an overview of that monitor.** +This integration primarily consists of the Smart Agent monitor `collectd/rabbitmq`. +Below is an overview of that monitor. ### Smart Agent Monitor @@ -42,7 +42,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | @@ -60,6 +60,12 @@ Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monit | `verbosityLevel` | no | `string` | | | `username` | **yes** | `string` | | | `password` | **yes** | `string` | | +| `useHTTPS` | no | `bool` | Whether to enable HTTPS. (**default:** `false`) | +| `sslCACertFile` | no | `string` | Path to SSL/TLS certificates file of root Certificate Authorities implicitly trusted by this monitor. | +| `sslCertFile` | no | `string` | Path to this monitor's own SSL/TLS certificate. | +| `sslKeyFile` | no | `string` | Path to this monitor's private SSL/TLS key file. 
| +| `sslKeyPassphrase` | no | `string` | This monitor's private SSL/TLS key file password if any. | +| `sslVerify` | no | `bool` | Should the monitor verify the RabbitMQ Management plugin SSL/TLS certificate. (**default:** `false`) | ## Metrics @@ -219,15 +225,15 @@ monitors` after configuring this monitor in a running agent instance. ### Legacy non-default metrics (version < 4.7.0) -**The following information only applies to agent version older than 4.7.0. If +**The following information only applies to agent versions prior to 4.7.0. If you have a newer agent and have set `enableBuiltInFiltering: true` at the top level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).** +[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).** If you have a reference to the `whitelist.json` in your agent's top-level `metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just +that allow list, then you need to add an item to the top-level +`metricsToInclude` config option to override that allow list (see [Inclusion +filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
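As a rough illustration of the TLS options added to the `collectd/rabbitmq` config table in this change, a monitor entry might look like the sketch below. The `host`, `port`, credentials, and certificate path are placeholder assumptions and are not taken from the diff; only the option names `useHTTPS`, `sslCACertFile`, and `sslVerify` come from the table.

```yaml
monitors:
  - type: collectd/rabbitmq
    host: 127.0.0.1                       # assumed address of the RabbitMQ Management plugin
    port: 15671                           # assumed HTTPS management port
    username: guest                       # placeholder credentials
    password: guest
    useHTTPS: true                        # option added in this change (default: false)
    sslCACertFile: /etc/ssl/certs/ca.crt  # hypothetical path to the trusted CA bundle
    sslVerify: true                       # verify the management plugin's certificate
```

Since `sslVerify` defaults to `false`, setting it explicitly is probably what you want whenever `useHTTPS` is enabled against a server with a certificate you can validate.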
diff --git a/redis/SMART_AGENT_MONITOR.md b/redis/SMART_AGENT_MONITOR.md index 429d24170..9f99149c8 100644 --- a/redis/SMART_AGENT_MONITOR.md +++ b/redis/SMART_AGENT_MONITOR.md @@ -12,8 +12,8 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitor `collectd/redis`. -Below is an overview of that monitor.** +This integration primarily consists of the Smart Agent monitor `collectd/redis`. +Below is an overview of that monitor. ### Smart Agent Monitor @@ -26,12 +26,12 @@ You can capture any kind of Redis metrics like: * Memory used * Commands processed per second - * Number of connected clients and slaves + * Number of connected clients and followers * Number of blocked clients * Number of keys stored (per database) * Uptime * Changes since last save - * Replication delay (per slave) + * Replication delay (per follower) @@ -54,7 +54,7 @@ match something that is very big, as this command is not highly optimized and can block other commands from executing. Note: To avoid duplication reporting, this should only be reported in one node. -Keys can be defined in either the master or slave config. +Keys can be defined in either the leader or follower config. Sample YAML configuration with list lengths: @@ -91,7 +91,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | @@ -102,6 +102,7 @@ Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monit | `name` | no | `string` | The name for the node is a canonical identifier which is used as plugin instance. It is limited to 64 characters in length. 
(**default**: "{host}:{port}") | | `auth` | no | `string` | Password to use for authentication. | | `sendListLengths` | no | `list of objects (see below)` | Specify a pattern of keys to lists for which to send their length as a metric. See below for more details. | +| `verbose` | no | `bool` | If `true`, verbose logging from the plugin will be enabled. (**default:** `false`) | The **nested** `sendListLengths` config object has the following fields: @@ -120,6 +121,8 @@ Metrics that are categorized as These are the metrics available for this integration. + - `bytes.maxmemory` (*gauge*)
Maximum memory configured on Redis server + - `bytes.total_system_memory` (*gauge*)
Total memory available on the OS - ***`bytes.used_memory`*** (*gauge*)
Number of bytes allocated by Redis - `bytes.used_memory_lua` (*gauge*)
Number of bytes used by the Lua engine - `bytes.used_memory_peak` (*gauge*)
Peak number of bytes allocated by Redis @@ -142,18 +145,21 @@ These are the metrics available for this integration. - `gauge.changes_since_last_save` (*gauge*)
Number of changes since the last dump - `gauge.client_biggest_input_buf` (*gauge*)
Biggest input buffer among current client connections - `gauge.client_longest_output_list` (*gauge*)
Longest output list among current client connections - - ***`gauge.connected_clients`*** (*gauge*)
Number of client connections (excluding connections from slaves) - - `gauge.connected_slaves` (*gauge*)
Number of connected slaves + - ***`gauge.connected_clients`*** (*gauge*)
Number of client connections (excluding connections from followers) + - `gauge.connected_slaves` (*gauge*)
Number of connected followers - `gauge.db0_avg_ttl` (*gauge*)
The average time to live for all keys in redis - `gauge.db0_expires` (*gauge*)
The total number of keys in redis that will expire - `gauge.db0_keys` (*gauge*)
The total number of keys stored in redis - `gauge.instantaneous_ops_per_sec` (*gauge*)
Number of commands processed per second - `gauge.key_llen` (*gauge*)
Length of a list key - `gauge.latest_fork_usec` (*gauge*)
Duration of the latest fork operation in microseconds - - `gauge.master_last_io_seconds_ago` (*gauge*)
Number of seconds since the last interaction with master + - `gauge.master_last_io_seconds_ago` (*gauge*)
Number of seconds since the last interaction with leader + - `gauge.master_link_down_since_seconds` (*gauge*)
Number of seconds since the link is down + - `gauge.master_link_status` (*gauge*)
Status of the link (up/down) - ***`gauge.master_repl_offset`*** (*gauge*)
Master replication offset - `gauge.mem_fragmentation_ratio` (*gauge*)
Ratio between used_memory_rss and used_memory - `gauge.rdb_bgsave_in_progress` (*gauge*)
Flag indicating an RDB save is ongoing + - `gauge.rdb_last_save_time` (*gauge*)
Unix timestamp for last save to disk, when using persistence - `gauge.repl_backlog_first_byte_offset` (*gauge*)
Slave replication backlog offset - ***`gauge.slave_repl_offset`*** (*gauge*)
Slave replication offset - `gauge.uptime_in_days` (*gauge*)
Number of days up @@ -174,16 +180,16 @@ monitors` after configuring this monitor in a running agent instance. ### Legacy non-default metrics (version < 4.7.0) -**The following information only applies to agent version older than 4.7.0. If +**The following information only applies to agent versions prior to 4.7.0. If you have a newer agent and have set `enableBuiltInFiltering: true` at the top level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).** +[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).** If you have a reference to the `whitelist.json` in your agent's top-level `metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just +that allow list, then you need to add an item to the top-level +`metricsToInclude` config option to override that allow list (see [Inclusion +filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just copy the whitelist.json, modify it, and reference that in `metricsToExclude`. ## Dimensions diff --git a/redis/metrics.yaml b/redis/metrics.yaml index 38ffd944f..7919aec07 100644 --- a/redis/metrics.yaml +++ b/redis/metrics.yaml @@ -1,5 +1,21 @@ # This file was generated in the Smart Agent repo and copied here, DO NOT EDIT HERE. 
+bytes.maxmemory: + brief: Maximum memory configured on Redis server + custom: true + description: Maximum memory configured on Redis server + metric_type: gauge + monitor: collectd/redis + title: bytes.maxmemory + +bytes.total_system_memory: + brief: Total memory available on the OS + custom: true + description: Total memory available on the OS + metric_type: gauge + monitor: collectd/redis + title: bytes.total_system_memory + bytes.used_memory: brief: Number of bytes allocated by Redis custom: false @@ -177,17 +193,17 @@ gauge.client_longest_output_list: title: gauge.client_longest_output_list gauge.connected_clients: - brief: Number of client connections (excluding connections from slaves) + brief: Number of client connections (excluding connections from followers) custom: false - description: Number of client connections (excluding connections from slaves) + description: Number of client connections (excluding connections from followers) metric_type: gauge monitor: collectd/redis title: gauge.connected_clients gauge.connected_slaves: - brief: Number of connected slaves + brief: Number of connected followers custom: true - description: Number of connected slaves + description: Number of connected followers metric_type: gauge monitor: collectd/redis title: gauge.connected_slaves @@ -241,13 +257,29 @@ gauge.latest_fork_usec: title: gauge.latest_fork_usec gauge.master_last_io_seconds_ago: - brief: Number of seconds since the last interaction with master + brief: Number of seconds since the last interaction with leader custom: true - description: Number of seconds since the last interaction with master + description: Number of seconds since the last interaction with leader metric_type: gauge monitor: collectd/redis title: gauge.master_last_io_seconds_ago +gauge.master_link_down_since_seconds: + brief: Number of seconds since the link is down + custom: true + description: Number of seconds since the link is down + metric_type: gauge + monitor: collectd/redis + title: 
gauge.master_link_down_since_seconds + +gauge.master_link_status: + brief: Status of the link (up/down) + custom: true + description: Status of the link (up/down) + metric_type: gauge + monitor: collectd/redis + title: gauge.master_link_status + gauge.master_repl_offset: brief: Master replication offset custom: false @@ -272,6 +304,14 @@ gauge.rdb_bgsave_in_progress: monitor: collectd/redis title: gauge.rdb_bgsave_in_progress +gauge.rdb_last_save_time: + brief: Unix timestamp for last save to disk, when using persistence + custom: true + description: Unix timestamp for last save to disk, when using persistence + metric_type: gauge + monitor: collectd/redis + title: gauge.rdb_last_save_time + gauge.repl_backlog_first_byte_offset: brief: Slave replication backlog offset custom: true diff --git a/requirements.txt b/requirements.txt index a341315b6..e13f66343 100644 --- a/requirements.txt +++ b/requirements.txt @@ -1,3 +1,3 @@ MarkupSafe==2.0.1 Jinja2==2.11.3 -PyYAML==6.0.1 +PyYAML==5.4 diff --git a/signalfx-agent/README.md b/signalfx-agent/README.md index 583ed542f..39b89864f 100644 --- a/signalfx-agent/README.md +++ b/signalfx-agent/README.md @@ -6,117 +6,203 @@ ## +SignalFx Smart Agent is deprecated. For details, see the [Deprecation Notice](https://docs.signalfx.com/en/latest/integrations/agent/./smartagent-deprecation-notice.html). -The SignalFx Smart Agent is a metric-based agent written in Go that is used to monitor infrastructure and application services from a variety of environments. +SignalFx Smart Agent Integration installs the Smart Agent application on a single host machine from which you want to collect monitoring data. Smart Agent collects infrastructure monitoring, µAPM, and Kubernetes data. For other installation options, including bulk deployments to production, see [Install and Configure the Smart Agent](https://docs.splunk.com/observability/gdi/smart-agent/smart-agent-resources.html#install-the-smart-agent). 
## Installation -### Review pre-installation requirements for the Smart Agent +### Prerequisites -Before you download and install the Smart Agent on a **single** host, review the requirements below. +#### General +- Ensure that you've installed the applications and services you want to monitor on a Linux or Windows host. SignalFx doesn't support installing the Smart Agent on macOS or any other OS besides Linux and Windows. +- Uninstall or disable any previously-installed collector agents from your host, such as `collectd`. +- If you have any questions about compatibility between the Smart Agent and your host machine or its applications and services, contact your Splunk support representative. -(For other installation options, including bulk deployments, see [Advanced Installation Options](https://docs.signalfx.com/en/latest/integrations/agent/./advanced-install-options.html).) - -Please note that the Smart Agent does not support Mac OS. +#### Linux +- Ensure that you have access to `terminal` or a similar command line interface application. +- Ensure that your Linux username has permission to run the following commands: + - `curl` + - `sudo` +- Ensure that your machine is running Linux kernel version 3.2 or higher. -**General requirements** -- You must have access to your command line interface. -- You must uninstall or disable any previously installed collector agent from your host, such as collectd. +#### Windows +- Ensure that you have access to Windows PowerShell 6. +- Ensure that your machine is running Windows 8 or higher. +- Ensure that .Net Framework 3.5 or higher is installed. 
+- While SignalFx recommends that you use TLS 1.2, if you use TLS 1.0 and want to continue using TLS 1.0, then: + - Ensure that you support the following ciphers: + - TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA (secp256r1) - A + - TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA (secp256r1) - A + - TLS_RSA_WITH_AES_256_CBC_SHA (rsa 2048) - A + - TLS_RSA_WITH_AES_128_CBC_SHA (rsa 2048) - A + - TLS_RSA_WITH_3DES_EDE_CBC_SHA (rsa 2048) - C + - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (secp256r1) - A + - TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 (secp256r1) - A + - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (secp256r1) - A + - TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384 (secp256r1) - A + - TLS_RSA_WITH_AES_128_GCM_SHA256 (rsa 2048) - A + - TLS_RSA_WITH_AES_256_GCM_SHA384 (rsa 2048) - A + - TLS_RSA_WITH_AES_128_CBC_SHA256 (rsa 2048) - A + - See [Solving the TLS 1.0 Problem, 2nd Edition](https://docs.microsoft.com/en-us/security/engineering/solving-tls1-problem) for more information. -**Linux requirements** -- You must run kernel version 2.6 or higher for your Linux distribution. +### Steps -**Windows requirements** -- You must run .Net Framework 3.5 on Windows 8 or higher. -- You must run Visual C++ Compiler for Python 2.7. +#### Access the SignalFx UI +This content appears in both the documentation site and in the SignalFx UI. -### Step 1. Install the SignalFx Smart Agent on your host +If you are reading this content from the documentation site, please access the SignalFx UI so that you can paste pre-populated commands. -#### Linux +To access this content from the SignalFx UI: +1. In the SignalFx UI, in the top menu, click **Integrations**. +2. Locate and select **SignalFx SmartAgent**. +3. Click **Setup**, and continue reading the instructions. -Note: This content appears on a SignalFx documentation page and on the **Setup** tab of the Smart Agent tile in the SignalFx UI. The following code to install the current version works only if you are viewing these instructions on the **Setup** tab. 
+#### Install the Smart Agent on Linux -From the **Setup** tab, copy and paste the following code into your command line: +This section lists the steps for installing the Smart Agent on Linux. If you want to install the Smart Agent on Windows, proceed to the next section, **Install SignalFx Smart Agent on Windows**. +Copy and paste the following code into your command line or terminal: ```sh -curl -sSL https://dl.signalfx.com/signalfx-agent.sh > /tmp/signalfx-agent.sh +curl -sSL https://dl.signalfx.com/signalfx-agent.sh > /tmp/signalfx-agent.sh; sudo sh /tmp/signalfx-agent.sh --realm YOUR_SIGNALFX_REALM -- YOUR_SIGNALFX_API_TOKEN ``` +When this command finishes, it displays the following: +``` +The SignalFx Agent has been successfully installed. -#### Windows +Make sure that your system's time is relatively accurate or else datapoints may not be accepted. -Note: This content appears on a SignalFx documentation page and on the **Setup** tab of the Smart Agent tile in the SignalFx UI. The following code to install the current version works only if you are viewing these instructions on the **Setup** tab. +The agent's main configuration file is located at /etc/signalfx/agent.yaml. +``` + +If your installation succeeds, proceed to the section **Verify Your Installation**. Otherwise, see the section **Troubleshoot Your Installation**. -From the **Setup** tab, copy and paste the following code into your command line: +#### Install the Smart Agent on Windows +Copy and paste the following code into your Windows PowerShell terminal: ```sh -& {Set-ExecutionPolicy Bypass -Scope Process -Force; $script = ((New-Object System.Net.WebClient).DownloadString('https://dl.signalfx.com/signalfx-agent.ps1')); $params = @{access_token = "YOUR_SIGNALFX_API_TOKEN"; ingest_url = "https://ingest.YOUR_SIGNALFX_REALM.signalfx.com"; api_url = "https://api.YOUR_SIGNALFX_REALM.signalfx.com"}; Invoke-Command -ScriptBlock ([scriptblock]::Create(". 
{$script} $(&{$args} @params)"))} +& {Set-ExecutionPolicy Bypass -Scope Process -Force; $script = ((New-Object System.Net.WebClient).DownloadString('https://dl.signalfx.com/signalfx-agent.ps1')); $params = @{access_token = "YOUR_SIGNALFX_API_TOKEN"; ingest_url = "https://ingest.YOUR_SIGNALFX_REALM.signalfx.com"; api_url = "https://api.YOUR_SIGNALFX_REALM.signalfx.com"}; Invoke-Command -ScriptBlock ([scriptblock]::Create(". {$script} $(&{$args} @params)"))} ``` -The agent will be installed as a Windows service and will log to the Windows Event Log. +The agent files are installed to `\Program Files\SignalFx\SignalFxAgent`, and the default configuration file is installed at `\ProgramData\SignalFxAgent\agent.yaml` if it does not already exist. +The install script starts the agent as a Windows service that writes messages to the Windows Event Log. -### Step 2. Confirm your Installation +If your installation succeeds, proceed to the section **Verify Your Installation**. Otherwise, see the section **Troubleshoot Your Installation**. +### Verify Your Installation -1. To confirm your installation, enter the following command on the Linux or Windows command line: +1. To verify that you've successfully installed the Smart Agent, copy and paste the following command into your terminal. - ```sh - sudo signalfx-agent status - ``` +**For Linux:** - The return should be similar to the following example: +```sh +sudo signalfx-agent status +``` - ```sh - SignalFx Agent version: 4.7.6 - Agent uptime: 8m44s - Observers active: host - Active Monitors: 16 - Configured Monitors: 33 - Discovered Endpoint Count: 6 - Bad Monitor Config: None - Global Dimensions: {host: my-host-1} - Datapoints sent (last minute): 1614 - Events Sent (last minute): 0 - Trace Spans Sent (last minute): 0 - ``` +**For Windows:** -2.
To confirm your installation, enter the following command on the Linux or Windows command line: +```sh +& "\Program Files\SignalFx\SignalFxAgent\bin\signalfx-agent.exe" status +``` - | Command | Description | - |---|---| - | signalfx-agent status config | This command shows resolved config in use by the Smart Agent. | - | signalfx-agent status endpoints | This command shows discovered endpoints. | - | signalfx-agent status monitors | This command shows active monitors. | - | signalfx-agent status all | This command shows all of the above statuses. | +The command displays output that is similar to the following: +```sh +SignalFx Agent version: 5.1.0 +Agent uptime: 8m44s +Observers active: host +Active Monitors: 16 +Configured Monitors: 33 +Discovered Endpoint Count: 6 +Bad Monitor Config: None +Global Dimensions: {host: my-host-1} +Datapoints sent (last minute): 1614 +Events Sent (last minute): 0 +Trace Spans Sent (last minute): 0 +``` + +2. To perform additional verification, you can run any of the following commands: -### Troubleshoot the Smart Agent installation +- Display the current Smart Agent configuration. -If you are unable to install the Smart Agent, consider reviewing your error logs: +```sh +sudo signalfx-agent status config +``` -For Linux, use the following command to view error logs via Journal: +- Show endpoints discovered by the Smart Agent. ```sh -journalctl -u signalfx-agent | tail -100 +sudo signalfx-agent status endpoints ``` -For Windows, review the event logs. +- Show the Smart Agent's active monitors. These plugins poll apps and services to retrieve data. + +```sh +sudo signalfx-agent status monitors +``` + +### Troubleshoot Smart Agent Installation +If the Smart Agent installation fails, use the following procedures to gather troubleshooting information. + +#### General troubleshooting +To learn how to review signalfx-agent logs, see [Frequently Asked Questions](https://docs.signalfx.com/en/latest/integrations/agent/./faq.html).
+ +#### Linux troubleshooting -For additional installation troubleshooting information, including how to review logs, see [Frequently Asked Questions](https://docs.signalfx.com/en/latest/integrations/agent/./faq.html). +To view recent error logs, run the following command in terminal or a similar application: -### Review additional documentation +- For sysv/upstart hosts, run: -After a successful installation, learn more about the SignalFx agent and the SignalFx UI. +```sh +tail -f /var/log/signalfx-agent.log +``` + +- For systemd hosts, run: + +```sh +sudo journalctl -u signalfx-agent -f +``` + +#### Windows troubleshooting +Open **Administrative Tools > Event Viewer > Windows Logs > Application** to view the `signalfx-agent` error logs. + +### Uninstall the Smart Agent + +#### Debian + +To uninstall the Smart Agent on Debian-based distributions, run the following +command: + +```sh +sudo dpkg --remove signalfx-agent +``` + +**Note:** Configuration files may persist in `/etc/signalfx`. + +#### RPM + +To uninstall the Smart Agent on RPM-based distributions, run the following +command: + +```sh +sudo rpm -e signalfx-agent +``` + +**Note:** Configuration files may persist in `/etc/signalfx`. + +#### Windows -* Review the capabilities of the SignalFx Smart Agent. See [Advanced Installation Options](https://docs.signalfx.com/en/latest/integrations/agent/./advanced-install-options.html). +The Smart Agent can be uninstalled from `Programs and Features` in the Windows +Control Panel. -* Learn how data is displayed in the SignalFx UI. See [View infrastructure status](https://docs.signalfx.com/en/latest/getting-started/quick-start.html#step-3-view-infrastructure-status). +**Note:** Configuration files may persist in `\ProgramData\SignalFxAgent`. 
diff --git a/signalfx-agent/agent_docs/auto-discovery.md b/signalfx-agent/agent_docs/auto-discovery.md index 0426d923c..bd688be0f 100644 --- a/signalfx-agent/agent_docs/auto-discovery.md +++ b/signalfx-agent/agent_docs/auto-discovery.md @@ -1,13 +1,15 @@ -# Endpoint Discovery +# Service Discovery -The observers are responsible for discovering service endpoints. For these -service endpoints to result in a new monitor instance that is watching that -endpoint, you must apply _discovery rules_ to your monitor configuration. Every -monitor that supports monitoring specific services (i.e. not a static monitor -like the `collectd/cpu` monitor) can be configured with a `discoveryRule` -config option that specifies a rule using a mini rule language. +The observers are responsible for discovering monitoring _targets_. Examples of +_targets_ are an open TCP port on a container, or a Kubernetes Node, or a +listening TCP socket on localhost. For these discovered targets to result in a +new monitor instance that is monitoring that target, you must apply _discovery +rules_ to your monitor configuration. Every monitor that supports monitoring +specific services (i.e. not a static monitor like the `cpu` monitor) can be +configured with a `discoveryRule` config option that specifies a rule using a +mini rule language. For example, to monitor a Redis instance that has been discovered by a container-based observer, you could use the following configuration: @@ -18,10 +20,33 @@ container-based observer, you could use the following configuration: discoveryRule: container_image =~ "redis" && port == 6379 ``` +## Target Types + +There are currently several target types that can be discovered by the various +observers in the agent. You can match these target types explicitly in your +discovery rules with the expression `target == `, where `` are: + + - `pod`: A Kubernetes pod as a whole. The `host` field will be populated with + the Pod IP address, but no specific `port`. 
+ + - `hostport`: A host and port combination, i.e. a network endpoint. This + endpoint could be on any type of runtime, e.g. container, remote host, + running on same host with no container. This type of target will always + have `host`, `port`, and `port_type` fields set. + + - `container`: A container, e.g. a Docker container. The `host` field will be + populated with the container IP address, but no `port` will be specified. + + - `k8s-node`: A Kubernetes Node -- the `host` field will be populated with the + Node's Internal DNS name or IP address. + +You don't have to specify the `target` in your discovery rules, but it can help +to prevent ambiguity. + ## Rule DSL -A rule is an expression that is matched against each discovered endpoint to -determine if a monitor should be active for a particular endpoint. The basic +A rule is an expression that is matched against each discovered target to +determine if a monitor should be active for a particular target. The basic operators are: | Operator | Description | @@ -35,59 +60,53 @@ operators are: | =~ | Regex matches | | !~ | Regex does not match | | && | And | -| \|\| | Or | +| \|\| | Or | For all available operators, see the govaluate documentation. +href="https://github.com/antonmedv/expr/blob/v1.8.5/docs/Language-Definition.md">the +expr language definition (this is what the agent uses under the covers). +We have a shim set of logic that lets you use the `=~` operator even though it +is not actually part of the expr language -- mainly to preserve backwards +compatibility with older agent releases before expr was used. The variables available in the expression are dependent on which observer you -are using. The following three variables are common to all observers: +are using and the type of target(s) it is producing. See the individual +observer docs for details on the variables available. 
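To make the target types and operators above concrete, here is a minimal sketch of a rule that pins the target type explicitly. It reuses the Redis rule shown earlier in this file; writing the target type as a quoted string (`"hostport"`) is an assumption about the rule syntax, not something the diff states.

```yaml
monitors:
  - type: collectd/redis
    # Match only host/port targets whose container image looks like Redis.
    # `target == "hostport"` narrows the rule to network endpoints, so the same
    # container is not also matched as a `container` or `pod` target.
    discoveryRule: target == "hostport" && container_image =~ "redis" && port == 6379
```

Leaving out the `target` clause should still work, as the text notes, but stating it avoids ambiguity when an observer emits several target types for the same workload.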
- - `host` (string): The hostname or IP address of the discovered endpoint - - `port` (integer): The port number of the discovered endpoint - - `port_type` (`UDP` or `TCP`): Whether the port is TCP or UDP - -For a list of observers and the discovery rule variables they provide, see [Observers](./observer-config.md). +For a list of observers and the discovery rule variables they provide, see +[Observers](./observer-config.md). ### Additional Functions In addition, these extra functions are provided: - - `Get(map, key)` - returns the value from map if the given key is found, otherwise nil - - ```yaml - discoveryRule: Get(container_labels, "mapKey") == "mapValue" - ``` - - `Get` accepts an optional third argument that will act as a default value in - the case that the key is not present in the input map. - - - `Contains(map, key)` - returns true if key is inside map, otherwise false - - ```yaml - discoveryRule: Contains(container_labels, "mapKey") - ``` + - `Sprintf(format, args...)`: This is your typical printf style function that + can be used to compose strings from a complex set of disparately typed + variables. Underneath this is the Golang [fmt.Sprintf + function](https://golang.org/pkg/fmt/#example_Sprintf). + - `Getenv(envvar)`: Gets an environment variable set on the agent process, or + a blank string if the specified envvar is not set. There are no implicit rules built into the agent, so each rule must be specified manually in the config file, in conjunction with the monitor that should monitor the -discovered service. +discovered target. ## Endpoint Config Mapping -Sometimes it might be useful to use certain attributes of a discovered -endpoint (see [Endpoint Discovery](#endpoint-discovery)). These discovered -endpoints are created by [observers](./observer-config.md) and will usually -contain a full set of metadata that the observer obtains coincidently when it -is doing discovery (e.g. container labels). 
This metadata can be mapped -directly to monitor configuration for the monitor that is instantiated for that -endpoint. +Sometimes it might be useful to use certain attributes of a discovered target. +These discovered targets are created by [observers](./observer-config.md) and +will usually contain a full set of metadata that the observer obtains +coincidentally when it is doing discovery (e.g. container labels). This metadata +can be mapped directly to monitor configuration for the monitor that is +instantiated for that target. To do this, you can set the [configEndpointMappings option](./monitor-config.md) -on a monitor config block. For example, the `collectd/kafka` monitor has -the `clusterName` config option, which is an arbirary value used to group -together broker instances. You could derive this from the `cluster` container -label on the kafka container instances like this: +on a monitor config block (_endpoint_ was the old name for _target_). For +example, the `collectd/kafka` monitor has the `clusterName` config option, +which is an arbitrary value used to group together broker instances. You could +derive this from the `cluster` container label on the kafka container instances +like this: ```yaml monitors: @@ -109,18 +128,18 @@ While service discovery is useful, sometimes it is just easier to manually define services to monitor. This can be done by setting the `host` and `port` option in a monitor's config to the host and port that you need to monitor. These two values are the core of what the auto-discovery mechanism -provides for you automatically. +often provides for you automatically. For example (making use of YAML references to reduce repetition): ```yaml - &es - type: collectd/elasticsearch + type: elasticsearch username: admin password: s3cr3t host: es port: 9200 - + - <<: *es host: es2 port: 9300 @@ -132,5 +151,5 @@ connect to both of them. 
If you needed different configuration for the two ES hosts, you could simply define two monitor configurations, each with one service endpoint. -It is invalid to have both manually defined service endpoints and a discovery rule -on a single monitor configuration. +**It is invalid to have both manually defined service endpoints and a discovery rule +on a single monitor configuration.** diff --git a/signalfx-agent/agent_docs/config-schema.md b/signalfx-agent/agent_docs/config-schema.md index ebf819d70..dbbdfe7b7 100644 --- a/signalfx-agent/agent_docs/config-schema.md +++ b/signalfx-agent/agent_docs/config-schema.md @@ -49,18 +49,19 @@ if not set. | `useFullyQualifiedHost` | no | bool | If true (the default), and the `hostname` option is not set, the hostname will be determined by doing a reverse DNS query on the IP address that is returned by querying for the bare hostname. This is useful in cases where the hostname reported by the kernel is a short name. (**default**: `true`) | | `disableHostDimensions` | no | bool | Our standard agent model is to collect metrics for services running on the same host as the agent. Therefore, host-specific dimensions (e.g. `host`, `AWSUniqueId`, etc) are automatically added to every datapoint that is emitted from the agent by default. Set this to true if you are using the agent primarily to monitor things on other hosts. You can set this option at the monitor level as well. (**default:** `false`) | | `intervalSeconds` | no | integer | How often to send metrics to SignalFx. Monitors can override this individually. (**default:** `10`) | +| `cloudMetadataTimeout` | no | int64 | This flag sets the HTTP timeout duration for metadata queries from AWS, Azure and GCP. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `"2s"`) | | `globalDimensions` | no | map of strings | Dimensions (key:value pairs) that will be added to every datapoint emitted by the agent. 
To specify that all metrics should be high-resolution, add the dimension `sf_hires: 1` | -| `sendMachineID` | no | bool | Whether to send the machine-id dimension on all host-specific datapoints generated by the agent. This dimension is derived from the Linux machine-id value. (**default:** `false`) | +| `globalSpanTags` | no | map of strings | Tags (key:value pairs) that will be added to every span emitted by the agent. | | `cluster` | no | string | The logical environment/cluster that this agent instance is running in. All of the services that this instance monitors should be in the same environment as well. This value, if provided, will be synced as a property onto the `host` dimension, or onto any cloud-provided specific dimensions (`AWSUniqueId`, `gcp_id`, and `azure_resource_id`) when available. Example values: "prod-usa", "dev" | | `syncClusterOnHostDimension` | no | bool | If true, force syncing of the `cluster` property on the `host` dimension, even when cloud-specific dimensions are present. (**default:** `false`) | -| `validateDiscoveryRules` | no | bool | If true, a warning will be emitted if a discovery rule contains variables that will never possibly match a rule. If using multiple observers, it is convenient to set this to false to suppress spurious errors. (**default:** `true`) | +| `validateDiscoveryRules` | no | bool | If true, a warning will be emitted if a discovery rule contains variables that will never possibly match a rule. If using multiple observers, it is convenient to set this to false to suppress spurious errors. 
(**default:** `false`) | | `observers` | no | [list of objects (see below)](#observers) | A list of observers to use (see observer config) | | `monitors` | no | [list of objects (see below)](#monitors) | A list of monitors to use (see monitor config) | | `writer` | no | [object (see below)](#writer) | Configuration of the datapoint/event writer | | `logging` | no | [object (see below)](#logging) | Log configuration | | `collectd` | no | [object (see below)](#collectd) | Configuration of the managed collectd subprocess | -| `enableBuiltInFiltering` | no | bool | If true, the agent will filter out [custom metrics](https://docs.signalfx.com/en/latest/admin-guide/usage.html#about-custom-bundled-and-high-resolution-metrics) without having to rely on the `whitelist.json` filter that was previously configured under `metricsToExclude`. Whether a metric is custom or not is documented in each monitor's documentation. If `true`, every monitor's default configuration (i.e. the minimum amount of configuration to make it work) will only send non-custom metrics. In order to send out custom metrics from a monitor, certain config flags on the monitor must be set _or_ you can specify the metric in the `extraMetrics` config option on each monitor if you know the specific metric name. You would not have to modify the whitelist via `metricsToInclude` as before. If you set this option to `true`, the `whitelist.json` entry under `metricToExclude` should be removed, if it is present -- otherwise custom metrics won't be emitted. (**default:** `false`) | -| `metricsToInclude` | no | [list of objects (see below)](#metricstoinclude) | A list of metric filters that will whitelist/include metrics. These filters take priority over the filters specified in `metricsToExclude`. | +| `enableBuiltInFiltering` | no | bool | This must be unset or explicitly set to true. 
In prior versions of the agent, there was a filtering mechanism that relied heavily on an external whitelist.json file to determine which metrics were sent by default. This is all inherent to the agent now and the old style of filtering is no longer available. (**default:** `true`) | +| `metricsToInclude` | no | [list of objects (see below)](#metricstoinclude) | A list of metric filters that will include metrics. These filters take priority over the filters specified in `metricsToExclude`. | | `metricsToExclude` | no | [list of objects (see below)](#metricstoexclude) | A list of metric filters | | `propertiesToExclude` | no | [list of objects (see below)](#propertiestoexclude) | A list of properties filters | | `internalStatusHost` | no | string | The host on which the internal status server will listen. The internal status HTTP server serves internal metrics and diagnostic information about the agent and can be scraped by the `internal-metrics` monitor. Can be set to `0.0.0.0` if you want to monitor the agent from another host. If you set this to blank/null, the internal status server will not be started. See `internalStatusPort`. (**default:** `"localhost"`) | @@ -108,31 +109,15 @@ The following are generic options that apply to all monitors. Each monitor type | `configEndpointMappings` | no | map of strings | A set of mappings from a configuration option on this monitor to attributes of a discovered endpoint. The keys are the config option on this monitor and the value can be any valid expression used in discovery rules. | | `intervalSeconds` | no | integer | The interval (in seconds) at which to emit datapoints from the monitor(s) created by this configuration. If not set (or set to 0), the global agent intervalSeconds config option will be used instead. (**default:** `0`) | | `solo` | no | bool | If one or more configurations have this set to true, only those configurations will be considered. This setting can be useful for testing. 
(**default:** `false`) | -| `metricsToExclude` | no | [list of objects (see below)](#metricstoexclude) | DEPRECATED in favor of the `datapointsToExclude` option. That option handles negation of filter items differently. | | `datapointsToExclude` | no | [list of objects (see below)](#datapointstoexclude) | A list of datapoint filters. These filters allow you to comprehensively define which datapoints to exclude by metric name or dimension set, as well as the ability to define overrides to re-include metrics excluded by previous patterns within the same filter item. See [monitor filtering](./filtering.html#additional-monitor-level-filtering) for examples and more information. | | `disableHostDimensions` | no | bool | Some monitors pull metrics from services not running on the same host and should not get the host-specific dimensions set on them (e.g. `host`, `AWSUniqueId`, etc). Setting this to `true` causes those dimensions to be omitted. You can disable this globally with the `disableHostDimensions` option on the top level of the config. (**default:** `false`) | | `disableEndpointDimensions` | no | bool | This can be set to true if you don't want to include the dimensions that are specific to the endpoint that was discovered by an observer. This is useful when you have an endpoint whose identity is not particularly important since it acts largely as a proxy or adapter for other metrics. (**default:** `false`) | +| `metricNameTransformations` | no | map | A map from _original_ metric name to a replacement value. The keys are interpreted as regular expressions and the values can contain backreferences. This means that you should escape any RE characters in the original metric name with `\` (the most common escape necessary will be `\.` as period is interpreted as "any character" if unescaped). Keys use the [Go regexp language](https://github.com/google/re2/wiki/Syntax), and backreferences are of the form `$1`. 
If there are multiple entries in list of maps, they will each be run in sequence, using the transformation from the previous entry as the input the subsequent transformation. To add a common prefix to all metrics coming out of a monitor, use a mapping like this: `(.*): myprefix.$1` | | `dimensionTransformations` | no | map of strings | A map from dimension names emitted by the monitor to the desired dimension name that will be emitted in the datapoint that goes to SignalFx. This can be useful if you have custom metrics from your applications and want to make the dimensions from a monitor match those. Also can be useful when scraping free-form metrics, say with the `prometheus-exporter` monitor. Right now, only static key/value transformations are supported. Note that filtering by dimensions will be done on the *original* dimension name and not the new name. Note that it is possible to remove unwanted dimensions via this configuration, by making the desired dimension name an empty string. | | `extraMetrics` | no | list of strings | Extra metrics to enable besides the default included ones. This is an [overridable filter](https://docs.splunk.com/observability/gdi/smart-agent/smart-agent-resources.html#filtering-data-using-the-smart-agent). | | `extraGroups` | no | list of strings | Extra metric groups to enable in addition to the metrics that are emitted by default. A metric group is simply a collection of metrics, and they are defined in each monitor's documentation. | -## metricsToExclude -The **nested** `metricsToExclude` config object has the following fields: - -For more information on filtering see [Datapoint Filtering](./filtering.md). - - -| Config option | Required | Type | Description | -| --- | --- | --- | --- | -| `dimensions` | no | map of any | A map of dimension key/values to match against. All key/values must match a datapoint for it to be matched. The map values can be either a single string or a list of strings. 
| -| `metricNames` | no | list of strings | A list of metric names to match against | -| `metricName` | no | string | A single metric name to match against | -| `monitorType` | no | string | (**Only applicable for the top level filters**) Limits this scope of the filter to datapoints from a specific monitor. If specified, any datapoints not from this monitor type will never match against this filter. | -| `negated` | no | bool | (**Only applicable for the top level filters**) Negates the result of the match so that it matches all datapoints that do NOT match the metric name and dimension values given. This does not negate monitorType, if given. (**default:** `false`) | - - - ## datapointsToExclude The **nested** `datapointsToExclude` config object has the following fields: @@ -162,6 +147,7 @@ The **nested** `writer` config object has the following fields: | `traceExportFormat` | no | string | Format to export traces in. Choices are "zipkin" and "sapm" (**default:** `"zipkin"`) | | `datapointMaxRequests` | no | integer | Deprecated: use `maxRequests` instead. (**default:** `0`) | | `maxRequests` | no | integer | The maximum number of concurrent requests to make to a single ingest server with datapoints/events/trace spans. This number multiplied by `datapointMaxBatchSize` is more or less the maximum number of datapoints that can be "in-flight" at any given time. Same thing for the `traceSpanMaxBatchSize` option and trace spans. (**default:** `10`) | +| `timeout` | no | int64 | Timeout specifies a time limit for requests made to the ingest server. The timeout includes connection time, any redirects, and reading the response body. Default is 5 seconds, a Timeout of zero means no timeout. (**default:** `"5s"`) | | `eventSendIntervalSeconds` | no | integer | The agent does not send events immediately upon a monitor generating them, but buffers them and sends them in batches. The lower this number, the less delay for events to appear in SignalFx. 
(**default:** `1`) | | `propertiesMaxRequests` | no | unsigned integer | The analogue of `maxRequests` for dimension property requests. (**default:** `20`) | | `propertiesMaxBuffered` | no | unsigned integer | How many dimension property updates to hold pending being sent before dropping subsequent property updates. Property updates will be resent eventually and they are slow to change so dropping them (esp on agent start up) usually isn't a big deal. (**default:** `10000`) | @@ -170,13 +156,42 @@ The **nested** `writer` config object has the following fields: | `logDatapoints` | no | bool | If the log level is set to `debug` and this is true, all datapoints generated by the agent will be logged. (**default:** `false`) | | `logEvents` | no | bool | The analogue of `logDatapoints` for events. (**default:** `false`) | | `logTraceSpans` | no | bool | The analogue of `logDatapoints` for trace spans. (**default:** `false`) | +| `logTraceSpansFailedToShip` | no | bool | If `true`, traces and spans which weren't successfully received by the backend, will be logged as json (**default:** `false`) | | `logDimensionUpdates` | no | bool | If `true`, dimension updates will be logged at the INFO level. (**default:** `false`) | | `logDroppedDatapoints` | no | bool | If true, and the log level is `debug`, filtered out datapoints will be logged. (**default:** `false`) | | `addGlobalDimensionsAsSpanTags` | no | bool | If true, the dimensions specified in the top-level `globalDimensions` configuration will be added to the tag set of all spans that are emitted by the writer. If this is false, only the "host id" dimensions such as `host`, `AwsUniqueId`, etc. are added to the span tags. 
(**default:** `false`) | -| `sendTraceHostCorrelationMetrics` | no | bool | Whether to send host correlation metrics to correlation traced services with the underlying host (**default:** `true`) | -| `staleServiceTimeout` | no | int64 | How long to wait after a trace span's service name is last seen to continue sending the correlation datapoints for that service. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration. This option is irrelvant if `sendTraceHostCorrelationMetrics` is false. (**default:** `"5m"`) | -| `traceHostCorrelationMetricsInterval` | no | int64 | How frequently to send host correlation metrics that are generated from the service name seen in trace spans sent through or by the agent. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration. This option is irrelvant if `sendTraceHostCorrelationMetrics` is false. (**default:** `"1m"`) | +| `sendTraceHostCorrelationMetrics` | no | bool | Whether to send host correlation metrics to correlate traced services with the underlying host (**default:** `true`) | +| `staleServiceTimeout` | no | int64 | How long to wait after a trace span's service name is last seen to continue sending the correlation datapoints for that service. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration. This option is irrelevant if `sendTraceHostCorrelationMetrics` is false. (**default:** `"5m"`) | +| `traceHostCorrelationPurgeInterval` | no | int64 | How frequently to purge host correlation caches that are generated from the service and environment names seen in trace spans sent through or by the agent. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration. 
(**default:** `"1m"`) | +| `traceHostCorrelationMetricsInterval` | no | int64 | How frequently to send host correlation metrics that are generated from the service name seen in trace spans sent through or by the agent. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration. This option is irrelevant if `sendTraceHostCorrelationMetrics` is false. (**default:** `"1m"`) | +| `traceHostCorrelationMaxRequestRetries` | no | unsigned integer | How many times to retry requests related to trace host correlation (**default:** `2`) | | `maxTraceSpansInFlight` | no | unsigned integer | How many trace spans are allowed to be in the process of sending. While this number is exceeded, the oldest spans will be discarded to accommodate new spans generated to avoid memory exhaustion. If you see log messages about "Aborting pending trace requests..." or "Dropping new trace spans..." it means that the downstream target for traces is not able to accept them fast enough. Usually if the downstream is offline you will get connection refused errors and most likely spans will not build up in the agent (there is no retry mechanism). In the case of slow downstreams, you might be able to increase `maxRequests` to increase the concurrent stream of spans downstream (if the target can make efficient use of additional connections) or, less likely, increase `traceSpanMaxBatchSize` if your batches are maxing out (turn on debug logging to see the batch sizes being sent) and being split up too much. If neither of those options helps, your downstream is likely too slow to handle the volume of trace spans and should be upgraded to more powerful hardware/networking. (**default:** `100000`) | +| `splunk` | no | [object (see below)](#splunk) | Configures the writer specifically writing to Splunk. | +| `signalFxEnabled` | no | bool | If set to `false`, output to SignalFx will be disabled. 
(**default:** `true`) | +| `extraHeaders` | no | map of strings | Additional headers to add to any outgoing HTTP requests from the agent. | + + +## splunk +The **nested** `splunk` config object has the following fields: + + + +| Config option | Required | Type | Description | +| --- | --- | --- | --- | +| `enabled` | no | bool | Enable logging to a Splunk Enterprise instance (**default:** `false`) | +| `url` | no | string | Full URL (including path) of Splunk HTTP Event Collector (HEC) endpoint | +| `token` | no | string | Splunk HTTP Event Collector token | +| `source` | no | string | Splunk source field value, description of the source of the event | +| `sourceType` | no | string | Splunk source type, optional name of a sourcetype field value | +| `index` | no | string | Splunk index, optional name of the Splunk index to store the event in | +| `eventsIndex` | no | string | Splunk index, specifically for traces (must be event type) | +| `eventsSource` | no | string | Splunk source field value, description of the source of the trace | +| `eventsSourceType` | no | string | Splunk trace source type, optional name of a sourcetype field value | +| `skipTLSVerify` | no | bool | Skip verifying the certificate of the HTTP Event Collector (**default:** `false`) | +| `maxBuffered` | no | integer | The maximum number of Splunk log entries of all types (e.g. metric, event) to be buffered before old events are dropped. Defaults to the writer.maxDatapointsBuffered config if not specified. (**default:** `0`) | +| `maxRequests` | no | integer | The maximum number of simultaneous requests to the Splunk HEC endpoint. Defaults to the writer.maxBuffered config if not specified. 
(**default:** `0`) | +| `maxBatchSize` | no | integer | The maximum number of Splunk log entries to submit in one request to the HEC (**default:** `0`) | + @@ -265,7 +280,8 @@ For more information about how to use config sources, see [Remote Config](./remo | Config option | Required | Type | Description | | --- | --- | --- | --- | -| `watch` | no | bool | Whether to watch config sources for changes. If this is `true` and any of the config changes (either the main agent.yaml, or remote config values), the agent will dynamically reconfigure itself with minimal disruption. This is generally better than restarting the agent on config changes since that can result in larger gaps in metric data. The main disadvantage of watching is slightly greater network and compute resource usage. This option is not itself watched for changes. If you change the value of this option, you must restart the agent. (**default:** `true`) | +| `watch` | no | bool | Whether to watch config sources for changes. If this is `true` and the main agent.yaml changes, the agent will dynamically reconfigure itself with minimal disruption. This is generally better than restarting the agent on config changes since that can result in larger gaps in metric data. The main disadvantage of watching is slightly greater network and compute resource usage. This option is not itself watched for changes. If you change the value of this option, you must restart the agent. (**default:** `true`) | +| `remoteWatch` | no | bool | Whether to watch remote config sources for changes. If this is `true` and the remote configs change, the agent will dynamically reconfigure itself with minimal disruption. This is generally better than restarting the agent on config changes since that can result in larger gaps in metric data. The main disadvantage of watching is slightly greater network and compute resource usage. This option is not itself watched for changes. 
If you change the value of this option, you must restart the agent. (**default:** `true`) | | `file` | no | [object (see below)](#file) | Configuration for other file sources | | `zookeeper` | no | [object (see below)](#zookeeper) | Configuration for a Zookeeper remote config source | | `etcd2` | no | [object (see below)](#etcd2) | Configuration for an Etcd 2 remote config source | @@ -391,11 +407,12 @@ where applicable: useFullyQualifiedHost: disableHostDimensions: false intervalSeconds: 10 + cloudMetadataTimeout: "2s" globalDimensions: - sendMachineID: false + globalSpanTags: cluster: syncClusterOnHostDimension: false - validateDiscoveryRules: true + validateDiscoveryRules: false observers: [] monitors: [] writer: @@ -405,6 +422,7 @@ where applicable: traceExportFormat: "zipkin" datapointMaxRequests: 0 maxRequests: 10 + timeout: "5s" eventSendIntervalSeconds: 1 propertiesMaxRequests: 20 propertiesMaxBuffered: 10000 @@ -413,13 +431,32 @@ where applicable: logDatapoints: false logEvents: false logTraceSpans: false + logTraceSpansFailedToShip: false logDimensionUpdates: false logDroppedDatapoints: false addGlobalDimensionsAsSpanTags: false sendTraceHostCorrelationMetrics: true staleServiceTimeout: "5m" + traceHostCorrelationPurgeInterval: "1m" traceHostCorrelationMetricsInterval: "1m" + traceHostCorrelationMaxRequestRetries: 2 maxTraceSpansInFlight: 100000 + splunk: + enabled: false + url: + token: + source: + sourceType: + index: + eventsIndex: + eventsSource: + eventsSourceType: + skipTLSVerify: false + maxBuffered: 0 + maxRequests: 0 + maxBatchSize: 0 + signalFxEnabled: true + extraHeaders: logging: level: "info" format: "text" @@ -435,7 +472,7 @@ where applicable: writeServerIPAddr: "127.9.8.7" writeServerPort: 0 configDir: "/var/run/signalfx-agent/collectd" - enableBuiltInFiltering: false + enableBuiltInFiltering: true metricsToInclude: [] metricsToExclude: [] propertiesToExclude: [] @@ -448,6 +485,7 @@ where applicable: scratch: configSources: watch: true + 
remoteWatch: true file: pollRateSeconds: 5 zookeeper: diff --git a/signalfx-agent/agent_docs/faq.md b/signalfx-agent/agent_docs/faq.md index a60316555..b808f857e 100644 --- a/signalfx-agent/agent_docs/faq.md +++ b/signalfx-agent/agent_docs/faq.md @@ -117,7 +117,7 @@ the agent instead of the default `/bin/signalfx-agent`, as well by adding the ``` The source for the script `/bin/umount-hostfs-non-persistent` can be [found -here](https://github.com/signalfx/signalfx-agent/blob/master/scripts/umount-hostfs-non-persistent), +here](https://github.com/signalfx/signalfx-agent/blob/main/scripts/umount-hostfs-non-persistent), but basically it just does a `umount` on all of the potentially problematic mounts that we know of. You can add arguments to the script invocation for additional directories to unmount if necessary. @@ -165,9 +165,9 @@ The primary metrics for container CPU limits are: 100,000 microseconds. The first two metrics are cumulative counters that keep growing, so the easiest -way to use them is to look at how much they change per second (the default -rollup when you look at the metrics in SignalFx). The second two are gauges -and generally don't change for the lifetime of the container. +way to use them is to look at how much they change per second using `rate` rollup +(default is `delta` when you look at the metrics in SignalFx). The second two are +gauges and generally don't change for the lifetime of the container. The maximum percentage of time a process can execute in a given second is equal to `container_spec_cpu_quota`/`container_spec_cpu_period`. For example, a @@ -200,7 +200,7 @@ time-sensitive its workload is. To monitor case #1, you can use the formula -`(container_cpu_usage_seconds_total/10000000)/(conatiner_spec_cpu_quota/container_spec_cpu_period)` +`(container_cpu_usage_seconds_total/10000000)/(container_spec_cpu_quota/container_spec_cpu_period)` to get the percentage of CPU used compared to the limit (0 - 100+). 
This value can actually exceed 100 because the sampling by the agent is not on a perfectly @@ -209,11 +209,10 @@ exact interval. For case #2 you need to factor in the `container_cpu_cfs_throttled_time` metric. The above metric showing usage relative to the limit will be under 100 in this case but that doesn't mean throttling isn't happening. You can simply -look at `container_cpu_cfs_throttled_time` using its default rollup of -`rate/sec` which will tell you the raw amount of time a container is spending -throttled. If you have many processes/threads in a container, this number -could be very high. You could compare throttle time to usage time with the -formula +look at `container_cpu_cfs_throttled_time` using the rollup of `rate` which +will tell you the raw amount of time a container is spending throttled. +If you have many processes/threads in a container, this number could be very +high. You could compare throttle time to usage time with the formula `container_cpu_cfs_throttled_time/container_cpu_usage_seconds_total` diff --git a/signalfx-agent/agent_docs/filtering.md b/signalfx-agent/agent_docs/filtering.md index b9da8fe46..4b3dc372a 100644 --- a/signalfx-agent/agent_docs/filtering.md +++ b/signalfx-agent/agent_docs/filtering.md @@ -21,7 +21,7 @@ to match on datapoints to determine whether to drop them. For example: ```yaml monitors: # Prometheus node exporter scraper - - type: promtheus-exporter + - type: prometheus-exporter host: 127.0.0.1 port: 9090 datapointsToExclude: @@ -56,9 +56,9 @@ on the datapoint that will not affect matching. The negation of items in either `metricNames` or a `dimensions` map value list item will serve to override (and thus reinclude) only other already excluded -items from that same list. Thus, if you want to do "whitelisting" of metric -names or dimension values.hi, you can provide a list like `[ '*', -'!whitelisted1', '!whitelisted2' ]` to exclude everything but those two +items from that same list. 
Thus, if you want to filter metric +names or dimension values, you can provide a list like `[ '*', +'!inclusionList1', '!inclusionList2' ]` to exclude everything but those two metrics. This, along with the regex and glob capabilities, is explained more in [Overridable Filters](#overridable-filters). @@ -80,7 +80,7 @@ globbed (i.e. where `*` is a wildcard for zero or more characters, and `?` is a wildcard for a single character) or specified as a Go-compatible regular expression (the value must be surrounded by `/` to be considered a regex). -Sometimes it is easier to whitelist the properties you want to allow through, +Sometimes it is easier to filter the properties you want to allow through, and not allow any others. You can do this by prefixing any config value with `!`, which will negate the matching value. This can be applied to any of the config options. diff --git a/signalfx-agent/agent_docs/legacy-filtering.md b/signalfx-agent/agent_docs/legacy-filtering.md index fe38a0176..2a82b7ffa 100644 --- a/signalfx-agent/agent_docs/legacy-filtering.md +++ b/signalfx-agent/agent_docs/legacy-filtering.md @@ -1,9 +1,11 @@ # Legacy Filtering -This page describes the old style of filtering and is deprecated. See [Filtering](filtering.md) for how to configure filtering in SignalFx Smart Agent 4.7.0+. +This page describes the old style of filtering and is deprecated and removed in +agent version 5.0+. See [Filtering](filtering.md) for how to configure +filtering in SignalFx Smart Agent 4.7.0+. -## Old-style whitelist filtering +## Old-style inclusion list filtering In the Smart Agent prior to version 4.7.0, custom metrics were filtered out of the agent by means of a `whitelist.json` file that was referenced under the `metricsToExclude` section of the standard distributed config file. 
This @@ -22,7 +24,7 @@ We recommend upgrading the agent to at least 4.7.0 and setting the `enableBuiltInFitlering: true` flag, as it is much easier to configure and understand than the old `metricsToExlude`/`metricsToInclude` mechanism. The only reason to not set `enableBuiltInFiltering: true` is if you have extensive -modifications to the whitelisted metrics (especially via `metricsToInclude`) +modifications to the allowed metrics (especially via `metricsToInclude`) and you don't want to rewrite all of that using `extraMetrics` (the new built-in filtering will filter out metrics before they can be processed by `metricsToInclude`, so that config will have no effect if built-in filtering is @@ -30,7 +32,7 @@ enabled). ## Global datapoint filtering -**PARTIALLY DEPRECATED** -- The following information is about filtering +**DEPRECATED** -- The following information is about filtering datapoints at a global level (i.e. outside of monitor configuration). It will continue to work, but we recommend putting filter definitions [at the monitor level](#additional-monitor-level-filtering) instead. @@ -52,7 +54,7 @@ used in the monitor config. This is very useful when trying to filter on a heavily overloaded dimension, such as the `plugin_instance` dimension that most collectd-based monitors emit. -Sometimes it is easier to whitelist the metrics you want to allow through. +Sometimes it is easier to filter the metrics you want to allow through. You can do this by setting the `negated` option to `true` on a filter item. This makes all metric name and dimension matching negated so that only datapoints with that name or dimension are allowed through. You would also @@ -97,29 +99,29 @@ Examples: - metricNames: - cpu* - memory* - monitorType: collectd/docker + monitorType: docker-container-stats negated: true # This indicates that you want to monitor the custom metrics # gauge.cluster.status and gauge.thread_pool.active for the # elasticsearch monitor. 
- metricNames: - - gauge.cluster.status - - gauge.thread_pool.active - monitorType: collectd/elasticsearch + - elasticsearch.cluster.status + - elasticsearch.thread_pool.active + monitorType: elasticsearch negated: true # This will be automatically merged with the above filter to produce one - # whitelist filter on three metric names for elasticsearch + # Filter on three metric names for elasticsearch - metricNames: - - gauge.thread_pool.inactive - monitorType: collectd/elasticsearch + - elasticsearch.thread_pool.inactive + monitorType: elasticsearch negated: true # This will override the above filter, exclusion is always favored - metricNames: - - gauge.thread_pool.inactive - monitorType: collectd/elasticsearch + - elasticsearch.thread_pool.inactive + monitorType: elasticsearch ``` ### Inclusion filtering @@ -150,4 +152,4 @@ For example: app: bigapp ``` -This can be useful for overridding the built-in whitelist for metrics. +This can be useful for overridding the built-in inclusion list for metrics. diff --git a/signalfx-agent/agent_docs/monitor-config.md b/signalfx-agent/agent_docs/monitor-config.md index e93ef2188..519dde0bb 100644 --- a/signalfx-agent/agent_docs/monitor-config.md +++ b/signalfx-agent/agent_docs/monitor-config.md @@ -37,10 +37,10 @@ The following config options are common to all monitors: | `configEndpointMappings` | | no | `map of strings` | A set of mappings from a configuration option on this monitor to attributes of a discovered endpoint. The keys are the config option on this monitor and the value can be any valid expression used in discovery rules. | | `intervalSeconds` | `0` | no | `integer` | The interval (in seconds) at which to emit datapoints from the monitor(s) created by this configuration. If not set (or set to 0), the global agent intervalSeconds config option will be used instead. | | `solo` | `false` | no | `bool` | If one or more configurations have this set to true, only those configurations will be considered. 
This setting can be useful for testing. | -| `metricsToExclude` | | no | `list of objects` | DEPRECATED in favor of the `datapointsToExclude` option. That option handles negation of filter items differently. | | `datapointsToExclude` | | no | `list of objects` | A list of datapoint filters. These filters allow you to comprehensively define which datapoints to exclude by metric name or dimension set, as well as the ability to define overrides to re-include metrics excluded by previous patterns within the same filter item. See [monitor filtering](./filtering.html#additional-monitor-level-filtering) for examples and more information. | | `disableHostDimensions` | `false` | no | `bool` | Some monitors pull metrics from services not running on the same host and should not get the host-specific dimensions set on them (e.g. `host`, `AWSUniqueId`, etc). Setting this to `true` causes those dimensions to be omitted. You can disable this globally with the `disableHostDimensions` option on the top level of the config. | | `disableEndpointDimensions` | `false` | no | `bool` | This can be set to true if you don't want to include the dimensions that are specific to the endpoint that was discovered by an observer. This is useful when you have an endpoint whose identity is not particularly important since it acts largely as a proxy or adapter for other metrics. | +| `metricNameTransformations` | | no | `map` | A map from _original_ metric name to a replacement value. The keys are interpreted as regular expressions and the values can contain backreferences. This means that you should escape any RE characters in the original metric name with `\` (the most common escape necessary will be `\.` as period is interpreted as "any character" if unescaped). The keys use the [Go regexp language](https://github.com/google/re2/wiki/Syntax), and backreferences are of the form `$1`.
If there are multiple entries in the list of maps, they will each be run in sequence, using the transformation from the previous entry as the input to the subsequent transformation. To add a common prefix to all metrics coming out of a monitor, use a mapping like this: `(.*): myprefix.$1` | | `dimensionTransformations` | | no | `map of strings` | A map from dimension names emitted by the monitor to the desired dimension name that will be emitted in the datapoint that goes to SignalFx. This can be useful if you have custom metrics from your applications and want to make the dimensions from a monitor match those. Also can be useful when scraping free-form metrics, say with the `prometheus-exporter` monitor. Right now, only static key/value transformations are supported. Note that filtering by dimensions will be done on the *original* dimension name and not the new name. Note that it is possible to remove unwanted dimensions via this configuration, by making the desired dimension name an empty string. | | `extraMetrics` | | no | `list of strings` | Extra metrics to enable besides the default included ones. This is an [overridable filter](https://docs.splunk.com/observability/gdi/smart-agent/smart-agent-resources.html#filtering-data-using-the-smart-agent). | | `extraGroups` | | no | `list of strings` | Extra metric groups to enable in addition to the metrics that are emitted by default. A metric group is simply a collection of metrics, and they are defined in each monitor's documentation.
| @@ -53,6 +53,7 @@ These are all of the monitors included in the agent, along with their possible c - [appmesh](./monitors/appmesh.md) - [aspdotnet](./monitors/aspdotnet.md) - [cadvisor](./monitors/cadvisor.md) +- [cgroups](./monitors/cgroups.md) - [cloudfoundry-firehose-nozzle](./monitors/cloudfoundry-firehose-nozzle.md) - [collectd/activemq](./monitors/collectd-activemq.md) - [collectd/apache](./monitors/collectd-apache.md) @@ -65,13 +66,11 @@ These are all of the monitors included in the agent, along with their possible c - [collectd/custom](./monitors/collectd-custom.md) - [collectd/df](./monitors/collectd-df.md) - [collectd/disk](./monitors/collectd-disk.md) -- [collectd/docker](./monitors/collectd-docker.md) - [collectd/elasticsearch](./monitors/collectd-elasticsearch.md) - [collectd/etcd](./monitors/collectd-etcd.md) - [collectd/genericjmx](./monitors/collectd-genericjmx.md) - [collectd/hadoop](./monitors/collectd-hadoop.md) - [collectd/hadoopjmx](./monitors/collectd-hadoopjmx.md) -- [collectd/haproxy](./monitors/collectd-haproxy.md) - [collectd/health-checker](./monitors/collectd-health-checker.md) - [collectd/interface](./monitors/collectd-interface.md) - [collectd/jenkins](./monitors/collectd-jenkins.md) @@ -86,7 +85,9 @@ These are all of the monitors included in the agent, along with their possible c - [collectd/mongodb](./monitors/collectd-mongodb.md) - [collectd/mysql](./monitors/collectd-mysql.md) - [collectd/nginx](./monitors/collectd-nginx.md) +- [collectd/opcache](./monitors/collectd-opcache.md) - [collectd/openstack](./monitors/collectd-openstack.md) +- [collectd/php-fpm](./monitors/collectd-php-fpm.md) - [collectd/postgresql](./monitors/collectd-postgresql.md) - [collectd/processes](./monitors/collectd-processes.md) - [collectd/protocols](./monitors/collectd-protocols.md) @@ -98,6 +99,7 @@ These are all of the monitors included in the agent, along with their possible c - [collectd/spark](./monitors/collectd-spark.md) - 
[collectd/statsd](./monitors/collectd-statsd.md) - [collectd/systemd](./monitors/collectd-systemd.md) +- [collectd/tomcat](./monitors/collectd-tomcat.md) - [collectd/uptime](./monitors/collectd-uptime.md) - [collectd/vmem](./monitors/collectd-vmem.md) - [collectd/zookeeper](./monitors/collectd-zookeeper.md) @@ -119,14 +121,17 @@ These are all of the monitors included in the agent, along with their possible c - [gitlab-sidekiq](./monitors/gitlab-sidekiq.md) - [gitlab-unicorn](./monitors/gitlab-unicorn.md) - [gitlab-workhorse](./monitors/gitlab-workhorse.md) +- [hana](./monitors/hana.md) - [haproxy](./monitors/haproxy.md) - [heroku-metadata](./monitors/heroku-metadata.md) - [host-metadata](./monitors/host-metadata.md) +- [http](./monitors/http.md) - [internal-metrics](./monitors/internal-metrics.md) - [jaeger-grpc](./monitors/jaeger-grpc.md) - [java-monitor](./monitors/java-monitor.md) - [jmx](./monitors/jmx.md) - [kube-controller-manager](./monitors/kube-controller-manager.md) +- [kubelet-metrics](./monitors/kubelet-metrics.md) - [kubelet-stats](./monitors/kubelet-stats.md) - [kubernetes-apiserver](./monitors/kubernetes-apiserver.md) - [kubernetes-cluster](./monitors/kubernetes-cluster.md) @@ -138,27 +143,38 @@ These are all of the monitors included in the agent, along with their possible c - [logstash](./monitors/logstash.md) - [logstash-tcp](./monitors/logstash-tcp.md) - [memory](./monitors/memory.md) +- [mongodb-atlas](./monitors/mongodb-atlas.md) +- [nagios](./monitors/nagios.md) - [net-io](./monitors/net-io.md) +- [ntp](./monitors/ntp.md) - [openshift-cluster](./monitors/openshift-cluster.md) - [postgresql](./monitors/postgresql.md) +- [process](./monitors/process.md) - [processlist](./monitors/processlist.md) - [prometheus-exporter](./monitors/prometheus-exporter.md) - [prometheus/go](./monitors/prometheus-go.md) +- [prometheus/nginx-ingress](./monitors/prometheus-nginx-ingress.md) - [prometheus/nginx-vts](./monitors/prometheus-nginx-vts.md) - 
[prometheus/node](./monitors/prometheus-node.md) - [prometheus/postgres](./monitors/prometheus-postgres.md) - [prometheus/prometheus](./monitors/prometheus-prometheus.md) - [prometheus/redis](./monitors/prometheus-redis.md) +- [prometheus/velero](./monitors/prometheus-velero.md) - [python-monitor](./monitors/python-monitor.md) - [signalfx-forwarder](./monitors/signalfx-forwarder.md) - [sql](./monitors/sql.md) - [statsd](./monitors/statsd.md) +- [supervisor](./monitors/supervisor.md) +- [telegraf/dns](./monitors/telegraf-dns.md) +- [telegraf/exec](./monitors/telegraf-exec.md) - [telegraf/logparser](./monitors/telegraf-logparser.md) +- [telegraf/ntpq](./monitors/telegraf-ntpq.md) - [telegraf/procstat](./monitors/telegraf-procstat.md) - [telegraf/snmp](./monitors/telegraf-snmp.md) - [telegraf/sqlserver](./monitors/telegraf-sqlserver.md) - [telegraf/statsd](./monitors/telegraf-statsd.md) - [telegraf/tail](./monitors/telegraf-tail.md) +- [telegraf/varnish](./monitors/telegraf-varnish.md) - [telegraf/win_perf_counters](./monitors/telegraf-win_perf_counters.md) - [telegraf/win_services](./monitors/telegraf-win_services.md) - [trace-forwarder](./monitors/trace-forwarder.md) diff --git a/signalfx-agent/agent_docs/monitors/appmesh.md b/signalfx-agent/agent_docs/monitors/appmesh.md index 1780c4283..aa847c3e5 100644 --- a/signalfx-agent/agent_docs/monitors/appmesh.md +++ b/signalfx-agent/agent_docs/monitors/appmesh.md @@ -4,7 +4,7 @@ # appmesh -Monitor Type: `appmesh` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/appmesh)) +Monitor Type: `appmesh` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/appmesh)) **Accepts Endpoints**: No @@ -35,7 +35,7 @@ stats_sinks: ``` Please remember to provide the prefix to the agent monitor configuration. 
-See [Envoy API reference](https://www.envoyproxy.io/docs/envoy/latest/api-v2/config/metrics/v2/stats.proto#envoy-api-msg-config-metrics-v2-statsdsink) for more info +See [Envoy API reference](https://www.envoyproxy.io/docs/envoy/v1.6.0/api-v2/config/metrics/v2/stats.proto#envoy-api-msg-config-metrics-v2-statsdsink) for more info Sample SignalFx SmartAgent configuration: @@ -150,9 +150,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -161,19 +158,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/aspdotnet.md b/signalfx-agent/agent_docs/monitors/aspdotnet.md index 6749996b5..adfd8c8dd 100644 --- a/signalfx-agent/agent_docs/monitors/aspdotnet.md +++ b/signalfx-agent/agent_docs/monitors/aspdotnet.md @@ -4,7 +4,7 @@ # aspdotnet -Monitor Type: `aspdotnet` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/aspdotnet)) +Monitor Type: `aspdotnet` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/aspdotnet)) **Accepts Endpoints**: No @@ -62,7 +62,6 @@ This monitor emits all metrics by default; however, **none are categorized as -- they are all custom**. - - ***`asp_net.application_restarts`*** (*gauge*)
Count of ASP.NET application restarts. - ***`asp_net.applications_running`*** (*gauge*)
Number of running ASP.NET applications. - ***`asp_net.requests_current`*** (*gauge*)
Current number of ASP.NET requests. diff --git a/signalfx-agent/agent_docs/monitors/cadvisor.md b/signalfx-agent/agent_docs/monitors/cadvisor.md index b198f77aa..8aafa1029 100644 --- a/signalfx-agent/agent_docs/monitors/cadvisor.md +++ b/signalfx-agent/agent_docs/monitors/cadvisor.md @@ -4,7 +4,7 @@ # cadvisor -Monitor Type: `cadvisor` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/cadvisor)) +Monitor Type: `cadvisor` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/cadvisor)) **Accepts Endpoints**: No @@ -22,7 +22,7 @@ If you are running containers with Docker, there is a fair amount of duplication with the `docker-container-stats` monitor in terms of the metrics sent (under distinct metric names) so you may want to consider not enabling the Docker monitor in a K8s environment, or else use filtering to -whitelist only certain metrics. Note that this will cause the built-in +allow only certain metrics. Note that this will cause the built-in Docker dashboards to be blank, but container metrics will be available on the Kubernetes dashboards instead. @@ -104,9 +104,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -115,20 +112,6 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. 
If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - ## Dimensions The following dimensions may occur on metrics emitted by this monitor. Some diff --git a/signalfx-agent/agent_docs/monitors/cloudfoundry-firehose-nozzle.md b/signalfx-agent/agent_docs/monitors/cloudfoundry-firehose-nozzle.md index 76a0eb4d3..f765a6536 100644 --- a/signalfx-agent/agent_docs/monitors/cloudfoundry-firehose-nozzle.md +++ b/signalfx-agent/agent_docs/monitors/cloudfoundry-firehose-nozzle.md @@ -4,7 +4,7 @@ # cloudfoundry-firehose-nozzle -Monitor Type: `cloudfoundry-firehose-nozzle` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/cloudfoundry)) +Monitor Type: `cloudfoundry-firehose-nozzle` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/cloudfoundry)) **Accepts Endpoints**: No @@ -442,9 +442,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. 
Metrics that are derived from specific configuration options that do not appear in the above list of @@ -453,20 +450,6 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - ## Dimensions The following dimensions may occur on metrics emitted by this monitor. Some @@ -474,8 +457,22 @@ dimensions may be specific to certain metrics. | Name | Description | | --- | --- | -| `instance_id` | The BOSH instance id that pertains to the metric, if any. | -| `source_id` | The source of the metric | +| `app_id` | Application ID (GUID). This is equal to the value of `source_id` dimension. Only available for applications on TAS/PCF 2.8.0+ (cf-deployment v11.1.0+). | +| `app_name` | Application name. Only available for applications on TAS/PCF 2.8.0+ (cf-deployment v11.1.0+). Name change does not take effect until process is restarted on TAS 2.8. | +| `deployment` | Name of the BOSH deployment. | +| `index` | ID of the BOSH instance. 
| +| `instance_id` | Numerical index of the application instance for applications (`rep.` metrics). Also present for `bbs.` metrics, where it is the BOSH instance ID (equal to `index`). | +| `ip` | IP address of the BOSH instance. | +| `job` | Name of the BOSH job. | +| `organization_id` | Organization ID (GUID). Only available for applications on TAS/PCF 2.8.0+ (cf-deployment v11.1.0+). | +| `organization_name` | Organization name. Only available for applications on TAS/PCF 2.8.0+ (cf-deployment v11.1.0+). Name change does not take effect until process is restarted on TAS 2.8. | +| `origin` | Origin name of the metric. Equal to the prefix of the metric name. | +| `process_id` | Process ID. Only present for applications (`rep.` metrics). For a process of type "web" (main process of an application), this is equal to `source_id` and `app_id`. | +| `process_instance_id` | Unique ID for the application process instance. Only present for applications (`rep.` metrics). | +| `process_type` | Type of the process (each application has one process with type "web"). Only present for applications (`rep.` metrics). | +| `source_id` | For application container metrics, this is the GUID of the application (equal to `app_id`), for system metrics, this is the origin name (equal to `origin`). | +| `space_id` | Space ID (GUID). Only available for applications on TAS/PCF 2.8.0+ (cf-deployment v11.1.0+). | +| `space_name` | Space name. Only available for applications on TAS/PCF 2.8.0+ (cf-deployment v11.1.0+). Name change does not take effect until process is restarted on TAS 2.8. 
| diff --git a/signalfx-agent/agent_docs/monitors/collectd-activemq.md b/signalfx-agent/agent_docs/monitors/collectd-activemq.md index fd673f054..f8f91d181 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-activemq.md +++ b/signalfx-agent/agent_docs/monitors/collectd-activemq.md @@ -4,7 +4,7 @@ # collectd/activemq -Monitor Type: `collectd/activemq` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/activemq)) +Monitor Type: `collectd/activemq` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/activemq)) **Accepts Endpoints**: **Yes** @@ -98,6 +98,8 @@ Metrics that are categorized as [container/host](https://docs.splunk.com/observability/admin/subscription-usage/monitor-imm-billing-usage.html#about-custom-bundled-and-high-resolution-metrics) (*default*) are ***in bold and italics*** in the list below. +This monitor will also emit by default any metrics that are not listed below. + - ***`counter.amq.TotalConnectionsCount`*** (*counter*)
Total connections count per broker - ***`gauge.amq.TotalConsumerCount`*** (*gauge*)
Total number of consumers subscribed to destinations on the broker @@ -147,9 +149,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -158,19 +157,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/collectd-apache.md b/signalfx-agent/agent_docs/monitors/collectd-apache.md index dcf504b63..1838af750 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-apache.md +++ b/signalfx-agent/agent_docs/monitors/collectd-apache.md @@ -4,7 +4,7 @@ # collectd/apache -Monitor Type: `collectd/apache` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/apache)) +Monitor Type: `collectd/apache` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/apache)) **Accepts Endpoints**: **Yes** @@ -136,9 +136,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -147,20 +144,6 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - ## Dimensions The following dimensions may occur on metrics emitted by this monitor. Some diff --git a/signalfx-agent/agent_docs/monitors/collectd-cassandra.md b/signalfx-agent/agent_docs/monitors/collectd-cassandra.md index bf9158684..fddda5ae9 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-cassandra.md +++ b/signalfx-agent/agent_docs/monitors/collectd-cassandra.md @@ -4,7 +4,7 @@ # collectd/cassandra -Monitor Type: `collectd/cassandra` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/cassandra)) +Monitor Type: `collectd/cassandra` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/cassandra)) **Accepts Endpoints**: **Yes** @@ -90,6 +90,20 @@ Metrics that are categorized as [container/host](https://docs.splunk.com/observability/admin/subscription-usage/monitor-imm-billing-usage.html#about-custom-bundled-and-high-resolution-metrics) (*default*) are ***in bold and italics*** in the list below. +This monitor will also emit by default any metrics that are not listed below. + + + - `counter.cassandra.ClientRequest.CASRead.Latency.Count` (*cumulative*)
Count of transactional read operations since server start. + - `counter.cassandra.ClientRequest.CASRead.TotalLatency.Count` (*cumulative*)
The total number of microseconds elapsed in servicing client transactional read requests. + + It can be divided by `counter.cassandra.ClientRequest.CASRead.Latency.Count` + to find the real time transactional read latency. + + - `counter.cassandra.ClientRequest.CASWrite.Latency.Count` (*cumulative*)
Count of transactional write operations since server start. + - `counter.cassandra.ClientRequest.CASWrite.TotalLatency.Count` (*cumulative*)
The total number of microseconds elapsed in servicing client transactional write requests. + + It can be divided by `counter.cassandra.ClientRequest.CASWrite.Latency.Count` + to find the real time transactional write latency. - ***`counter.cassandra.ClientRequest.RangeSlice.Latency.Count`*** (*cumulative*)
Count of range slice operations since server start. This typically indicates a server overload condition. @@ -108,6 +122,7 @@ Metrics that are categorized as - one or more clients are directing more load to this server than the others - the server is experiencing hardware or software issues and may require maintenance. + - `counter.cassandra.ClientRequest.RangeSlice.TotalLatency.Count` (*cumulative*)
The total number of microseconds elapsed in servicing range slice requests. - ***`counter.cassandra.ClientRequest.RangeSlice.Unavailables.Count`*** (*cumulative*)
Count of range slice unavailables since server start. A non-zero value means that insufficient replicas were available to fulfil a range slice request at the requested consistency level. @@ -115,7 +130,7 @@ Metrics that are categorized as This typically means that one or more nodes are down. To fix this condition, any down nodes must be restarted, or removed from the cluster. - - ***`counter.cassandra.ClientRequest.Read.Latency.Count`*** (*cumulative*)
Count of read operations since server start + - ***`counter.cassandra.ClientRequest.Read.Latency.Count`*** (*cumulative*)
Count of read operations since server start. - ***`counter.cassandra.ClientRequest.Read.Timeouts.Count`*** (*cumulative*)
Count of read timeouts since server start. This typically indicates a server overload condition. If this value is increasing across the cluster then the cluster is too small for the application read load. @@ -124,6 +139,11 @@ Metrics that are categorized as - one or more clients are directing more load to this server than the others - the server is experiencing hardware or software issues and may require maintenance. + - `counter.cassandra.ClientRequest.Read.TotalLatency.Count` (*cumulative*)
The total number of microseconds elapsed in servicing client read requests. + + It can be divided by `counter.cassandra.ClientRequest.Read.Latency.Count` + to find the real time read latency. + - ***`counter.cassandra.ClientRequest.Read.Unavailables.Count`*** (*cumulative*)
Count of read unavailables since server start. A non-zero value means that insufficient replicas were available to fulfil a read request at the requested consistency level. This typically means that one or more @@ -139,6 +159,11 @@ Metrics that are categorized as - one or more clients are directing more load to this server than the others - the server is experiencing hardware or software issues and may require maintenance. + - `counter.cassandra.ClientRequest.Write.TotalLatency.Count` (*cumulative*)
The total number of microseconds elapsed in servicing client write requests. + + It can be divided by `counter.cassandra.ClientRequest.Write.Latency.Count` + to find the real time write latency. + - ***`counter.cassandra.ClientRequest.Write.Unavailables.Count`*** (*cumulative*)
Count of write unavailables since server start. A non-zero value means that insufficient replicas were available to fulfil a write request at the requested consistency level. @@ -151,6 +176,34 @@ Metrics that are categorized as not increase steadily over time then the node may be experiencing problems completing compaction operations. + - `counter.cassandra.Storage.Exceptions.Count` (*cumulative*)
Number of internal exceptions caught. Under normal conditions this should be zero. + + - ***`counter.cassandra.Storage.Load.Count`*** (*cumulative*)
Storage used for Cassandra data in bytes. Use this metric to see how much storage is being used for data by a Cassandra node. + + The value of this metric is influenced by: + - Total data stored into the database + - compaction behavior + + - `counter.cassandra.Storage.TotalHints.Count` (*cumulative*)
Total hints since node start. Indicates that write operations cannot be + delivered to a node, usually because a node is down. If this value is + increasing and all nodes are up then there may be some connectivity + issue between nodes in the cluster. + + - ***`counter.cassandra.Storage.TotalHintsInProgress.Count`*** (*cumulative*)
Total pending hints. Indicates that write operations cannot be + delivered to a node, usually because a node is down. If this value is + increasing and all nodes are up then there may be some connectivity + issue between nodes in the cluster. + + - `gauge.cassandra.ClientRequest.CASRead.Latency.50thPercentile` (*gauge*)
50th percentile (median) of Cassandra transactional read latency. + + - `gauge.cassandra.ClientRequest.CASRead.Latency.99thPercentile` (*gauge*)
99th percentile of Cassandra transactional read latency. + + - `gauge.cassandra.ClientRequest.CASRead.Latency.Max` (*gauge*)
Maximum Cassandra transactional read latency. + - `gauge.cassandra.ClientRequest.CASWrite.Latency.50thPercentile` (*gauge*)
50th percentile (median) of Cassandra transactional write latency. + + - `gauge.cassandra.ClientRequest.CASWrite.Latency.99thPercentile` (*gauge*)
99th percentile of Cassandra transactional write latency. + + - `gauge.cassandra.ClientRequest.CASWrite.Latency.Max` (*gauge*)
Maximum Cassandra transactional write latency. - `gauge.cassandra.ClientRequest.RangeSlice.Latency.50thPercentile` (*gauge*)
50th percentile (median) of Cassandra range slice latency. This value should be similar across all nodes in the cluster. If some nodes have higher values than the rest of the cluster then they may have more connected clients @@ -161,7 +214,7 @@ Metrics that are categorized as the rest of the cluster then they may have more connected clients or may be experiencing heavier than usual compaction load. - - `gauge.cassandra.ClientRequest.RangeSlice.Latency.Max` (*gauge*)
Maximum Cassandra range slice latency + - `gauge.cassandra.ClientRequest.RangeSlice.Latency.Max` (*gauge*)
Maximum Cassandra range slice latency. - ***`gauge.cassandra.ClientRequest.Read.Latency.50thPercentile`*** (*gauge*)
50th percentile (median) of Cassandra read latency. This value should be similar across all nodes in the cluster. If some nodes have higher values than the rest of the cluster then they may have more connected @@ -172,7 +225,7 @@ Metrics that are categorized as the rest of the cluster then they may have more connected clients or may be experiencing heavier than usual compaction load. - - ***`gauge.cassandra.ClientRequest.Read.Latency.Max`*** (*gauge*)
Maximum Cassandra read latency + - ***`gauge.cassandra.ClientRequest.Read.Latency.Max`*** (*gauge*)
Maximum Cassandra read latency. - ***`gauge.cassandra.ClientRequest.Write.Latency.50thPercentile`*** (*gauge*)
50th percentile (median) of Cassandra write latency. This value should be similar across all nodes in the cluster. If some nodes have higher values than the rest of the cluster then they may have more connected @@ -188,22 +241,6 @@ Metrics that are categorized as continually increasing then the node may be experiencing problems completing compaction operations. - - ***`gauge.cassandra.Storage.Load.Count`*** (*gauge*)
Storage used for Cassandra data in bytes. Use this metric to see how much storage is being used for data by a Cassandra node. - - The value of this metric is influenced by: - - Total data stored into the database - - compaction behavior - - - `gauge.cassandra.Storage.TotalHints.Count` (*gauge*)
Total hints since node start. Indicates that write operations cannot be - delivered to a node, usually because a node is down. If this value is - increasing and all nodes are up then there may be some connectivity - issue between nodes in the cluster. - - - ***`gauge.cassandra.Storage.TotalHintsInProgress.Count`*** (*gauge*)
Total pending hints. Indicates that write operations cannot be - delivered to a node, usually because a node is down. If this value is - increasing and all nodes are up then there may be some connectivity - issue between nodes in the cluster. - #### Group jvm All of the following metrics are part of the `jvm` metric group. All of @@ -220,9 +257,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -231,19 +265,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/collectd-chrony.md b/signalfx-agent/agent_docs/monitors/collectd-chrony.md index cf3386814..902fac3a6 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-chrony.md +++ b/signalfx-agent/agent_docs/monitors/collectd-chrony.md @@ -4,7 +4,7 @@ # collectd/chrony -Monitor Type: `collectd/chrony` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/chrony)) +Monitor Type: `collectd/chrony` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/chrony)) **Accepts Endpoints**: No diff --git a/signalfx-agent/agent_docs/monitors/collectd-consul.md b/signalfx-agent/agent_docs/monitors/collectd-consul.md index 781bf40af..153bce4fb 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-consul.md +++ b/signalfx-agent/agent_docs/monitors/collectd-consul.md @@ -4,7 +4,7 @@ # collectd/consul -Monitor Type: `collectd/consul` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/consul)) +Monitor Type: `collectd/consul` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/consul)) **Accepts Endpoints**: **Yes** @@ -145,9 +145,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -156,20 +153,6 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. 
-### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - ## Dimensions The following dimensions may occur on metrics emitted by this monitor. Some diff --git a/signalfx-agent/agent_docs/monitors/collectd-couchbase.md b/signalfx-agent/agent_docs/monitors/collectd-couchbase.md index b40854387..cabdc8c5e 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-couchbase.md +++ b/signalfx-agent/agent_docs/monitors/collectd-couchbase.md @@ -4,7 +4,7 @@ # collectd/couchbase -Monitor Type: `collectd/couchbase` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/couchbase)) +Monitor Type: `collectd/couchbase` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/couchbase)) **Accepts Endpoints**: **Yes** @@ -100,48 +100,232 @@ monitor config option `extraGroups`: - `gauge.bucket.basic.memUsed` (*gauge*)
Amount of memory used by the bucket (bytes) - ***`gauge.bucket.basic.opsPerSec`*** (*gauge*)
Number of operations per second - ***`gauge.bucket.basic.quotaPercentUsed`*** (*gauge*)
Percentage of RAM used (for active objects) against the configured bucket size (%) + - `gauge.bucket.hot_keys.0` (*gauge*)
One of the most used keys in this bucket + - `gauge.bucket.hot_keys.1` (*gauge*)
One of the most used keys in this bucket + - `gauge.bucket.hot_keys.10` (*gauge*)
One of the most used keys in this bucket + - `gauge.bucket.hot_keys.2` (*gauge*)
One of the most used keys in this bucket + - `gauge.bucket.hot_keys.3` (*gauge*)
One of the most used keys in this bucket + - `gauge.bucket.hot_keys.4` (*gauge*)
One of the most used keys in this bucket + - `gauge.bucket.hot_keys.5` (*gauge*)
One of the most used keys in this bucket + - `gauge.bucket.hot_keys.6` (*gauge*)
One of the most used keys in this bucket + - `gauge.bucket.hot_keys.7` (*gauge*)
One of the most used keys in this bucket + - `gauge.bucket.hot_keys.8` (*gauge*)
One of the most used keys in this bucket + - `gauge.bucket.hot_keys.9` (*gauge*)
One of the most used keys in this bucket + - `gauge.bucket.op.avg_bg_wait_time` (*gauge*)
Average background wait time + - `gauge.bucket.op.avg_disk_commit_time` (*gauge*)
Average disk commit time + - `gauge.bucket.op.avg_disk_update_time` (*gauge*)
Average disk update time + - `gauge.bucket.op.bg_wait_count` (*gauge*)
+ - `gauge.bucket.op.bg_wait_total` (*gauge*)
The total background wait time + - `gauge.bucket.op.bytes_read` (*gauge*)
Number of bytes read + - `gauge.bucket.op.bytes_written` (*gauge*)
Number of bytes written + - `gauge.bucket.op.cas_badval` (*gauge*)
Number of CAS operations per second using an incorrect CAS ID for data that this bucket contains + - `gauge.bucket.op.cas_hits` (*gauge*)
Number of CAS operations per second for data that this bucket contains + - `gauge.bucket.op.cas_misses` (*gauge*)
Number of CAS operations per second for data that this bucket does not contain - ***`gauge.bucket.op.cmd_get`*** (*gauge*)
requested objects - - ***`gauge.bucket.op.couch_docs_fragmentation`*** (*gauge*)
Percent fragmentation of documents in this bucket. + - `gauge.bucket.op.cmd_set` (*gauge*)
Number of writes (set operations) per second to this bucket + - `gauge.bucket.op.couch_docs_actual_disk_size` (*gauge*)
The size of the couchbase docs on disk + - `gauge.bucket.op.couch_docs_data_size` (*gauge*)
The size of active data in this bucket + - `gauge.bucket.op.couch_docs_disk_size` (*gauge*)
Couch docs total size in bytes + - ***`gauge.bucket.op.couch_docs_fragmentation`*** (*gauge*)
Percent fragmentation of documents in this bucket + - `gauge.bucket.op.couch_spatial_data_size` (*gauge*)
The size of object data for spatial views + - `gauge.bucket.op.couch_spatial_disk_size` (*gauge*)
The amount of disk space occupied by spatial views + - `gauge.bucket.op.couch_spatial_ops` (*gauge*)
Number of spatial operations + - `gauge.bucket.op.couch_total_disk_size` (*gauge*)
The total size on disk of all data and view files for this bucket + - `gauge.bucket.op.couch_views_actual_disk_size` (*gauge*)
The size of all active items in all the indexes for this bucket on disk + - `gauge.bucket.op.couch_views_data_size` (*gauge*)
The size of object data for views + - `gauge.bucket.op.couch_views_disk_size` (*gauge*)
The amount of disk space occupied by views + - `gauge.bucket.op.couch_views_fragmentation` (*gauge*)
How much fragmented data there is to be compacted compared to real data for the view index files in this bucket - ***`gauge.bucket.op.couch_views_ops`*** (*gauge*)
view operations per second + - `gauge.bucket.op.cpu_idle_ms` (*gauge*)
CPU Idle milliseconds + - `gauge.bucket.op.cpu_utilization_rate` (*gauge*)
Percentage of CPU in use across all available cores on this server - ***`gauge.bucket.op.curr_connections`*** (*gauge*)
open connection per bucket - `gauge.bucket.op.curr_items` (*gauge*)
total number of stored items per bucket + - `gauge.bucket.op.curr_items_tot` (*gauge*)
Total number of items + - `gauge.bucket.op.decr_hits` (*gauge*)
Number of decrement hits + - `gauge.bucket.op.decr_misses` (*gauge*)
Number of decrement misses + - `gauge.bucket.op.delete_hits` (*gauge*)
Number of delete hits + - `gauge.bucket.op.delete_misses` (*gauge*)
Number of delete misses + - `gauge.bucket.op.disk_commit_count` (*gauge*)
Number of disk commits + - `gauge.bucket.op.disk_commit_total` (*gauge*)
Total number of disk commits + - `gauge.bucket.op.disk_update_count` (*gauge*)
Number of disk updates + - `gauge.bucket.op.disk_update_total` (*gauge*)
Total number of disk updates - `gauge.bucket.op.disk_write_queue` (*gauge*)
number of items waiting to be written to disk - ***`gauge.bucket.op.ep_bg_fetched`*** (*gauge*)
number of items fetched from disk - ***`gauge.bucket.op.ep_cache_miss_rate`*** (*gauge*)
ratio of requested objects found in cache vs retrieved from disk + - `gauge.bucket.op.ep_dcp_2i_backoff` (*gauge*)
Number of backoffs for indexes DCP connections + - `gauge.bucket.op.ep_dcp_2i_count` (*gauge*)
Number of indexes DCP connections + - `gauge.bucket.op.ep_dcp_2i_items_remaining` (*gauge*)
Number of indexes items remaining to be sent + - `gauge.bucket.op.ep_dcp_2i_items_sent` (*gauge*)
Number of indexes items sent + - `gauge.bucket.op.ep_dcp_2i_producer_count` (*gauge*)
Number of indexes producers + - `gauge.bucket.op.ep_dcp_2i_total_backlog_size` (*gauge*)
Number of items in dcp backlog + - `gauge.bucket.op.ep_dcp_2i_total_bytes` (*gauge*)
Number of bytes per second being sent for indexes DCP connections + - `gauge.bucket.op.ep_dcp_other_backoff` (*gauge*)
Number of backoffs for other DCP connections + - `gauge.bucket.op.ep_dcp_other_count` (*gauge*)
Number of other DCP connections + - `gauge.bucket.op.ep_dcp_other_items_remaining` (*gauge*)
Number of other items remaining to be sent + - `gauge.bucket.op.ep_dcp_other_items_sent` (*gauge*)
Number of other items sent + - `gauge.bucket.op.ep_dcp_other_producer_count` (*gauge*)
Number of other producers + - `gauge.bucket.op.ep_dcp_other_total_backlog_size` (*gauge*)
Number of remaining items for replication + - `gauge.bucket.op.ep_dcp_other_total_bytes` (*gauge*)
Number of bytes per second being sent for other DCP connections + - `gauge.bucket.op.ep_dcp_replica_backoff` (*gauge*)
Number of backoffs for replica DCP connections + - `gauge.bucket.op.ep_dcp_replica_count` (*gauge*)
Number of replica DCP connections + - `gauge.bucket.op.ep_dcp_replica_items_remaining` (*gauge*)
Number of replica items remaining to be sent + - `gauge.bucket.op.ep_dcp_replica_items_sent` (*gauge*)
Number of replica items sent + - `gauge.bucket.op.ep_dcp_replica_producer_count` (*gauge*)
Number of replica producers + - `gauge.bucket.op.ep_dcp_replica_total_backlog_size` (*gauge*)
Number of remaining items for replication + - `gauge.bucket.op.ep_dcp_replica_total_bytes` (*gauge*)
Number of bytes per second being sent for replica DCP connections + - `gauge.bucket.op.ep_dcp_views_backoff` (*gauge*)
Number of backoffs for views DCP connections + - `gauge.bucket.op.ep_dcp_views_count` (*gauge*)
Number of views DCP connections + - `gauge.bucket.op.ep_dcp_views_items_remaining` (*gauge*)
Number of views items remaining to be sent + - `gauge.bucket.op.ep_dcp_views_items_sent` (*gauge*)
Number of view items sent + - `gauge.bucket.op.ep_dcp_views_producer_count` (*gauge*)
Number of views producers + - `gauge.bucket.op.ep_dcp_views_total_bytes` (*gauge*)
Number of bytes per second being sent for views DCP connections + - `gauge.bucket.op.ep_dcp_xdcr_backoff` (*gauge*)
Number of backoffs for xdcr DCP connections + - `gauge.bucket.op.ep_dcp_xdcr_count` (*gauge*)
Number of xdcr DCP connections + - `gauge.bucket.op.ep_dcp_xdcr_items_remaining` (*gauge*)
Number of xdcr items remaining to be sent + - `gauge.bucket.op.ep_dcp_xdcr_items_sent` (*gauge*)
Number of xdcr items sent + - `gauge.bucket.op.ep_dcp_xdcr_producer_count` (*gauge*)
Number of xdcr producers + - `gauge.bucket.op.ep_dcp_xdcr_total_backlog_size` (*gauge*)
Number of items waiting replication + - `gauge.bucket.op.ep_dcp_xdcr_total_bytes` (*gauge*)
Number of bytes per second being sent for xdcr DCP connections - ***`gauge.bucket.op.ep_diskqueue_drain`*** (*gauge*)
items removed from disk queue - ***`gauge.bucket.op.ep_diskqueue_fill`*** (*gauge*)
enqueued items on disk queue + - `gauge.bucket.op.ep_diskqueue_items` (*gauge*)
The number of items waiting to be written to disk for this bucket for this state + - `gauge.bucket.op.ep_flusher_todo` (*gauge*)
Number of items currently being written + - `gauge.bucket.op.ep_item_commit_failed` (*gauge*)
Number of times a transaction failed to commit due to storage errors + - `gauge.bucket.op.ep_kv_size` (*gauge*)
Total amount of user data cached in RAM in this bucket + - `gauge.bucket.op.ep_max_size` (*gauge*)
The maximum amount of memory this bucket can use - ***`gauge.bucket.op.ep_mem_high_wat`*** (*gauge*)
memory high water mark - point at which active objects begin to be ejected from bucket - `gauge.bucket.op.ep_mem_low_wat` (*gauge*)
memory low water mark + - `gauge.bucket.op.ep_meta_data_memory` (*gauge*)
Total amount of item metadata consuming RAM in this bucket + - `gauge.bucket.op.ep_num_non_resident` (*gauge*)
Number of non-resident items + - `gauge.bucket.op.ep_num_ops_del_meta` (*gauge*)
Number of delete operations per second for this bucket as the target for XDCR + - `gauge.bucket.op.ep_num_ops_del_ret_meta` (*gauge*)
Number of delRetMeta operations per second for this bucket as the target for XDCR + - `gauge.bucket.op.ep_num_ops_get_meta` (*gauge*)
Number of read operations per second for this bucket as the target for XDCR + - `gauge.bucket.op.ep_num_ops_set_meta` (*gauge*)
Number of set operations per second for this bucket as the target for XDCR + - `gauge.bucket.op.ep_num_ops_set_ret_meta` (*gauge*)
Number of setRetMeta operations per second for this bucket as the target for XDCR - ***`gauge.bucket.op.ep_num_value_ejects`*** (*gauge*)
number of objects ejected out of the bucket - ***`gauge.bucket.op.ep_oom_errors`*** (*gauge*)
request rejected - bucket is at quota, panic + - `gauge.bucket.op.ep_ops_create` (*gauge*)
Total number of new items being inserted into this bucket + - `gauge.bucket.op.ep_ops_update` (*gauge*)
Number of update operations + - `gauge.bucket.op.ep_overhead` (*gauge*)
Extra memory used by transient data like persistence queues or checkpoints - ***`gauge.bucket.op.ep_queue_size`*** (*gauge*)
number of items queued for storage + - `gauge.bucket.op.ep_resident_items_rate` (*gauge*)
Number of resident items + - `gauge.bucket.op.ep_tap_rebalance_count` (*gauge*)
Number of internal rebalancing TAP queues in this bucket + - `gauge.bucket.op.ep_tap_rebalance_qlen` (*gauge*)
Number of items in the rebalance TAP queues in this bucket + - `gauge.bucket.op.ep_tap_rebalance_queue_backfillremaining` (*gauge*)
Number of items in the backfill queues of rebalancing TAP connections to this bucket + - `gauge.bucket.op.ep_tap_rebalance_queue_backoff` (*gauge*)
Number of back-offs received per second while sending data over rebalancing TAP connections to this bucket + - `gauge.bucket.op.ep_tap_rebalance_queue_drain` (*gauge*)
Number of items per second being sent over rebalancing TAP connections to this bucket, i.e. removed from queue. + - `gauge.bucket.op.ep_tap_rebalance_queue_itemondisk` (*gauge*)
Number of items still on disk to be loaded for rebalancing TAP connections to this bucket + - `gauge.bucket.op.ep_tap_rebalance_total_backlog_size` (*gauge*)
Number of items remaining for replication + - `gauge.bucket.op.ep_tap_replica_count` (*gauge*)
Number of internal replication TAP queues in this bucket + - `gauge.bucket.op.ep_tap_replica_qlen` (*gauge*)
Number of items in the replication TAP queues in this bucket + - `gauge.bucket.op.ep_tap_replica_queue_backoff` (*gauge*)
Number of back-offs received per second while sending data over replication TAP connections to this bucket + - `gauge.bucket.op.ep_tap_replica_queue_drain` (*gauge*)
Number of items per second being sent over replication TAP connections to this bucket, i.e. removed from queue + - `gauge.bucket.op.ep_tap_replica_queue_itemondisk` (*gauge*)
Number of items still on disk to be loaded for replication TAP connections to this bucket + - `gauge.bucket.op.ep_tap_replica_total_backlog_size` (*gauge*)
Number of remaining items for replication + - `gauge.bucket.op.ep_tap_total_count` (*gauge*)
Total number of internal TAP queues in this bucket + - `gauge.bucket.op.ep_tap_total_qlen` (*gauge*)
Total number of items in TAP queues in this bucket + - `gauge.bucket.op.ep_tap_total_queue_backfillremaining` (*gauge*)
Total number of items in the backfill queues of TAP connections to this bucket + - `gauge.bucket.op.ep_tap_total_queue_backoff` (*gauge*)
Total number of back-offs received per second while sending data over TAP connections to this bucket + - `gauge.bucket.op.ep_tap_total_queue_drain` (*gauge*)
Total number of items per second being sent over TAP connections to this bucket + - `gauge.bucket.op.ep_tap_total_queue_fill` (*gauge*)
Total enqueued items in the queue. + - `gauge.bucket.op.ep_tap_total_queue_itemondisk` (*gauge*)
The number of items waiting to be written to disk for this bucket for this state. + - `gauge.bucket.op.ep_tap_total_total_backlog_size` (*gauge*)
Number of remaining items for replication + - `gauge.bucket.op.ep_tap_user_count` (*gauge*)
Number of internal user TAP queues in this bucket + - `gauge.bucket.op.ep_tap_user_qlen` (*gauge*)
Number of items in user TAP queues in this bucket + - `gauge.bucket.op.ep_tap_user_queue_backfillremaining` (*gauge*)
Number of items in the backfill queues of user TAP connections to this bucket. + - `gauge.bucket.op.ep_tap_user_queue_backoff` (*gauge*)
Number of back-offs received per second while sending data over user TAP connections to this bucket + - `gauge.bucket.op.ep_tap_user_queue_drain` (*gauge*)
Number of items per second being sent over user TAP connections to this bucket, i.e. removed from queue + - `gauge.bucket.op.ep_tap_user_queue_fill` (*gauge*)
Number of items per second being put on the user TAP queues + - `gauge.bucket.op.ep_tap_user_queue_itemondisk` (*gauge*)
Number of items still on disk to be loaded for client TAP connections to this bucket + - `gauge.bucket.op.ep_tap_user_total_backlog_size` (*gauge*)
Number of remaining items for replication - ***`gauge.bucket.op.ep_tmp_oom_errors`*** (*gauge*)
request rejected - couchbase is making room by ejecting objects, try again later + - `gauge.bucket.op.ep_vb_total` (*gauge*)
Total number of vBuckets for this bucket + - `gauge.bucket.op.evictions` (*gauge*)
Number of evictions + - `gauge.bucket.op.get_hits` (*gauge*)
Number of get hits + - `gauge.bucket.op.get_misses` (*gauge*)
Number of get misses + - `gauge.bucket.op.hibernated_requests` (*gauge*)
Number of streaming requests now idle + - `gauge.bucket.op.hibernated_waked` (*gauge*)
Rate of streaming request wakeups + - `gauge.bucket.op.hit_ratio` (*gauge*)
Hit ratio. + - `gauge.bucket.op.incr_hits` (*gauge*)
Number of increment hits + - `gauge.bucket.op.incr_misses` (*gauge*)
Number of increment misses + - `gauge.bucket.op.mem_actual_free` (*gauge*)
Amount of RAM available + - `gauge.bucket.op.mem_actual_used` (*gauge*)
Used memory + - `gauge.bucket.op.mem_free` (*gauge*)
Free memory + - `gauge.bucket.op.mem_total` (*gauge*)
Total available memory - ***`gauge.bucket.op.mem_used`*** (*gauge*)
memory used + - `gauge.bucket.op.mem_used_sys` (*gauge*)
System memory usage + - `gauge.bucket.op.misses` (*gauge*)
Total number of misses - `gauge.bucket.op.ops` (*gauge*)
total of gets, sets, increment and decrement + - `gauge.bucket.op.rest_requests` (*gauge*)
Number of HTTP requests + - `gauge.bucket.op.swap_total` (*gauge*)
Total amount of swap available + - `gauge.bucket.op.swap_used` (*gauge*)
Amount of swap used + - `gauge.bucket.op.vb_active_eject` (*gauge*)
Number of items per second being ejected to disk from active vBuckets + - `gauge.bucket.op.vb_active_itm_memory` (*gauge*)
Amount of active user data cached in RAM in this bucket + - `gauge.bucket.op.vb_active_meta_data_memory` (*gauge*)
Amount of active item metadata consuming RAM in this bucket + - `gauge.bucket.op.vb_active_num` (*gauge*)
Number of vBuckets in the active state for this bucket + - `gauge.bucket.op.vb_active_num_non_resident` (*gauge*)
Number of non resident vBuckets in the active state for this bucket + - `gauge.bucket.op.vb_active_ops_create` (*gauge*)
New items per second being inserted into active vBuckets in this bucket + - `gauge.bucket.op.vb_active_ops_update` (*gauge*)
Number of items updated on active vBucket per second for this bucket + - `gauge.bucket.op.vb_active_queue_age` (*gauge*)
Sum of disk queue item age in milliseconds + - `gauge.bucket.op.vb_active_queue_drain` (*gauge*)
Number of active items per second being written to disk in this bucket + - `gauge.bucket.op.vb_active_queue_fill` (*gauge*)
Number of active items per second being put on the active item disk queue in this bucket + - `gauge.bucket.op.vb_active_queue_size` (*gauge*)
Number of active items waiting to be written to disk in this bucket - ***`gauge.bucket.op.vb_active_resident_items_ratio`*** (*gauge*)
ratio of items kept in memory vs stored on disk - - `gauge.bucket.quota.ram` (*gauge*)
Amount of RAM used by the bucket (bytes). - - `gauge.bucket.quota.rawRAM` (*gauge*)
Amount of raw RAM used by the bucket (bytes). + - `gauge.bucket.op.vb_avg_active_queue_age` (*gauge*)
Average age in seconds of active items in the active item queue for this bucket + - `gauge.bucket.op.vb_avg_pending_queue_age` (*gauge*)
Average age in seconds of pending items in the pending item queue for this bucket and should be transient during rebalancing + - `gauge.bucket.op.vb_avg_replica_queue_age` (*gauge*)
Average age in seconds of replica items in the replica item queue for this bucket + - `gauge.bucket.op.vb_avg_total_queue_age` (*gauge*)
Average age of items in the queue + - `gauge.bucket.op.vb_pending_curr_items` (*gauge*)
Number of items in pending vBuckets in this bucket and should be transient during rebalancing + - `gauge.bucket.op.vb_pending_eject` (*gauge*)
Number of items per second being ejected to disk from pending vBuckets + - `gauge.bucket.op.vb_pending_itm_memory` (*gauge*)
Amount of pending user data cached in RAM in this bucket and should be transient during rebalancing + - `gauge.bucket.op.vb_pending_meta_data_memory` (*gauge*)
Amount of pending item metadata consuming RAM in this bucket and should be transient during rebalancing + - `gauge.bucket.op.vb_pending_num` (*gauge*)
Number of pending items + - `gauge.bucket.op.vb_pending_num_non_resident` (*gauge*)
Number of non resident vBuckets in the pending state for this bucket + - `gauge.bucket.op.vb_pending_ops_create` (*gauge*)
New items per second being inserted into pending vBuckets in this bucket and should be transient during rebalancing + - `gauge.bucket.op.vb_pending_ops_update` (*gauge*)
Number of items updated on pending vBucket per second for this bucket + - `gauge.bucket.op.vb_pending_queue_age` (*gauge*)
Sum of disk pending queue item age in milliseconds + - `gauge.bucket.op.vb_pending_queue_drain` (*gauge*)
Number of pending items per second being written to disk in this bucket and should be transient during rebalancing + - `gauge.bucket.op.vb_pending_queue_fill` (*gauge*)
Total enqueued pending items on disk queue + - `gauge.bucket.op.vb_pending_queue_size` (*gauge*)
Number of pending items waiting to be written to disk in this bucket and should be transient during rebalancing + - `gauge.bucket.op.vb_pending_resident_items_ratio` (*gauge*)
Number of resident pending items + - `gauge.bucket.op.vb_replica_curr_items` (*gauge*)
Number of in memory items + - `gauge.bucket.op.vb_replica_eject` (*gauge*)
Number of items per second being ejected to disk from replica vBuckets + - `gauge.bucket.op.vb_replica_itm_memory` (*gauge*)
Amount of replica user data cached in RAM in this bucket + - `gauge.bucket.op.vb_replica_meta_data_memory` (*gauge*)
Amount of replica item metadata consuming RAM in this bucket + - `gauge.bucket.op.vb_replica_num` (*gauge*)
Number of replica vBuckets + - `gauge.bucket.op.vb_replica_num_non_resident` (*gauge*)
Number of non resident vBuckets in the replica state for this bucket + - `gauge.bucket.op.vb_replica_ops_create` (*gauge*)
Number of replica create operations + - `gauge.bucket.op.vb_replica_ops_update` (*gauge*)
Number of items updated on replica vBucket per second for this bucket + - `gauge.bucket.op.vb_replica_queue_age` (*gauge*)
Sum of disk replica queue item age in milliseconds + - `gauge.bucket.op.vb_replica_queue_drain` (*gauge*)
Total drained replica items in the queue + - `gauge.bucket.op.vb_replica_queue_fill` (*gauge*)
Number of replica items per second being put on the replica item disk queue in this bucket + - `gauge.bucket.op.vb_replica_queue_size` (*gauge*)
Number of replica items in disk queue + - `gauge.bucket.op.vb_replica_resident_items_ratio` (*gauge*)
Percentage of replica items cached in RAM in this bucket. + - `gauge.bucket.op.vb_total_queue_age` (*gauge*)
Sum of disk queue item age in milliseconds + - `gauge.bucket.op.xdc_ops` (*gauge*)
Cross datacenter replication operations per second for this bucket + - `gauge.bucket.quota.ram` (*gauge*)
Amount of RAM used by the bucket (bytes) + - `gauge.bucket.quota.rawRAM` (*gauge*)
Amount of raw RAM used by the bucket (bytes) #### Group nodes All of the following metrics are part of the `nodes` metric group. All of the non-default metrics below can be turned on by adding `nodes` to the monitor config option `extraGroups`: - ***`gauge.nodes.cmd_get`*** (*gauge*)
Number of get commands - - ***`gauge.nodes.couch_docs_actual_disk_size`*** (*gauge*)
Amount of disk space used by Couch docs.(bytes) + - ***`gauge.nodes.couch_docs_actual_disk_size`*** (*gauge*)
Amount of disk space used by Couch docs (bytes) - ***`gauge.nodes.couch_docs_data_size`*** (*gauge*)
Data size of couch documents associated with a node (bytes) - `gauge.nodes.couch_spatial_data_size` (*gauge*)
Size of object data for spatial views (bytes) - - `gauge.nodes.couch_spatial_disk_size` (*gauge*)
Amount of disk space occupied by spatial views, in bytes. - - `gauge.nodes.couch_views_actual_disk_size` (*gauge*)
Amount of disk space occupied by Couch views (bytes). - - `gauge.nodes.couch_views_data_size` (*gauge*)
Size of object data for Couch views (bytes). + - `gauge.nodes.couch_spatial_disk_size` (*gauge*)
Amount of disk space occupied by spatial views, in bytes + - `gauge.nodes.couch_views_actual_disk_size` (*gauge*)
Amount of disk space occupied by Couch views (bytes) + - `gauge.nodes.couch_views_data_size` (*gauge*)
Size of object data for Couch views (bytes) - `gauge.nodes.curr_items` (*gauge*)
Number of current items - ***`gauge.nodes.curr_items_tot`*** (*gauge*)
Total number of items associated with node - ***`gauge.nodes.ep_bg_fetched`*** (*gauge*)
Number of disk fetches performed since server was started - `gauge.nodes.get_hits` (*gauge*)
Number of get hits - - `gauge.nodes.mcdMemoryAllocated` (*gauge*)
Amount of memcached memory allocated (bytes). - - `gauge.nodes.mcdMemoryReserved` (*gauge*)
Amount of memcached memory reserved (bytes). + - `gauge.nodes.mcdMemoryAllocated` (*gauge*)
Amount of memcached memory allocated (bytes) + - `gauge.nodes.mcdMemoryReserved` (*gauge*)
Amount of memcached memory reserved (bytes) - ***`gauge.nodes.mem_used`*** (*gauge*)
Memory used by the node (bytes) - - `gauge.nodes.memoryFree` (*gauge*)
Amount of memory free for the node (bytes). - - `gauge.nodes.memoryTotal` (*gauge*)
Total memory available to the node (bytes). + - `gauge.nodes.memoryFree` (*gauge*)
Amount of memory free for the node (bytes) + - `gauge.nodes.memoryTotal` (*gauge*)
Total memory available to the node (bytes) - ***`gauge.nodes.ops`*** (*gauge*)
Number of operations performed on Couchbase - ***`gauge.nodes.system.cpu_utilization_rate`*** (*gauge*)
The CPU utilization rate (%) - ***`gauge.nodes.system.mem_free`*** (*gauge*)
Free memory available to the node (bytes) @@ -164,14 +348,11 @@ monitor config option `extraGroups`: - `gauge.storage.ram.quotaUsed` (*gauge*)
Ram quota used by the cluster (bytes) - `gauge.storage.ram.quotaUsedPerNode` (*gauge*)
Ram quota used per node (bytes) - `gauge.storage.ram.total` (*gauge*)
Total ram available to cluster (bytes) - - `gauge.storage.ram.used` (*gauge*)
Ram used by the cluster (bytes) + - `gauge.storage.ram.used` (*gauge*)
RAM used by the cluster (bytes) - `gauge.storage.ram.usedByData` (*gauge*)
Ram used by the data in the cluster (bytes) ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -180,19 +361,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/collectd-cpu.md b/signalfx-agent/agent_docs/monitors/collectd-cpu.md index 8646d59e6..d58484601 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-cpu.md +++ b/signalfx-agent/agent_docs/monitors/collectd-cpu.md @@ -4,7 +4,7 @@ # collectd/cpu -Monitor Type: `collectd/cpu` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/cpu)) +Monitor Type: `collectd/cpu` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/cpu)) **Accepts Endpoints**: No @@ -12,6 +12,10 @@ Monitor Type: `collectd/cpu` ([Source](https://github.com/signalfx/signalfx-agen ## Overview +**This monitor is deprecated in favor of the `cpu` monitor. Please switch +to that monitor, as this monitor will be removed in a future agent +release.** + This monitor collects cpu usage data using the collectd `cpu` plugin. It aggregates the per-core CPU data into a single metric and sends it to the SignalFx Metadata plugin in collectd, where the @@ -64,9 +68,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -75,19 +76,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/collectd-cpufreq.md b/signalfx-agent/agent_docs/monitors/collectd-cpufreq.md index 7e62bb4e5..885a64901 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-cpufreq.md +++ b/signalfx-agent/agent_docs/monitors/collectd-cpufreq.md @@ -4,7 +4,7 @@ # collectd/cpufreq -Monitor Type: `collectd/cpufreq` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/cpufreq)) +Monitor Type: `collectd/cpufreq` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/cpufreq)) **Accepts Endpoints**: No @@ -42,7 +42,6 @@ This monitor emits all metrics by default; however, **none are categorized as -- they are all custom**. - - ***`cpufreq.<N>`*** (*gauge*)
The processor frequency in Hertz for the <N>th processor on the system. The agent does not do any built-in filtering of metrics coming out of this monitor. diff --git a/signalfx-agent/agent_docs/monitors/collectd-custom.md b/signalfx-agent/agent_docs/monitors/collectd-custom.md index 7255fc4d6..403dc4e23 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-custom.md +++ b/signalfx-agent/agent_docs/monitors/collectd-custom.md @@ -4,7 +4,7 @@ # collectd/custom -Monitor Type: `collectd/custom` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/custom)) +Monitor Type: `collectd/custom` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/custom)) **Accepts Endpoints**: **Yes** diff --git a/signalfx-agent/agent_docs/monitors/collectd-df.md b/signalfx-agent/agent_docs/monitors/collectd-df.md index bfc165c28..5b9f235fb 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-df.md +++ b/signalfx-agent/agent_docs/monitors/collectd-df.md @@ -4,7 +4,7 @@ # collectd/df -Monitor Type: `collectd/df` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/df)) +Monitor Type: `collectd/df` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/df)) **Accepts Endpoints**: No @@ -80,9 +80,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -91,19 +88,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. 
-### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/collectd-disk.md b/signalfx-agent/agent_docs/monitors/collectd-disk.md index 02307164b..9adb535dd 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-disk.md +++ b/signalfx-agent/agent_docs/monitors/collectd-disk.md @@ -4,7 +4,7 @@ # collectd/disk -Monitor Type: `collectd/disk` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/disk)) +Monitor Type: `collectd/disk` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/disk)) **Accepts Endpoints**: No @@ -67,9 +67,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -78,19 +75,5 @@ metrics do not need to be added to `extraMetrics`. 
To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/collectd-elasticsearch.md b/signalfx-agent/agent_docs/monitors/collectd-elasticsearch.md index 581d90d8d..14e448ea4 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-elasticsearch.md +++ b/signalfx-agent/agent_docs/monitors/collectd-elasticsearch.md @@ -4,7 +4,7 @@ # collectd/elasticsearch -Monitor Type: `collectd/elasticsearch` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/elasticsearch)) +Monitor Type: `collectd/elasticsearch` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/elasticsearch)) **Accepts Endpoints**: **Yes** @@ -240,9 +240,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` 
config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -251,19 +248,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/collectd-etcd.md b/signalfx-agent/agent_docs/monitors/collectd-etcd.md index 33bc01a5c..f73bf08d3 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-etcd.md +++ b/signalfx-agent/agent_docs/monitors/collectd-etcd.md @@ -4,7 +4,7 @@ # collectd/etcd -Monitor Type: `collectd/etcd` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/etcd)) +Monitor Type: `collectd/etcd` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/etcd)) **Accepts Endpoints**: **Yes** @@ -85,9 +85,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -96,19 +93,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/collectd-genericjmx.md b/signalfx-agent/agent_docs/monitors/collectd-genericjmx.md index 152bf658f..5eb24f891 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-genericjmx.md +++ b/signalfx-agent/agent_docs/monitors/collectd-genericjmx.md @@ -4,7 +4,7 @@ # collectd/genericjmx -Monitor Type: `collectd/genericjmx` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/genericjmx)) +Monitor Type: `collectd/genericjmx` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/genericjmx)) **Accepts Endpoints**: **Yes** @@ -172,6 +172,8 @@ Metrics that are categorized as [container/host](https://docs.splunk.com/observability/admin/subscription-usage/monitor-imm-billing-usage.html#about-custom-bundled-and-high-resolution-metrics) (*default*) are ***in bold and italics*** in the list below. +This monitor will also emit by default any metrics that are not listed below. + #### Group jvm All of the following metrics are part of the `jvm` metric group. 
All of @@ -188,9 +190,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -199,19 +198,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/collectd-hadoop.md b/signalfx-agent/agent_docs/monitors/collectd-hadoop.md index ac4f26fb8..8215827b8 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-hadoop.md +++ b/signalfx-agent/agent_docs/monitors/collectd-hadoop.md @@ -4,7 +4,7 @@ # collectd/hadoop -Monitor Type: `collectd/hadoop` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/hadoop)) +Monitor Type: `collectd/hadoop` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/hadoop)) **Accepts Endpoints**: **Yes** @@ -295,9 +295,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -306,19 +303,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/collectd-hadoopjmx.md b/signalfx-agent/agent_docs/monitors/collectd-hadoopjmx.md index fcbb0843a..d99d95b58 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-hadoopjmx.md +++ b/signalfx-agent/agent_docs/monitors/collectd-hadoopjmx.md @@ -4,7 +4,7 @@ # collectd/hadoopjmx -Monitor Type: `collectd/hadoopjmx` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/hadoopjmx)) +Monitor Type: `collectd/hadoopjmx` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/hadoopjmx)) **Accepts Endpoints**: **Yes** @@ -14,7 +14,7 @@ Monitor Type: `collectd/hadoopjmx` ([Source](https://github.com/signalfx/signalf Collects metrics about a Hadoop 2.0+ cluster using using collectd's GenericJMX plugin. 
You may also configure the -[collectd/hadoop](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/collectd-hadoop.md) +[collectd/hadoop](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/collectd-hadoop.md) monitor to collect additional metrics about the hadoop cluster from the REST API @@ -33,10 +33,10 @@ export YARN_RESOURCEMANAGER_OPTS="-Dcom.sun.management.jmxremote.ssl=false -Dcom ``` This monitor has a set of built in MBeans configured for: - - [Name Nodes](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/hadoopjmx/nameNodeMBeans.go) - - [Resource Manager](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/hadoopjmx/resourceManagerMBeans.go) - - [Node Manager](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/hadoopjmx/nodeManagerMBeans.go) - - [Data Nodes](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/hadoopjmx/dataNodeMBeans.go) + - [Name Nodes](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/hadoopjmx/nameNodeMBeans.go) + - [Resource Manager](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/hadoopjmx/resourceManagerMBeans.go) + - [Node Manager](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/hadoopjmx/nodeManagerMBeans.go) + - [Data Nodes](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/hadoopjmx/dataNodeMBeans.go) Sample YAML configuration: @@ -139,6 +139,8 @@ Metrics that are categorized as [container/host](https://docs.splunk.com/observability/admin/subscription-usage/monitor-imm-billing-usage.html#about-custom-bundled-and-high-resolution-metrics) (*default*) are ***in bold and italics*** in the list below. +This monitor will also emit by default any metrics that are not listed below. + #### Group data-node All of the following metrics are part of the `data-node` metric group. 
All of @@ -228,9 +230,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -239,19 +238,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/collectd-health-checker.md b/signalfx-agent/agent_docs/monitors/collectd-health-checker.md index e9f65cb14..37a0da0f8 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-health-checker.md +++ b/signalfx-agent/agent_docs/monitors/collectd-health-checker.md @@ -4,7 +4,7 @@ # collectd/health-checker -Monitor Type: `collectd/health-checker` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/healthchecker)) +Monitor Type: `collectd/health-checker` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/healthchecker)) **Accepts Endpoints**: **Yes** diff --git a/signalfx-agent/agent_docs/monitors/collectd-interface.md b/signalfx-agent/agent_docs/monitors/collectd-interface.md index 3b760a86f..d85c2a91b 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-interface.md +++ b/signalfx-agent/agent_docs/monitors/collectd-interface.md @@ -4,7 +4,7 @@ # collectd/interface -Monitor Type: `collectd/interface` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/netinterface)) +Monitor Type: `collectd/interface` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/netinterface)) **Accepts Endpoints**: No @@ -16,6 +16,12 @@ Collectd stats about network interfaces on the system by using the [collectd interface plugin](https://collectd.org/wiki/index.php/Plugin:Interface). +**This monitor is deprecated in favor of the `net-io` monitor. Please +switch to that monitor as this monitor will be removed in a future release +of the agent.** Note that the `net-io` monitor uses the `interface` +dimension to identify the network card instead of the `plugin_instance` +dimension, but otherwise the metrics are the same. 
+ ## Configuration @@ -57,9 +63,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -68,19 +71,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/collectd-jenkins.md b/signalfx-agent/agent_docs/monitors/collectd-jenkins.md index 6f27045b0..3ccdcc171 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-jenkins.md +++ b/signalfx-agent/agent_docs/monitors/collectd-jenkins.md @@ -4,7 +4,7 @@ # collectd/jenkins -Monitor Type: `collectd/jenkins` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/jenkins)) +Monitor Type: `collectd/jenkins` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/jenkins)) **Accepts Endpoints**: **Yes** @@ -16,9 +16,9 @@ Monitors jenkins by using the [jenkins collectd Python plugin](https://github.com/signalfx/collectd-jenkins), which collects metrics from Jenkins instances by hitting these endpoints: -[../api/json](https://wiki.jenkins.io/display/jenkins/remote+access+api) +[../api/json](https://www.jenkins.io/doc/book/using/remote-access-api/) (job metrics) and -[metrics/<MetricsKey>/..](https://wiki.jenkins.io/display/JENKINS/Metrics+Plugin) +[metrics/<MetricsKey>/..](https://plugins.jenkins.io/metrics/) (default and optional Codahale/Dropwizard JVM metrics). Requires Jenkins 1.580.3 or later, as well as the Jenkins Metrics Plugin (see Setup). @@ -87,6 +87,7 @@ Configuration](../monitor-config.html#common-configuration).** | `pythonBinary` | no | `string` | Path to a python binary that should be used to execute the Python code. If not set, a built-in runtime will be used. Can include arguments to the binary as well. | | `host` | **yes** | `string` | | | `port` | **yes** | `integer` | | +| `path` | no | `string` | | | `metricsKey` | **yes** | `string` | Key required for collecting metrics. The access key located at `Manage Jenkins > Configure System > Metrics > ADD.` If empty, click `Generate`. 
| | `enhancedMetrics` | no | `bool` | Whether to enable enhanced metrics (**default:** `false`) | | `excludeJobMetrics` | no | `bool` | Set to *true* to exclude job metrics retrieved from `/api/json` endpoint (**default:** `false`) | @@ -122,33 +123,7 @@ This monitor emits all metrics by default; however, **none are categorized as - ***`gauge.jenkins.node.vm.memory.heap.usage`*** (*gauge*)
Percent utilization of the heap memory - ***`gauge.jenkins.node.vm.memory.non-heap.used`*** (*gauge*)
Total amount of non-heap memory used - ***`gauge.jenkins.node.vm.memory.total.used`*** (*gauge*)
Total Memory used by instance - -### Non-default metrics (version 4.7.0+) - -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - -To emit metrics that are not _default_, you can add those metrics in the -generic monitor-level `extraMetrics` config option. Metrics that are derived -from specific configuration options that do not appear in the above list of -metrics do not need to be added to `extraMetrics`. - -To see a list of metrics that will be emitted you can run `agent-status -monitors` after configuring this monitor in a running agent instance. - -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - +The agent does not do any built-in filtering of metrics coming out of this +monitor. 
diff --git a/signalfx-agent/agent_docs/monitors/collectd-kafka.md b/signalfx-agent/agent_docs/monitors/collectd-kafka.md index c70903812..338075ee3 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-kafka.md +++ b/signalfx-agent/agent_docs/monitors/collectd-kafka.md @@ -4,7 +4,7 @@ # collectd/kafka -Monitor Type: `collectd/kafka` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/kafka)) +Monitor Type: `collectd/kafka` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/kafka)) **Accepts Endpoints**: **Yes** @@ -19,7 +19,7 @@ how to configure custom MBeans, as well as information on troubleshooting JMX setup. This monitor has a set of [built in MBeans -configured](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/kafka/mbeans.go) +configured](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/kafka/mbeans.go) for which it pulls metrics from Kafka's JMX endpoint. Note that this monitor supports Kafka v0.8.2.x and above. For Kafka v1.x.x and above, @@ -94,6 +94,8 @@ Metrics that are categorized as [container/host](https://docs.splunk.com/observability/admin/subscription-usage/monitor-imm-billing-usage.html#about-custom-bundled-and-high-resolution-metrics) (*default*) are ***in bold and italics*** in the list below. +This monitor will also emit by default any metrics that are not listed below. + - ***`counter.kafka-bytes-in`*** (*cumulative*)
Number of bytes received per second across all topics - ***`counter.kafka-bytes-out`*** (*cumulative*)
Number of bytes transmitted per second across all topics @@ -135,9 +137,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -146,19 +145,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/collectd-kafka_consumer.md b/signalfx-agent/agent_docs/monitors/collectd-kafka_consumer.md index 55795497a..6506e3240 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-kafka_consumer.md +++ b/signalfx-agent/agent_docs/monitors/collectd-kafka_consumer.md @@ -4,7 +4,7 @@ # collectd/kafka_consumer -Monitor Type: `collectd/kafka_consumer` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/kafkaconsumer)) +Monitor Type: `collectd/kafka_consumer` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/kafkaconsumer)) **Accepts Endpoints**: **Yes** @@ -15,7 +15,7 @@ Monitor Type: `collectd/kafka_consumer` ([Source](https://github.com/signalfx/si Monitors a Java based Kafka consumer using [collectd's GenericJMX plugin](./collectd-genericjmx.md). This monitor has a set of [built in MBeans -configured](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/kafkaconsumer/mbeans.go) +configured](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/kafkaconsumer/mbeans.go) for which it pulls metrics from the Kafka consumer's JMX endpoint. Sample YAML configuration: @@ -33,7 +33,7 @@ Also, per-topic metrics that are collected by default are not available through v0.9.0.0 which can cause the logs to flood with warnings related to the MBean not being found. Use the `mBeansToOmit` config option in such cases. The above example configuration will not attempt to collect the MBean referenced by `fetch-size-avg-per-topic`. Here is a -[list](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/kafkaconsumer/mbeans.go) +[list](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/kafkaconsumer/mbeans.go) of metrics collected by default. 
@@ -98,6 +98,8 @@ Metrics that are categorized as [container/host](https://docs.splunk.com/observability/admin/subscription-usage/monitor-imm-billing-usage.html#about-custom-bundled-and-high-resolution-metrics) (*default*) are ***in bold and italics*** in the list below. +This monitor will also emit by default any metrics that are not listed below. + - ***`gauge.kafka.consumer.bytes-consumed-rate`*** (*gauge*)
Average number of bytes consumed per second. This metric has either client-id dimension or, both client-id and topic dimensions. The former is an aggregate across all topics of the latter. - ***`gauge.kafka.consumer.fetch-rate`*** (*gauge*)
Number of records consumed per second. @@ -120,9 +122,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -131,19 +130,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/collectd-kafka_producer.md b/signalfx-agent/agent_docs/monitors/collectd-kafka_producer.md index d6d87dbb1..8cd0fdc9e 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-kafka_producer.md +++ b/signalfx-agent/agent_docs/monitors/collectd-kafka_producer.md @@ -4,7 +4,7 @@ # collectd/kafka_producer -Monitor Type: `collectd/kafka_producer` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/kafkaproducer)) +Monitor Type: `collectd/kafka_producer` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/kafkaproducer)) **Accepts Endpoints**: **Yes** @@ -15,7 +15,7 @@ Monitor Type: `collectd/kafka_producer` ([Source](https://github.com/signalfx/si Monitors a Java based Kafka producer using GenericJMX. This monitor has a set of [built in MBeans -configured](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/kafkaproducer/mbeans.go) +configured](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/kafkaproducer/mbeans.go) for which it pulls metrics from the Kafka producer's JMX endpoint. Sample YAML configuration: @@ -90,6 +90,8 @@ Metrics that are categorized as [container/host](https://docs.splunk.com/observability/admin/subscription-usage/monitor-imm-billing-usage.html#about-custom-bundled-and-high-resolution-metrics) (*default*) are ***in bold and italics*** in the list below. +This monitor will also emit by default any metrics that are not listed below. + - ***`gauge.kafka.producer.byte-rate`*** (*gauge*)
Average number of bytes sent per second for a topic. This metric has client-id and topic dimensions. - ***`gauge.kafka.producer.compression-rate`*** (*gauge*)
Average compression rate of record batches for a topic. This metric has client-id and topic dimensions. @@ -117,9 +119,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -128,19 +127,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/collectd-kong.md b/signalfx-agent/agent_docs/monitors/collectd-kong.md index a411bfe80..d9a1e24f1 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-kong.md +++ b/signalfx-agent/agent_docs/monitors/collectd-kong.md @@ -4,7 +4,7 @@ # collectd/kong -Monitor Type: `collectd/kong` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/kong)) +Monitor Type: `collectd/kong` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/kong)) **Accepts Endpoints**: **Yes** @@ -56,7 +56,7 @@ This plugin requires: | Software | Version | |-------------------|----------------| -| Kong | 0.11.2+ | +| Kong Community Edition (CE) | 0.11.2+ | | Configured [kong-plugin-signalfx](https://github.com/signalfx/kong-plugin-signalfx) | 0.0.1+ | @@ -77,7 +77,7 @@ monitors: report: false ``` -Sample YAML configuration with custom /signalfx route and white and blacklists +Sample YAML configuration with custom /signalfx route and filter lists ```yaml monitors: @@ -197,9 +197,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -208,19 +205,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. 
If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/collectd-load.md b/signalfx-agent/agent_docs/monitors/collectd-load.md index d87406705..a763d9dca 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-load.md +++ b/signalfx-agent/agent_docs/monitors/collectd-load.md @@ -4,7 +4,7 @@ # collectd/load -Monitor Type: `collectd/load` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/load)) +Monitor Type: `collectd/load` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/load)) **Accepts Endpoints**: No @@ -44,15 +44,12 @@ Metrics that are categorized as (*default*) are ***in bold and italics*** in the list below. - - ***`load.longterm`*** (*gauge*)
Average CPU load per core over the last 15 minutes - - ***`load.midterm`*** (*gauge*)
Average CPU load per core over the last five minutes - - ***`load.shortterm`*** (*gauge*)
Average CPU load per core over the last one minute + - ***`load.longterm`*** (*gauge*)
Average CPU load for the whole system over the last 15 minutes + - ***`load.midterm`*** (*gauge*)
Average CPU load for the whole system over the last five minutes + - ***`load.shortterm`*** (*gauge*)
Average CPU load for the whole system over the last one minute ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -61,19 +58,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/collectd-marathon.md b/signalfx-agent/agent_docs/monitors/collectd-marathon.md index cdd0f748b..aef6f35bd 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-marathon.md +++ b/signalfx-agent/agent_docs/monitors/collectd-marathon.md @@ -4,7 +4,7 @@ # collectd/marathon -Monitor Type: `collectd/marathon` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/marathon)) +Monitor Type: `collectd/marathon` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/marathon)) **Accepts Endpoints**: **Yes** @@ -75,51 +75,25 @@ This monitor emits all metrics by default; however, **none are categorized as -- they are all custom**. - - ***`gauge.marathon.app.cpu.allocated`*** (*gauge*)
Number of CPUs allocated to an application - - ***`gauge.marathon.app.cpu.allocated.per.instance`*** (*gauge*)
Configured number of CPUs allocated to each application instance - - `gauge.marathon.app.delayed` (*gauge*)
Indicates if the application is delayed or not - - `gauge.marathon.app.deployments.total` (*gauge*)
Number of application deployments - - ***`gauge.marathon.app.disk.allocated`*** (*gauge*)
Storage allocated to a Marathon application - - ***`gauge.marathon.app.disk.allocated.per.instance`*** (*gauge*)
Configured storage allocated each to application instance - - `gauge.marathon.app.gpu.allocated` (*gauge*)
GPU Allocated to a Marathon application - - `gauge.marathon.app.gpu.allocated.per.instance` (*gauge*)
Configured number of GPUs allocated to each application instance - - ***`gauge.marathon.app.instances.total`*** (*gauge*)
Number of application instances - - ***`gauge.marathon.app.memory.allocated`*** (*gauge*)
Memory Allocated to a Marathon application - - ***`gauge.marathon.app.memory.allocated.per.instance`*** (*gauge*)
Configured amount of memory allocated to each application instance - - ***`gauge.marathon.app.tasks.running`*** (*gauge*)
Number tasks running for an application - - ***`gauge.marathon.app.tasks.staged`*** (*gauge*)
Number tasks staged for an application - - ***`gauge.marathon.app.tasks.unhealthy`*** (*gauge*)
Number unhealthy tasks for an application - - ***`gauge.marathon.task.healthchecks.failing.total`*** (*gauge*)
The number of failing health checks for a task - - ***`gauge.marathon.task.healthchecks.passing.total`*** (*gauge*)
The number of passing health checks for a task - - `gauge.marathon.task.staged.time.elapsed` (*gauge*)
The amount of time the task spent in staging - - `gauge.marathon.task.start.time.elapsed` (*gauge*)
Time elapsed since the task started - -### Non-default metrics (version 4.7.0+) - -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - -To emit metrics that are not _default_, you can add those metrics in the -generic monitor-level `extraMetrics` config option. Metrics that are derived -from specific configuration options that do not appear in the above list of -metrics do not need to be added to `extraMetrics`. - -To see a list of metrics that will be emitted you can run `agent-status -monitors` after configuring this monitor in a running agent instance. - -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - + - ***`gauge.service.mesosphere.marathon.app.cpu.allocated`*** (*gauge*)
Number of CPUs allocated to an application + - ***`gauge.service.mesosphere.marathon.app.cpu.allocated.per.instance`*** (*gauge*)
Configured number of CPUs allocated to each application instance + - ***`gauge.service.mesosphere.marathon.app.delayed`*** (*gauge*)
Indicates if the application is delayed or not + - ***`gauge.service.mesosphere.marathon.app.deployments.total`*** (*gauge*)
Number of application deployments + - ***`gauge.service.mesosphere.marathon.app.disk.allocated`*** (*gauge*)
Storage allocated to a Marathon application + - ***`gauge.service.mesosphere.marathon.app.disk.allocated.per.instance`*** (*gauge*)
Configured storage allocated to each application instance + - ***`gauge.service.mesosphere.marathon.app.gpu.allocated`*** (*gauge*)
GPUs allocated to a Marathon application + - ***`gauge.service.mesosphere.marathon.app.gpu.allocated.per.instance`*** (*gauge*)
Configured number of GPUs allocated to each application instance + - ***`gauge.service.mesosphere.marathon.app.instances.total`*** (*gauge*)
Number of application instances + - ***`gauge.service.mesosphere.marathon.app.memory.allocated`*** (*gauge*)
Memory allocated to a Marathon application + - ***`gauge.service.mesosphere.marathon.app.memory.allocated.per.instance`*** (*gauge*)
Configured amount of memory allocated to each application instance + - ***`gauge.service.mesosphere.marathon.app.tasks.running`*** (*gauge*)
Number of tasks running for an application + - ***`gauge.service.mesosphere.marathon.app.tasks.staged`*** (*gauge*)
Number of tasks staged for an application + - ***`gauge.service.mesosphere.marathon.app.tasks.unhealthy`*** (*gauge*)
Number of unhealthy tasks for an application + - ***`gauge.service.mesosphere.marathon.task.healthchecks.failing.total`*** (*gauge*)
The number of failing health checks for a task + - ***`gauge.service.mesosphere.marathon.task.healthchecks.passing.total`*** (*gauge*)
The number of passing health checks for a task + - ***`gauge.service.mesosphere.marathon.task.staged.time.elapsed`*** (*gauge*)
The amount of time the task spent in staging + - ***`gauge.service.mesosphere.marathon.task.start.time.elapsed`*** (*gauge*)
Time elapsed since the task started +The agent does not do any built-in filtering of metrics coming out of this +monitor. diff --git a/signalfx-agent/agent_docs/monitors/collectd-memcached.md b/signalfx-agent/agent_docs/monitors/collectd-memcached.md index b8823a826..67a2728c0 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-memcached.md +++ b/signalfx-agent/agent_docs/monitors/collectd-memcached.md @@ -4,7 +4,7 @@ # collectd/memcached -Monitor Type: `collectd/memcached` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/memcached)) +Monitor Type: `collectd/memcached` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/memcached)) **Accepts Endpoints**: **Yes** @@ -85,9 +85,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -96,19 +93,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/collectd-memory.md b/signalfx-agent/agent_docs/monitors/collectd-memory.md index d2c2c7221..156ea59df 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-memory.md +++ b/signalfx-agent/agent_docs/monitors/collectd-memory.md @@ -4,7 +4,7 @@ # collectd/memory -Monitor Type: `collectd/memory` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/memory)) +Monitor Type: `collectd/memory` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/memory)) **Accepts Endpoints**: No @@ -49,9 +49,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -60,19 +57,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. 
If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/collectd-mongodb.md b/signalfx-agent/agent_docs/monitors/collectd-mongodb.md index 78e9e71c5..0fd981794 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-mongodb.md +++ b/signalfx-agent/agent_docs/monitors/collectd-mongodb.md @@ -4,7 +4,7 @@ # collectd/mongodb -Monitor Type: `collectd/mongodb` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/mongodb)) +Monitor Type: `collectd/mongodb` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/mongodb)) **Accepts Endpoints**: **Yes** @@ -109,7 +109,7 @@ Metrics that are categorized as - `counter.asserts.regular` (*cumulative*)
The number of regular assertions raised since the MongoDB process started. Check the log file for more information about these messages. - `counter.asserts.warning` (*cumulative*)
In MongoDB 3.x and earlier, the field returns the number of warnings raised since the MongoDB process started. In MongoDB 4, this is always 0. - - ***`counter.backgroundFlushing.flushes`*** (*gauge*)
Number of times the database has been flushed + - ***`counter.backgroundFlushing.flushes`*** (*gauge*)
Number of times the database has been flushed. Only available when MMAPv1 is enabled. (MMAPv1 is not supported in MongoDB version > 4.2) - ***`counter.extra_info.page_faults`*** (*gauge*)
Mongod page faults - `counter.lock.Database.acquireCount.intentExclusive` (*cumulative*)
- `counter.lock.Database.acquireCount.intentShared` (*cumulative*)
@@ -133,16 +133,19 @@ Metrics that are categorized as - `counter.opcountersRepl.insert` (*cumulative*)
Number of replicated inserts since last restart - `counter.opcountersRepl.query` (*cumulative*)
Number of replicated queries since last restart - `counter.opcountersRepl.update` (*cumulative*)
Number of replicated updates since last restart - - ***`gauge.backgroundFlushing.average_ms`*** (*gauge*)
Average time (ms) to write data to disk - - ***`gauge.backgroundFlushing.last_ms`*** (*gauge*)
Most recent time (ms) spent writing data to disk + - ***`gauge.backgroundFlushing.average_ms`*** (*gauge*)
Average time (ms) to write data to disk. Only available when MMAPv1 is enabled. (MMAPv1 is not supported in MongoDB version > 4.2) + - ***`gauge.backgroundFlushing.last_ms`*** (*gauge*)
Most recent time (ms) spent writing data to disk. Only available when MMAPv1 is enabled. (MMAPv1 is not supported in MongoDB version > 4.2) - `gauge.collection.max` (*gauge*)
Maximum number of documents in a capped collection - `gauge.collection.maxSize` (*gauge*)
Maximum disk usage of a capped collection - `gauge.collections` (*gauge*)
Number of collections - - `gauge.connections.available` (*gauge*)
Number of available incoming connections - - ***`gauge.connections.current`*** (*gauge*)
Number of current client connections + - `gauge.connections.available` (*gauge*)
The number of unused incoming connections available. Consider this value + in combination with the value of `gauge.connections.current` to + understand the connection load on the database. + + - ***`gauge.connections.current`*** (*gauge*)
The number of incoming connections from clients to the database server. - `gauge.connections.totalCreated` (*cumulative*)
Count of all incoming connections created to the server. This number includes connections that have since closed. - ***`gauge.dataSize`*** (*gauge*)
Total size of data, in bytes - - ***`gauge.extra_info.heap_usage_bytes`*** (*gauge*)
Heap size used by the mongod process, in bytes + - ***`gauge.extra_info.heap_usage_bytes`*** (*gauge*)
Heap size used by the mongod process, in bytes. Deprecated in MongoDB versions later than 3.3; use `gauge.tcmalloc.generic.heap_size` instead. - ***`gauge.globalLock.activeClients.readers`*** (*gauge*)
Number of active client connections performing reads - `gauge.globalLock.activeClients.total` (*gauge*)
Total number of active client connections - ***`gauge.globalLock.activeClients.writers`*** (*gauge*)
Number of active client connections performing writes @@ -151,12 +154,19 @@ Metrics that are categorized as - ***`gauge.globalLock.currentQueue.writers`*** (*gauge*)
Write operations currently in queue - ***`gauge.indexSize`*** (*gauge*)
Total size of indexes, in bytes - `gauge.indexes` (*gauge*)
Number of indexes across all collections - - ***`gauge.mem.mapped`*** (*gauge*)
Mongodb mapped memory usage, in MB + - ***`gauge.mem.mapped`*** (*gauge*)
MongoDB mapped memory usage, in MB. Only available when MMAPv1 is enabled. (MMAPv1 is not supported in MongoDB version > 4.2) - ***`gauge.mem.resident`*** (*gauge*)
MongoDB resident memory usage, in MB - ***`gauge.mem.virtual`*** (*gauge*)
MongoDB virtual memory usage, in MB - `gauge.numExtents` (*gauge*)
- ***`gauge.objects`*** (*gauge*)
Number of documents across all collections + - ***`gauge.repl.active_nodes`*** (*gauge*)
Number of healthy members in a replica set (reporting 1 for [health](https://docs.mongodb.com/manual/reference/command/replSetGetStatus/#replSetGetStatus.members[n].health)). + - ***`gauge.repl.is_primary_node`*** (*gauge*)
Reports 1 when the member [state](https://docs.mongodb.com/manual/reference/command/replSetGetStatus/#replSetGetStatus.members[n].stateStr) of the replica set is `PRIMARY`, and 2 otherwise. + - ***`gauge.repl.max_lag`*** (*gauge*)
Replica lag in seconds calculated from the difference between the + timestamp of the last oplog entry of primary and secondary [see mongo + doc](https://docs.mongodb.com/manual/reference/command/replSetGetStatus/#replSetGetStatus.members[n].optimeDate). + - ***`gauge.storageSize`*** (*gauge*)
Total bytes allocated to collections for document storage + - ***`gauge.tcmalloc.generic.heap_size`*** (*gauge*)
Heap size used by the mongod process, in bytes. Same as gauge.extra_info.heap_usage_bytes but supports 64-bit values. - ***`gauge.uptime`*** (*counter*)
Uptime of this server in milliseconds #### Group collection @@ -195,9 +205,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -206,20 +213,6 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - ## Dimensions The following dimensions may occur on metrics emitted by this monitor. 
Some diff --git a/signalfx-agent/agent_docs/monitors/collectd-mysql.md b/signalfx-agent/agent_docs/monitors/collectd-mysql.md index 40e0c5e78..85c2b9e8e 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-mysql.md +++ b/signalfx-agent/agent_docs/monitors/collectd-mysql.md @@ -4,7 +4,7 @@ # collectd/mysql -Monitor Type: `collectd/mysql` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/mysql)) +Monitor Type: `collectd/mysql` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/mysql)) **Accepts Endpoints**: **Yes** @@ -42,6 +42,10 @@ databases being monitored, you can specify that in the top-level `username`/`password` options, otherwise they can be specified at the database level. + +### InnoDB metrics +To enable InnoDB metrics (by setting `innodbStats` to `true`), make sure that +the monitoring user has been granted the `PROCESS` privilege. ### Example Config @@ -87,6 +91,7 @@ Configuration](../monitor-config.html#common-configuration).** | `username` | no | `string` | These credentials serve as defaults for all databases if not overridden | | `password` | no | `string` | | | `reportHost` | no | `bool` | A SignalFx extension to the plugin that allows us to disable the normal behavior of the MySQL collectd plugin where the `host` dimension is set to the hostname of the MySQL database server. When `false` (the recommended and default setting), the globally configured `hostname` config is used instead. (**default:** `false`) | +| `innodbStats` | no | `bool` | (**default:** `false`) | The **nested** `databases` config object has the following fields: @@ -112,6 +117,21 @@ Metrics that are categorized as - `cache_result.qcache-not_cached` (*cumulative*)
The number of MySQL queries that were not cacheable or not cached. - `cache_result.qcache-prunes` (*cumulative*)
The number of queries that were pruned from query cache because of low-memory condition. - ***`cache_size.qcache`*** (*gauge*)
The number of queries in MySQL query cache. + - `mysql_bpool_bytes.data` (*gauge*)
The total number of bytes in the InnoDB buffer pool containing data. The number includes both dirty and clean pages. + - `mysql_bpool_bytes.dirty` (*gauge*)
The total current number of bytes held in dirty pages in the InnoDB buffer pool. + - `mysql_bpool_counters.pages_flushed` (*cumulative*)
The number of requests to flush pages from the InnoDB buffer pool. + - `mysql_bpool_counters.read_ahead` (*cumulative*)
The number of pages read into the InnoDB buffer pool by the read-ahead background thread. + - `mysql_bpool_counters.read_ahead_evicted` (*cumulative*)
The number of pages read into the InnoDB buffer pool by the read-ahead background thread that were subsequently evicted without having been accessed by queries. + - `mysql_bpool_counters.read_ahead_rnd` (*cumulative*)
The number of “random” read-aheads initiated by InnoDB. This happens when a query scans a large portion of a table but in random order. + - `mysql_bpool_counters.read_requests` (*cumulative*)
The number of logical read requests. + - `mysql_bpool_counters.reads` (*cumulative*)
The number of logical reads that InnoDB could not satisfy from the buffer pool, and had to read directly from disk. + - `mysql_bpool_counters.wait_free` (*cumulative*)
Normally, writes to the InnoDB buffer pool happen in the background. When InnoDB needs to read or create a page and no clean pages are available, InnoDB flushes some dirty pages first and waits for that operation to finish. This counter counts instances of these waits. + - `mysql_bpool_counters.write_requests` (*cumulative*)
The number of writes done to the InnoDB buffer pool. + - `mysql_bpool_pages.data` (*gauge*)
The number of pages in the InnoDB buffer pool containing data. The number includes both dirty and clean pages. + - `mysql_bpool_pages.dirty` (*gauge*)
The current number of dirty pages in the InnoDB buffer pool. + - `mysql_bpool_pages.free` (*gauge*)
The number of free pages in the InnoDB buffer pool. + - `mysql_bpool_pages.misc` (*gauge*)
The number of pages in the InnoDB buffer pool that are busy because they have been allocated for administrative overhead, such as row locks or the adaptive hash index. + - `mysql_bpool_pages.total` (*gauge*)
The total size of the InnoDB buffer pool, in pages. - `mysql_commands.admin_commands` (*cumulative*)
The number of MySQL ADMIN commands executed - `mysql_commands.alter_db` (*cumulative*)
The number of MySQL ALTER DB commands executed - `mysql_commands.alter_db_upgrade` (*cumulative*)
The number of MySQL ALTER DB UPGRADE commands executed @@ -261,6 +281,27 @@ Metrics that are categorized as - `mysql_handler.savepoint_rollback` (*cumulative*)
The number of requests to roll back to a savepoint. - `mysql_handler.update` (*cumulative*)
The number of requests to update a row in a table. - `mysql_handler.write` (*cumulative*)
The number of requests to insert a row in a table. + - `mysql_innodb_data.fsyncs` (*cumulative*)
The number of fsync() operations so far. + - `mysql_innodb_data.read` (*cumulative*)
The amount of data read since the server was started (in bytes). + - `mysql_innodb_data.reads` (*cumulative*)
The total number of data reads (OS file reads). + - `mysql_innodb_data.writes` (*cumulative*)
The total number of data writes. + - `mysql_innodb_data.written` (*cumulative*)
The amount of data written so far, in bytes. + - `mysql_innodb_dblwr.writes` (*cumulative*)
The number of doublewrite operations that have been performed. + - `mysql_innodb_dblwr.written` (*cumulative*)
The number of pages that have been written to the doublewrite buffer. + - `mysql_innodb_log.fsyncs` (*cumulative*)
The number of fsync() writes done to the InnoDB redo log files. + - `mysql_innodb_log.waits` (*cumulative*)
The number of times that the log buffer was too small and a wait was required for it to be flushed before continuing. + - `mysql_innodb_log.write_requests` (*cumulative*)
The number of write requests for the InnoDB redo log. + - `mysql_innodb_log.writes` (*cumulative*)
The number of physical writes to the InnoDB redo log file. + - `mysql_innodb_log.written` (*cumulative*)
The number of bytes written to the InnoDB redo log files. + - `mysql_innodb_pages.created` (*cumulative*)
The number of pages created by operations on InnoDB tables. + - `mysql_innodb_pages.read` (*cumulative*)
The number of pages read from the InnoDB buffer pool by operations on InnoDB tables. + - `mysql_innodb_pages.written` (*cumulative*)
The number of pages written by operations on InnoDB tables. + - `mysql_innodb_row_lock.time` (*cumulative*)
The total time spent in acquiring row locks for InnoDB tables, in milliseconds. + - `mysql_innodb_row_lock.waits` (*cumulative*)
The number of times operations on InnoDB tables had to wait for a row lock. + - `mysql_innodb_rows.deleted` (*cumulative*)
The number of rows deleted from InnoDB tables. + - `mysql_innodb_rows.inserted` (*cumulative*)
The number of rows inserted into InnoDB tables. + - `mysql_innodb_rows.read` (*cumulative*)
The number of rows read from InnoDB tables. + - `mysql_innodb_rows.updated` (*cumulative*)
The number of rows updated in InnoDB tables. - ***`mysql_locks.immediate`*** (*cumulative*)
The number of MySQL table locks which were granted immediately. - ***`mysql_locks.waited`*** (*cumulative*)
The number of MySQL table locks which had to wait before being granted. - ***`mysql_octets.rx`*** (*cumulative*)
The number of bytes received by MySQL server from all clients. @@ -282,9 +323,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -293,19 +331,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
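For reviewers of the `collectd/mysql` changes above, a minimal sketch of how the new `innodbStats` option might be combined with `extraMetrics`. The host, port, credentials, and database name are hypothetical placeholders, and the `name` field under `databases` is an assumption because the nested fields are not shown in this diff. The two metric names are taken from the non-default InnoDB metrics listed above; whether they need to be listed explicitly depends on how the agent's filtering is configured.

```yaml
monitors:
  - type: collectd/mysql
    host: 127.0.0.1              # hypothetical address of the MySQL server
    port: 3306                   # hypothetical port
    username: monitor_user       # hypothetical user; needs the PROCESS privilege for InnoDB stats
    password: example-password   # hypothetical credential
    databases:
      - name: exampledb          # hypothetical database name (field name assumed)
    innodbStats: true            # option added in this diff (default: false)
    extraMetrics:                # InnoDB metrics are not default, so they are requested here
      - mysql_bpool_pages.dirty
      - mysql_innodb_row_lock.waits
```

After applying a config like this, `agent-status monitors` on a running agent should show which metrics are actually being emitted.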
- diff --git a/signalfx-agent/agent_docs/monitors/collectd-nginx.md b/signalfx-agent/agent_docs/monitors/collectd-nginx.md index d1ec99229..424640983 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-nginx.md +++ b/signalfx-agent/agent_docs/monitors/collectd-nginx.md @@ -4,7 +4,7 @@ # collectd/nginx -Monitor Type: `collectd/nginx` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/nginx)) +Monitor Type: `collectd/nginx` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/nginx)) **Accepts Endpoints**: **Yes** @@ -73,9 +73,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -84,19 +81,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/collectd-openstack.md b/signalfx-agent/agent_docs/monitors/collectd-openstack.md index 9fdd4531f..b4a3b58dc 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-openstack.md +++ b/signalfx-agent/agent_docs/monitors/collectd-openstack.md @@ -4,7 +4,7 @@ # collectd/openstack -Monitor Type: `collectd/openstack` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/openstack)) +Monitor Type: `collectd/openstack` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/openstack)) **Accepts Endpoints**: No @@ -45,6 +45,21 @@ monitors: authURL: "http://192.168.11.111/identity/v3" username: "admin" password: "secret" + requestBatchSize: 10 + novaListServersSearchOpts: + all_tenants: "TRUE" + status: "ACTIVE" +``` +### Example config using skipVerify and disabling querying server metrics +```yaml +monitors: +- type: collectd/openstack + authURL: "https://192.168.11.111/identity/v3" + username: "admin" + password: "secret" + skipVerify: true + queryServerMetrics: false + queryHypervisorMetrics: false ``` @@ -71,7 +86,14 @@ Configuration](../monitor-config.html#common-configuration).** | `password` | **yes** | `string` | Password to authenticate with keystone identity | | `projectName` | no | `string` | Specify the name of Project to be monitored (**default**:"demo") | | `projectDomainID` | no | 
`string` | The project domain (**default**:"default") | +| `regionName` | no | `string` | The region name for URL discovery; defaults to the first region if multiple regions are available. | | `userDomainID` | no | `string` | The user domain id (**default**:"default") | +| `skipVerify` | no | `bool` | Skip SSL certificate validation (**default:** `false`) | +| `httpTimeout` | no | `float64` | The HTTP client timeout in seconds for all requests (**default:** `0`) | +| `requestBatchSize` | no | `integer` | The maximum number of concurrent requests for each metric class (**default:** `5`) | +| `queryServerMetrics` | no | `bool` | Whether to query server metrics (useful to disable for TripleO Undercloud) (**default:** `true`) | +| `queryHypervisorMetrics` | no | `bool` | Whether to query hypervisor metrics (useful to disable for TripleO Undercloud) (**default:** `true`) | +| `novaListServersSearchOpts` | no | `map of strings` | Optional search_opts mapping for collectd-openstack Nova client servers.list(search_opts=novaListServersSearchOpts). For more information see https://docs.openstack.org/api-ref/compute/#list-servers. | ## Metrics @@ -140,9 +162,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -151,20 +170,6 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0.
If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - ## Dimensions The following dimensions may occur on metrics emitted by this monitor. Some diff --git a/signalfx-agent/agent_docs/monitors/collectd-postgresql.md b/signalfx-agent/agent_docs/monitors/collectd-postgresql.md index 6db8d10de..b56ebfbd3 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-postgresql.md +++ b/signalfx-agent/agent_docs/monitors/collectd-postgresql.md @@ -4,7 +4,7 @@ # collectd/postgresql -Monitor Type: `collectd/postgresql` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/postgresql)) +Monitor Type: `collectd/postgresql` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/postgresql)) **Accepts Endpoints**: **Yes** @@ -172,9 +172,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -183,19 +180,5 @@ metrics do not need to be added to `extraMetrics`. 
To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/collectd-processes.md b/signalfx-agent/agent_docs/monitors/collectd-processes.md index a3807dc40..859bfbe92 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-processes.md +++ b/signalfx-agent/agent_docs/monitors/collectd-processes.md @@ -4,7 +4,7 @@ # collectd/processes -Monitor Type: `collectd/processes` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/processes)) +Monitor Type: `collectd/processes` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/processes)) **Accepts Endpoints**: No @@ -70,7 +70,6 @@ This monitor emits all metrics by default; however, **none are categorized as -- they are all custom**. - - ***`disk_octets.read`*** (*cumulative*)
- ***`disk_octets.write`*** (*cumulative*)
- ***`fork_rate`*** (*cumulative*)
diff --git a/signalfx-agent/agent_docs/monitors/collectd-protocols.md b/signalfx-agent/agent_docs/monitors/collectd-protocols.md index faa0116ae..de5b4bbf1 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-protocols.md +++ b/signalfx-agent/agent_docs/monitors/collectd-protocols.md @@ -4,7 +4,7 @@ # collectd/protocols -Monitor Type: `collectd/protocols` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/protocols)) +Monitor Type: `collectd/protocols` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/protocols)) **Accepts Endpoints**: No @@ -52,9 +52,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -63,19 +60,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/collectd-python.md b/signalfx-agent/agent_docs/monitors/collectd-python.md index d2da75894..c812186d4 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-python.md +++ b/signalfx-agent/agent_docs/monitors/collectd-python.md @@ -4,7 +4,7 @@ # collectd/python -Monitor Type: `collectd/python` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/python)) +Monitor Type: `collectd/python` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/python)) **Accepts Endpoints**: **Yes** diff --git a/signalfx-agent/agent_docs/monitors/collectd-rabbitmq.md b/signalfx-agent/agent_docs/monitors/collectd-rabbitmq.md index 18e6e3a0d..12cbf24ff 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-rabbitmq.md +++ b/signalfx-agent/agent_docs/monitors/collectd-rabbitmq.md @@ -4,7 +4,7 @@ # collectd/rabbitmq -Monitor Type: `collectd/rabbitmq` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/rabbitmq)) +Monitor Type: `collectd/rabbitmq` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/rabbitmq)) **Accepts Endpoints**: **Yes** @@ -54,6 +54,12 @@ Configuration](../monitor-config.html#common-configuration).** | `verbosityLevel` | no | `string` | | | `username` | **yes** | `string` | | | `password` | **yes** | `string` | | +| `useHTTPS` | 
no | `bool` | Whether to enable HTTPS. (**default:** `false`) | +| `sslCACertFile` | no | `string` | Path to the SSL/TLS certificate file of the root Certificate Authorities implicitly trusted by this monitor. | +| `sslCertFile` | no | `string` | Path to this monitor's own SSL/TLS certificate. | +| `sslKeyFile` | no | `string` | Path to this monitor's private SSL/TLS key file. | +| `sslKeyPassphrase` | no | `string` | Password for this monitor's private SSL/TLS key file, if any. | +| `sslVerify` | no | `bool` | Whether the monitor should verify the RabbitMQ Management plugin SSL/TLS certificate. (**default:** `false`) | ## Metrics @@ -200,9 +206,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -211,19 +214,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above.
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/collectd-redis.md b/signalfx-agent/agent_docs/monitors/collectd-redis.md index c3e1f0dca..cc6efe035 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-redis.md +++ b/signalfx-agent/agent_docs/monitors/collectd-redis.md @@ -4,7 +4,7 @@ # collectd/redis -Monitor Type: `collectd/redis` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/redis)) +Monitor Type: `collectd/redis` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/redis)) **Accepts Endpoints**: **Yes** @@ -20,12 +20,12 @@ You can capture any kind of Redis metrics like: * Memory used * Commands processed per second - * Number of connected clients and slaves + * Number of connected clients and followers * Number of blocked clients * Number of keys stored (per database) * Uptime * Changes since last save - * Replication delay (per slave) + * Replication delay (per follower) @@ -48,7 +48,7 @@ match something that is very big, as this command is not highly optimized and can block other commands from executing. Note: To avoid duplication reporting, this should only be reported in one node. -Keys can be defined in either the master or slave config. +Keys can be defined in either the leader or follower config. 
Sample YAML configuration with list lengths: @@ -96,6 +96,7 @@ Configuration](../monitor-config.html#common-configuration).** | `name` | no | `string` | The name for the node is a canonical identifier which is used as plugin instance. It is limited to 64 characters in length. (**default**: "{host}:{port}") | | `auth` | no | `string` | Password to use for authentication. | | `sendListLengths` | no | `list of objects (see below)` | Specify a pattern of keys to lists for which to send their length as a metric. See below for more details. | +| `verbose` | no | `bool` | If `true`, verbose logging from the plugin will be enabled. (**default:** `false`) | The **nested** `sendListLengths` config object has the following fields: @@ -114,6 +115,8 @@ Metrics that are categorized as (*default*) are ***in bold and italics*** in the list below. + - `bytes.maxmemory` (*gauge*)
Maximum memory configured on Redis server + - `bytes.total_system_memory` (*gauge*)
Total memory available on the OS - ***`bytes.used_memory`*** (*gauge*)
Number of bytes allocated by Redis - `bytes.used_memory_lua` (*gauge*)
Number of bytes used by the Lua engine - `bytes.used_memory_peak` (*gauge*)
Peak number of bytes allocated by Redis @@ -136,18 +139,21 @@ Metrics that are categorized as - `gauge.changes_since_last_save` (*gauge*)
Number of changes since the last dump - `gauge.client_biggest_input_buf` (*gauge*)
Biggest input buffer among current client connections - `gauge.client_longest_output_list` (*gauge*)
Longest output list among current client connections - - ***`gauge.connected_clients`*** (*gauge*)
Number of client connections (excluding connections from slaves) - - `gauge.connected_slaves` (*gauge*)
Number of connected slaves + - ***`gauge.connected_clients`*** (*gauge*)
Number of client connections (excluding connections from followers) + - `gauge.connected_slaves` (*gauge*)
Number of connected followers - `gauge.db0_avg_ttl` (*gauge*)
The average time to live for all keys in Redis - `gauge.db0_expires` (*gauge*)
The total number of keys in Redis that will expire - `gauge.db0_keys` (*gauge*)
The total number of keys stored in Redis - `gauge.instantaneous_ops_per_sec` (*gauge*)
Number of commands processed per second - `gauge.key_llen` (*gauge*)
Length of a list key - `gauge.latest_fork_usec` (*gauge*)
Duration of the latest fork operation in microseconds - - `gauge.master_last_io_seconds_ago` (*gauge*)
Number of seconds since the last interaction with master + - `gauge.master_last_io_seconds_ago` (*gauge*)
Number of seconds since the last interaction with leader + - `gauge.master_link_down_since_seconds` (*gauge*)
Number of seconds that the link has been down + - `gauge.master_link_status` (*gauge*)
Status of the link (up/down) - ***`gauge.master_repl_offset`*** (*gauge*)
Master replication offset - `gauge.mem_fragmentation_ratio` (*gauge*)
Ratio between used_memory_rss and used_memory - `gauge.rdb_bgsave_in_progress` (*gauge*)
Flag indicating a RDB save is on-going + - `gauge.rdb_last_save_time` (*gauge*)
Unix timestamp for last save to disk, when using persistence - `gauge.repl_backlog_first_byte_offset` (*gauge*)
Slave replication backlog offset - ***`gauge.slave_repl_offset`*** (*gauge*)
Slave replication offset - `gauge.uptime_in_days` (*gauge*)
Number of days up @@ -155,9 +161,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -166,20 +169,6 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - ## Dimensions The following dimensions may occur on metrics emitted by this monitor. 
Some diff --git a/signalfx-agent/agent_docs/monitors/collectd-signalfx-metadata.md b/signalfx-agent/agent_docs/monitors/collectd-signalfx-metadata.md index 73c30be96..31dbf7d0e 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-signalfx-metadata.md +++ b/signalfx-agent/agent_docs/monitors/collectd-signalfx-metadata.md @@ -4,7 +4,7 @@ # collectd/signalfx-metadata -Monitor Type: `collectd/signalfx-metadata` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/metadata)) +Monitor Type: `collectd/signalfx-metadata` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/metadata)) **Accepts Endpoints**: No @@ -71,9 +71,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -82,19 +79,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/collectd-solr.md b/signalfx-agent/agent_docs/monitors/collectd-solr.md index cfc319e65..24d1a1e0e 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-solr.md +++ b/signalfx-agent/agent_docs/monitors/collectd-solr.md @@ -4,7 +4,7 @@ # collectd/solr -Monitor Type: `collectd/solr` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/solr)) +Monitor Type: `collectd/solr` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/solr)) **Accepts Endpoints**: **Yes** @@ -69,6 +69,8 @@ Metrics that are categorized as [container/host](https://docs.splunk.com/observability/admin/subscription-usage/monitor-imm-billing-usage.html#about-custom-bundled-and-high-resolution-metrics) (*default*) are ***in bold and italics*** in the list below. +This monitor will also emit by default any metrics that are not listed below. + - ***`counter.solr.http_2xx_responses`*** (*counter*)
Total number of 2xx http responses - ***`counter.solr.http_4xx_responses`*** (*counter*)
Total number of 4xx http responses @@ -121,9 +123,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -132,19 +131,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/collectd-spark.md b/signalfx-agent/agent_docs/monitors/collectd-spark.md index faadc2ccb..a8908b100 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-spark.md +++ b/signalfx-agent/agent_docs/monitors/collectd-spark.md @@ -4,7 +4,7 @@ # collectd/spark -Monitor Type: `collectd/spark` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/spark)) +Monitor Type: `collectd/spark` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/spark)) **Accepts Endpoints**: **Yes** @@ -12,28 +12,30 @@ Monitor Type: `collectd/spark` ([Source](https://github.com/signalfx/signalfx-ag ## Overview -Collects metrics about a Spark cluster using the [collectd Spark Python +This integration collects metrics about a Spark cluster using the [collectd Spark Python plugin](https://github.com/signalfx/collectd-spark). That plugin collects metrics from Spark cluster and instances by hitting endpoints specified in Spark's [Monitoring and Instrumentation documentation](https://spark.apache.org/docs/latest/monitoring.html) under `REST API` and `Metrics`. -We currently only support cluster modes Standalone, Mesos, and Hadoop Yarn -via HTTP endpoints. +The following cluster modes are supported only through HTTP endpoints: +- Standalone +- Mesos +- Hadoop YARN -You have to specify distinct monitor configurations and discovery rules for +You must specify distinct monitor configurations and discovery rules for master and worker processes. For the master configuration, set `isMaster` to true. -When running Spark on Apache Hadoop / Yarn, this integration is only capable -of reporting application metrics from the master node. Please use the +When running Spark on Apache Hadoop / YARN, this integration is only capable +of reporting application metrics from the master node. Use the collectd/hadoop monitor to report on the health of the cluster. 
### Example config: -An example configuration for monitoring applications on Yarn +An example configuration for monitoring applications on YARN ```yaml monitors: - type: collectd/spark @@ -172,9 +174,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -183,20 +182,6 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - ## Dimensions The following dimensions may occur on metrics emitted by this monitor. 
Some diff --git a/signalfx-agent/agent_docs/monitors/collectd-statsd.md b/signalfx-agent/agent_docs/monitors/collectd-statsd.md index bb0b7f597..88d24e19e 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-statsd.md +++ b/signalfx-agent/agent_docs/monitors/collectd-statsd.md @@ -4,7 +4,7 @@ # collectd/statsd -Monitor Type: `collectd/statsd` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/statsd)) +Monitor Type: `collectd/statsd` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/statsd)) **Accepts Endpoints**: No diff --git a/signalfx-agent/agent_docs/monitors/collectd-systemd.md b/signalfx-agent/agent_docs/monitors/collectd-systemd.md index 58ff636c3..b3618b09e 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-systemd.md +++ b/signalfx-agent/agent_docs/monitors/collectd-systemd.md @@ -4,7 +4,7 @@ # collectd/systemd -Monitor Type: `collectd/systemd` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/systemd)) +Monitor Type: `collectd/systemd` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/systemd)) **Accepts Endpoints**: **Yes** @@ -83,20 +83,19 @@ Metrics that are categorized as (*default*) are ***in bold and italics*** in the list below. - - - ***`gauge.active_state.activating`*** (*gauge*)
Indicates that the systemd unit/service has previously been inactive but is currently in the process of entering an active state - - ***`gauge.active_state.active`*** (*gauge*)
Indicates that the systemd unit/service is active - - ***`gauge.active_state.deactivating`*** (*gauge*)
Indicates that the systemd unit/service is currently in the process of deactivation - - ***`gauge.active_state.failed`*** (*gauge*)
Indicates that the systemd unit/service is inactive the previous run was not successful - - ***`gauge.active_state.inactive`*** (*gauge*)
Indicates that the systemd unit/service is inactive and the previous run was successful or no previous run has taken place yet - - ***`gauge.active_state.reloading`*** (*gauge*)
Indicates that the systemd unit/service is active and currently reloading its configuration - - ***`gauge.load_state.error`*** (*gauge*)
Indicates that the systemd unit/service configuration failed to load - - ***`gauge.load_state.loaded`*** (*gauge*)
Indicates that the systemd unit/service configuration was loaded and parsed successfully - - ***`gauge.load_state.masked`*** (*gauge*)
Indicates that the systemd unit/service is currently masked out (i.e. symlinked to /dev/null etc) - - ***`gauge.load_state.not-found`*** (*gauge*)
Indicates that the systemd unit/service configuration was not found - - ***`gauge.substate.dead`*** (*gauge*)
Indicates that the systemd unit/service died - - ***`gauge.substate.exited`*** (*gauge*)
Indicates that the systemd unit/service exited - - ***`gauge.substate.failed`*** (*gauge*)
Indicates that the systemd unit/service failed + - `gauge.active_state.activating` (*gauge*)
Indicates that the systemd unit/service has previously been inactive but is currently in the process of entering an active state + - `gauge.active_state.active` (*gauge*)
Indicates that the systemd unit/service is active + - `gauge.active_state.deactivating` (*gauge*)
Indicates that the systemd unit/service is currently in the process of deactivation + - `gauge.active_state.failed` (*gauge*)
Indicates that the systemd unit/service is inactive and the previous run was not successful + - `gauge.active_state.inactive` (*gauge*)
Indicates that the systemd unit/service is inactive and the previous run was successful or no previous run has taken place yet + - `gauge.active_state.reloading` (*gauge*)
Indicates that the systemd unit/service is active and currently reloading its configuration + - `gauge.load_state.error` (*gauge*)
Indicates that the systemd unit/service configuration failed to load + - `gauge.load_state.loaded` (*gauge*)
Indicates that the systemd unit/service configuration was loaded and parsed successfully + - `gauge.load_state.masked` (*gauge*)
Indicates that the systemd unit/service is currently masked out (i.e. symlinked to /dev/null etc) + - `gauge.load_state.not-found` (*gauge*)
Indicates that the systemd unit/service configuration was not found + - `gauge.substate.dead` (*gauge*)
Indicates that the systemd unit/service died + - `gauge.substate.exited` (*gauge*)
Indicates that the systemd unit/service exited + - `gauge.substate.failed` (*gauge*)
Indicates that the systemd unit/service failed - ***`gauge.substate.running`*** (*gauge*)
Indicates that the systemd unit/service is running #### Group ActiveState @@ -113,8 +112,17 @@ monitor config option `extraGroups`: All of the following metrics are part of the `SubState` metric group. All of the non-default metrics below can be turned on by adding `SubState` to the monitor config option `extraGroups`: -The agent does not do any built-in filtering of metrics coming out of this -monitor. + +### Non-default metrics (version 4.7.0+) + +To emit metrics that are not _default_, you can add those metrics in the +generic monitor-level `extraMetrics` config option. Metrics that are derived +from specific configuration options that do not appear in the above list of +metrics do not need to be added to `extraMetrics`. + +To see a list of metrics that will be emitted you can run `agent-status +monitors` after configuring this monitor in a running agent instance. + ## Dimensions The following dimensions may occur on metrics emitted by this monitor. Some diff --git a/signalfx-agent/agent_docs/monitors/collectd-uptime.md b/signalfx-agent/agent_docs/monitors/collectd-uptime.md index d14625b46..57530b94b 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-uptime.md +++ b/signalfx-agent/agent_docs/monitors/collectd-uptime.md @@ -4,7 +4,7 @@ # collectd/uptime -Monitor Type: `collectd/uptime` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/uptime)) +Monitor Type: `collectd/uptime` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/uptime)) **Accepts Endpoints**: No @@ -45,9 +45,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. 
Metrics that are derived from specific configuration options that do not appear in the above list of @@ -56,19 +53,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/collectd-vmem.md b/signalfx-agent/agent_docs/monitors/collectd-vmem.md index e6c870b93..0be16dd1c 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-vmem.md +++ b/signalfx-agent/agent_docs/monitors/collectd-vmem.md @@ -4,7 +4,7 @@ # collectd/vmem -Monitor Type: `collectd/vmem` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/vmem)) +Monitor Type: `collectd/vmem` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/vmem)) **Accepts Endpoints**: No @@ -17,6 +17,10 @@ subsystem of the kernel using the [collectd vmem plugin](https://collectd.org/wiki/index.php/Plugin:vmem). There is no configuration available for this plugin. +**This monitor is deprecated in favor of the `vmem` monitor. 
The metrics +should be fully compatible with this monitor.** This monitor will be +removed in a future agent release. + ## Configuration @@ -54,9 +58,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -65,19 +66,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/collectd-zookeeper.md b/signalfx-agent/agent_docs/monitors/collectd-zookeeper.md index 505aed061..be08a7110 100644 --- a/signalfx-agent/agent_docs/monitors/collectd-zookeeper.md +++ b/signalfx-agent/agent_docs/monitors/collectd-zookeeper.md @@ -4,7 +4,7 @@ # collectd/zookeeper -Monitor Type: `collectd/zookeeper` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/collectd/zookeeper)) +Monitor Type: `collectd/zookeeper` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/collectd/zookeeper)) **Accepts Endpoints**: **Yes** @@ -66,10 +66,15 @@ Metrics that are categorized as - ***`gauge.zk_watch_count`*** (*gauge*)
Number of watches placed on Z-Nodes on a ZooKeeper server - ***`gauge.zk_znode_count`*** (*gauge*)
Number of z-nodes that a ZooKeeper server has in its data tree -### Non-default metrics (version 4.7.0+) +#### Group leader +All of the following metrics are part of the `leader` metric group. All of +the non-default metrics below can be turned on by adding `leader` to the +monitor config option `extraGroups`: + - `gauge.zk_followers` (*gauge*)
Number of followers within the ensemble. Only exposed by the leader. + - `gauge.zk_pending_syncs` (*gauge*)
Number of pending syncs from the followers. Only exposed by the leader. + - `gauge.zk_synced_followers` (*gauge*)
Number of synced followers. Only exposed by the leader. -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** +### Non-default metrics (version 4.7.0+) To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived @@ -79,19 +84,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/conviva.md b/signalfx-agent/agent_docs/monitors/conviva.md index 8aff0cba4..af8506325 100644 --- a/signalfx-agent/agent_docs/monitors/conviva.md +++ b/signalfx-agent/agent_docs/monitors/conviva.md @@ -4,7 +4,7 @@ # conviva -Monitor Type: `conviva` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/conviva)) +Monitor Type: `conviva` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/conviva)) **Accepts Endpoints**: No @@ -361,9 +361,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -372,19 +369,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). 
Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/coredns.md b/signalfx-agent/agent_docs/monitors/coredns.md index 4ac98234f..e56cbc9c3 100644 --- a/signalfx-agent/agent_docs/monitors/coredns.md +++ b/signalfx-agent/agent_docs/monitors/coredns.md @@ -4,7 +4,7 @@ # coredns -Monitor Type: `coredns` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/coredns)) +Monitor Type: `coredns` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/coredns)) **Accepts Endpoints**: **Yes** @@ -46,9 +46,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. | +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. 
It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -68,10 +69,11 @@ Metrics that are categorized as - `coredns_build_info` (*gauge*)
A metric with a constant '1' value labeled by version, revision, and goversion from which CoreDNS was built. + - ***`coredns_cache_entries`*** (*cumulative*)
Size of DNS cache. - `coredns_cache_hits_total` (*cumulative*)
The count of cache hits. - `coredns_cache_misses_total` (*cumulative*)
The count of cache misses. - - ***`coredns_cache_size`*** (*cumulative*)
Size of DNS cache. - - ***`coredns_dns_request_count_total`*** (*cumulative*)
Counter of DNS requests made per zone, protocol and family. + - ***`coredns_cache_size`*** (*cumulative*)
Deprecated in coredns version 1.7.0. Size of DNS cache. + - ***`coredns_dns_request_count_total`*** (*cumulative*)
Deprecated in coredns version 1.7.0. Counter of DNS requests made per zone, protocol and family. - `coredns_dns_request_duration_seconds` (*cumulative*)
Histogram of the time (in seconds) each request took. (sum) - `coredns_dns_request_duration_seconds_bucket` (*cumulative*)
Histogram of the time (in seconds) each request took. (bucket) - `coredns_dns_request_duration_seconds_count` (*cumulative*)
Histogram of the time (in seconds) each request took. (count) @@ -79,14 +81,17 @@ Metrics that are categorized as - `coredns_dns_request_size_bytes_bucket` (*cumulative*)
Size of the EDNS0 UDP buffer in bytes (64K for TCP). (bucket) - `coredns_dns_request_size_bytes_count` (*cumulative*)
Size of the EDNS0 UDP buffer in bytes (64K for TCP). (count) - ***`coredns_dns_request_type_count_total`*** (*cumulative*)
Counter of DNS requests per type, per zone. - - ***`coredns_dns_response_rcode_count_total`*** (*cumulative*)
Counter of response status codes. + - ***`coredns_dns_requests_total`*** (*cumulative*)
Counter of DNS requests made per zone, protocol and family. + - ***`coredns_dns_response_rcode_count_total`*** (*cumulative*)
Deprecated in coredns version 1.7.0. Counter of response status codes. - `coredns_dns_response_size_bytes` (*cumulative*)
Size of the returned response in bytes. (sum) - `coredns_dns_response_size_bytes_bucket` (*cumulative*)
Size of the returned response in bytes. (bucket) - `coredns_dns_response_size_bytes_count` (*cumulative*)
Size of the returned response in bytes. (count) + - ***`coredns_dns_responses_total`*** (*cumulative*)
Counter of response status codes. - `coredns_health_request_duration_seconds` (*cumulative*)
Histogram of the time (in seconds) each request took. (sum) - `coredns_health_request_duration_seconds_bucket` (*cumulative*)
Histogram of the time (in seconds) each request took. (bucket) - `coredns_health_request_duration_seconds_count` (*cumulative*)
Histogram of the time (in seconds) each request took. (count) - - `coredns_panic_count_total` (*cumulative*)
A metrics that counts the number of panics. + - `coredns_panic_count_total` (*cumulative*)
Deprecated in coredns version 1.7.0. A metric that counts the number of panics. + - `coredns_panics_total` (*cumulative*)
A metric that counts the number of panics. - `coredns_proxy_request_count_total` (*cumulative*)
Counter of requests made per protocol, proxy protocol, family and upstream. - `coredns_proxy_request_duration_seconds` (*cumulative*)
Histogram of the time (in seconds) each request took. (sum) - `coredns_proxy_request_duration_seconds_bucket` (*cumulative*)
Histogram of the time (in seconds) each request took. (bucket) @@ -127,9 +132,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -138,19 +140,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/cpu.md b/signalfx-agent/agent_docs/monitors/cpu.md index 4eeee4b4d..dd2596a26 100644 --- a/signalfx-agent/agent_docs/monitors/cpu.md +++ b/signalfx-agent/agent_docs/monitors/cpu.md @@ -4,7 +4,7 @@ # cpu -Monitor Type: `cpu` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/cpu)) +Monitor Type: `cpu` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/cpu)) **Accepts Endpoints**: No @@ -40,7 +40,11 @@ monitors: # All monitor config goes under this key Configuration](../monitor-config.html#common-configuration).** -This monitor has no configuration options. +| Config option | Required | Type | Description | +| --- | --- | --- | --- | +| `reportPerCPU` | no | `bool` | If `true`, stats will be generated for the system as a whole _as well as_ for each individual CPU/core in the system and will be distinguished by the `cpu` dimension. If `false`, stats will only be generated for the system as a whole that will not include a `cpu` dimension. (**default:** `false`) | + + ## Metrics These are the metrics available for this monitor. @@ -49,13 +53,27 @@ Metrics that are categorized as (*default*) are ***in bold and italics*** in the list below. - - ***`cpu.utilization`*** (*gauge*)
Percent of CPU used on this host. This metric is emitted with a plugin dimension set to "signalfx-metadata". - - `cpu.utilization_per_core` (*gauge*)
Percent of CPU used on each core. This metric is emitted with the plugin dimension set to "signalfx-metadata" + - ***`cpu.idle`*** (*cumulative*)
CPU time spent not in any other state. In order to get a percentage this value must be compared against the sum of all CPU states. -### Non-default metrics (version 4.7.0+) + - `cpu.interrupt` (*cumulative*)
CPU time spent while servicing hardware interrupts. A hardware interrupt happens at the physical layer. When this occurs, the CPU will stop whatever else it is doing and service the interrupt. This metric measures how many jiffies were spent handling these interrupts. In order to get a percentage this value must be compared against the sum of all CPU states. A sustained high value for this metric may be caused by faulty hardware such as a broken peripheral. + + - `cpu.nice` (*cumulative*)
CPU time spent in userspace running 'nice'-ed processes. In order to get a percentage this value must be compared against the sum of all CPU states. A sustained high value for this metric may be caused by: 1) The server not having enough CPU capacity for a process, 2) A programming error which causes a process to use an unexpected amount of CPU + + - ***`cpu.num_processors`*** (*gauge*)
The number of logical processors on the host. + - `cpu.softirq` (*cumulative*)
CPU time spent while servicing software interrupts. Unlike a hardware interrupt, a software interrupt happens at the software layer. Usually it is a userspace program requesting a service of the kernel. This metric measures how many jiffies were spent by the CPU handling these interrupts. In order to get a percentage this value must be compared against the sum of all CPU states. A sustained high value for this metric may be caused by a programming error which causes a process to unexpectedly request too many services from the kernel. + + - `cpu.steal` (*cumulative*)
CPU time spent waiting for a hypervisor to service requests from other virtual machines. This metric is only present on virtual machines. This metric records how much time this virtual machine had to wait to have the hypervisor kernel service a request. In order to get a percentage this value must be compared against the sum of all CPU states. A sustained high value for this metric may be caused by: 1) Another VM on the same hypervisor using too many resources, or 2) An underpowered hypervisor -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** + - `cpu.system` (*cumulative*)
CPU time spent running in the kernel. This value reflects how often processes are calling into the kernel for services (e.g. to log to the console). In order to get a percentage this value must be compared against the sum of all CPU states. A sustained high value for this metric may be caused by: 1) A process that needs to be re-written to use kernel resources more efficiently, or 2) A userspace driver that is broken + + - `cpu.user` (*cumulative*)
CPU time spent running in userspace. In order to get a percentage this value must be compared against the sum of all CPU states. If this value is high: 1) A process requires more CPU to run than is available on the server, or 2) There is an application programming error which is causing the CPU to be used unexpectedly. + + - ***`cpu.utilization`*** (*gauge*)
Percent of CPU used on this host. + - `cpu.utilization_per_core` (*gauge*)
Percent of CPU used on each core + - `cpu.wait` (*cumulative*)
Amount of total CPU time spent idle while waiting for an I/O operation to complete. In order to get a percentage this value must be compared against the sum of all CPU states. A high value for a sustained period may be caused by: 1) A slow hardware device that is taking too long to service requests, or 2) Too many requests being sent to an I/O device + + +### Non-default metrics (version 4.7.0+) To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived @@ -65,19 +83,14 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) +## Dimensions -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** +The following dimensions may occur on metrics emitted by this monitor. Some +dimensions may be specific to certain metrics. -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. +| Name | Description | +| --- | --- | +| `cpu` | The number/id of the core/cpu on the system. Only present if `reportPerCPU: true`. 
| diff --git a/signalfx-agent/agent_docs/monitors/disk-io.md b/signalfx-agent/agent_docs/monitors/disk-io.md index 47b04394a..7df0607cd 100644 --- a/signalfx-agent/agent_docs/monitors/disk-io.md +++ b/signalfx-agent/agent_docs/monitors/disk-io.md @@ -4,7 +4,7 @@ # disk-io -Monitor Type: `disk-io` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/diskio)) +Monitor Type: `disk-io` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/diskio)) **Accepts Endpoints**: No @@ -63,6 +63,7 @@ Metrics that are categorized as - `disk_octets.write` (*cumulative*)
(Linux Only) The number of bytes (octets) written to a disk. - ***`disk_ops.avg_read`*** (*gauge*)
(Windows Only) The average disk read queue length. - ***`disk_ops.avg_write`*** (*gauge*)
(Windows Only) The average disk write queue length. + - `disk_ops.pending` (*gauge*)
Number of pending operations - ***`disk_ops.read`*** (*cumulative*)
(Linux Only) The number of disk read operations. - ***`disk_ops.total`*** (*gauge*)
(Linux Only) The number of both read and write disk operations across all disks in the last reporting interval. - ***`disk_ops.write`*** (*cumulative*)
(Linux Only) The number of disk write operations. @@ -73,9 +74,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -84,20 +82,6 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - ## Dimensions The following dimensions may occur on metrics emitted by this monitor. 
Some diff --git a/signalfx-agent/agent_docs/monitors/docker-container-stats.md b/signalfx-agent/agent_docs/monitors/docker-container-stats.md index 8f9c99e90..04c85c48c 100644 --- a/signalfx-agent/agent_docs/monitors/docker-container-stats.md +++ b/signalfx-agent/agent_docs/monitors/docker-container-stats.md @@ -4,7 +4,7 @@ # docker-container-stats -Monitor Type: `docker-container-stats` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/docker)) +Monitor Type: `docker-container-stats` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/docker)) **Accepts Endpoints**: No @@ -54,6 +54,7 @@ Configuration](../monitor-config.html#common-configuration).** | `enableExtraNetworkMetrics` | no | `bool` | Whether it will send all extra network metrics as well. (**default:** `false`) | | `dockerURL` | no | `string` | The URL of the docker server (**default:** `unix:///var/run/docker.sock`) | | `timeoutSeconds` | no | `integer` | The maximum amount of time to wait for docker API requests (**default:** `5`) | +| `cacheSyncInterval` | no | `int64` | The time to wait before resyncing the list of containers the monitor maintains through the docker event listener example: cacheSyncInterval: "20m" (**default:** `60m`) | | `labelsToDimensions` | no | `map of strings` | A mapping of container label names to dimension names. The corresponding label values will become the dimension value for the mapped name. E.g. `io.kubernetes.container.name: container_spec_name` would result in a dimension called `container_spec_name` that has the value of the `io.kubernetes.container.name` container label. | | `envToDimensions` | no | `map of strings` | A mapping of container environment variable names to dimension names. The corresponding env var values become the dimension values on the emitted metrics. E.g. 
`APP_VERSION: version` would result in datapoints having a dimension called `version` whose value is the value of the `APP_VERSION` envvar configured for that particular container, if present. | | `excludedImages` | no | `list of strings` | A list of filters of images to exclude. Supports literals, globs, and regex. | @@ -164,11 +165,11 @@ monitor config option `extraGroups`: - `memory.stats.writeback` (*gauge*)
The amount of memory from file/anon cache that are queued for syncing to the disk - ***`memory.usage.limit`*** (*gauge*)
Memory usage limit of the container, in bytes - `memory.usage.max` (*gauge*)
Maximum measured memory usage of the container, in bytes - - ***`memory.usage.total`*** (*gauge*)
Bytes of memory used by the container. Note that this **includes the - buffer cache** attributed to the process by the kernel from files that - have been read by processes in the container. If you don't want to - count that when monitoring containers, enable the metric - `memory.stats.total_cache` and subtract that metric from this one. + - ***`memory.usage.total`*** (*gauge*)
Bytes of memory used by the container. Note that this **excludes** the + buffer cache accounted to the process by the kernel from files that + have been read by processes in the container, as well as tmpfs usage. + If you want to count that when monitoring containers, enable the metric + `memory.stats.total_cache` and add it to this metric in SignalFlow. #### Group network @@ -186,9 +187,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -197,19 +195,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/dotnet.md b/signalfx-agent/agent_docs/monitors/dotnet.md index c26515fa8..c87b4e176 100644 --- a/signalfx-agent/agent_docs/monitors/dotnet.md +++ b/signalfx-agent/agent_docs/monitors/dotnet.md @@ -4,7 +4,7 @@ # dotnet -Monitor Type: `dotnet` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/dotnet)) +Monitor Type: `dotnet` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/dotnet)) **Accepts Endpoints**: No @@ -70,7 +70,6 @@ This monitor emits all metrics by default; however, **none are categorized as -- they are all custom**. - - ***`net_clr_exceptions.num_exceps_thrown_sec`*** (*gauge*)
The number of exceptions thrown by .NET applications. - ***`net_clr_locksandthreads.contention_rate_sec`*** (*gauge*)
The rate of thread contention per second for .NET applications. - ***`net_clr_locksandthreads.current_queue_length`*** (*gauge*)
The current thread queue length for .NET applications. diff --git a/signalfx-agent/agent_docs/monitors/ecs-metadata.md b/signalfx-agent/agent_docs/monitors/ecs-metadata.md index 3cf07e309..0df94095d 100644 --- a/signalfx-agent/agent_docs/monitors/ecs-metadata.md +++ b/signalfx-agent/agent_docs/monitors/ecs-metadata.md @@ -4,7 +4,7 @@ # ecs-metadata -Monitor Type: `ecs-metadata` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/ecs)) +Monitor Type: `ecs-metadata` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/ecs)) **Accepts Endpoints**: No @@ -109,9 +109,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -120,19 +117,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/elasticsearch-query.md b/signalfx-agent/agent_docs/monitors/elasticsearch-query.md index 800e3bd33..e31aebc28 100644 --- a/signalfx-agent/agent_docs/monitors/elasticsearch-query.md +++ b/signalfx-agent/agent_docs/monitors/elasticsearch-query.md @@ -4,7 +4,7 @@ # elasticsearch-query -Monitor Type: `elasticsearch-query` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/elasticsearch/query)) +Monitor Type: `elasticsearch-query` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/elasticsearch/query)) **Accepts Endpoints**: **Yes** @@ -273,9 +273,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. 
| +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | diff --git a/signalfx-agent/agent_docs/monitors/elasticsearch.md b/signalfx-agent/agent_docs/monitors/elasticsearch.md index 823a27106..4cf14a9d1 100644 --- a/signalfx-agent/agent_docs/monitors/elasticsearch.md +++ b/signalfx-agent/agent_docs/monitors/elasticsearch.md @@ -4,7 +4,7 @@ # elasticsearch -Monitor Type: `elasticsearch` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/elasticsearch/stats)) +Monitor Type: `elasticsearch` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/elasticsearch/stats)) **Accepts Endpoints**: **Yes** @@ -13,9 +13,7 @@ Monitor Type: `elasticsearch` ([Source](https://github.com/signalfx/signalfx-age ## Overview This monitor collects stats from Elasticsearch. It collects node, cluster -and index level stats. This monitor is compatible with the current collectd -plugin found [here] (https://github.com/signalfx/collectd-elasticsearch) in -terms of metric naming. 
+and index level stats. This monitor collects cluster level and index level stats only from the current master in an Elasticsearch cluster by default. It is possible to override this with the @@ -149,9 +147,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. | +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. 
| | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -479,9 +478,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -490,20 +486,6 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - ## Dimensions The following dimensions may occur on metrics emitted by this monitor. 
Some diff --git a/signalfx-agent/agent_docs/monitors/etcd.md b/signalfx-agent/agent_docs/monitors/etcd.md index 12ae2e3fb..547a9e3cc 100644 --- a/signalfx-agent/agent_docs/monitors/etcd.md +++ b/signalfx-agent/agent_docs/monitors/etcd.md @@ -4,7 +4,7 @@ # etcd -Monitor Type: `etcd` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/etcd)) +Monitor Type: `etcd` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/etcd)) **Accepts Endpoints**: **Yes** @@ -57,9 +57,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. | +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. 
| | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -288,9 +289,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -299,19 +297,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/expvar.md b/signalfx-agent/agent_docs/monitors/expvar.md index 1dd46fea7..0cd3bfa92 100644 --- a/signalfx-agent/agent_docs/monitors/expvar.md +++ b/signalfx-agent/agent_docs/monitors/expvar.md @@ -4,7 +4,7 @@ # expvar -Monitor Type: `expvar` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/expvar)) +Monitor Type: `expvar` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/expvar)) **Accepts Endpoints**: **Yes** @@ -267,6 +267,8 @@ Metrics that are categorized as [container/host](https://docs.splunk.com/observability/admin/subscription-usage/monitor-imm-billing-usage.html#about-custom-bundled-and-high-resolution-metrics) (*default*) are ***in bold and italics*** in the list below. +This monitor will also emit by default any metrics that are not listed below. + - `memstats.alloc` (*gauge*)
Bytes of allocated heap objects. Same as memstats.heap_alloc - ***`memstats.buck_hash_sys`*** (*gauge*)
Bytes of memory in profiling bucket hash tables @@ -305,9 +307,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -316,19 +315,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/filesystems.md b/signalfx-agent/agent_docs/monitors/filesystems.md index a9d6dd6f4..21426248d 100644 --- a/signalfx-agent/agent_docs/monitors/filesystems.md +++ b/signalfx-agent/agent_docs/monitors/filesystems.md @@ -4,7 +4,7 @@ # filesystems -Monitor Type: `filesystems` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/filesystems)) +Monitor Type: `filesystems` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/filesystems)) **Accepts Endpoints**: No @@ -25,6 +25,28 @@ monitors: hostFSPath: /hostfs ``` +## Migrating from collectd/df +The `collectd/df` monitor is being deprecated in favor of the `filesystems` +monitor. While the `collectd/df` monitor will still be available in +5.0, it is recommended that you switch to the `filesystems` monitor soon +after upgrading. There are a few incompatibilities to be aware of between +the two monitors: + + - `collectd/df` used a dimension called `plugin_instance` to identify the + mount point or device of the filesystem. This dimension is completely + removed in the `filesystems` monitor and replaced by the `mountpoint` + and `device` dimensions. You no longer have to select between the two + (the `reportByDevice` option on `collectd/df`) as both are always + reported. + + - The mountpoints in the `plugin_instance` dimension of `collectd/df` + were reported with `-` instead of the more conventional `/` separated + path segments. The `filesystems` monitor always reports mountpoints in + the `mountpoint` dimension and uses the conventional `/` separator. + + - The `collectd/df` plugin set a dimension `plugin: df` on all datapoints, + but `filesystems` has no such comparable dimension. + ## Configuration @@ -58,31 +80,30 @@ Metrics that are categorized as - ***`df_complex.free`*** (*gauge*)
Free disk space in bytes + - `df_complex.reserved` (*gauge*)
Measures disk space in bytes reserved for the super-user on this file system. - ***`df_complex.used`*** (*gauge*)
Used disk space in bytes - - ***`disk.summary_utilization`*** (*gauge*)
Percent of disk space utilized on all volumes on this host. This metric reports with plugin dimension set to "signalfx-metadata". - - ***`disk.utilization`*** (*gauge*)
Percent of disk used on this volume. This metric reports with plugin dimension set to "signalfx-metadata". + - ***`disk.summary_utilization`*** (*gauge*)
Percent of disk space utilized on all volumes on this host. + - ***`disk.utilization`*** (*gauge*)
Percent of disk used on this volume. #### Group inodes All of the following metrics are part of the `inodes` metric group. All of the non-default metrics below can be turned on by adding `inodes` to the monitor config option `extraGroups`: - - `df_inodes.free` (*gauge*)
(Linux Only) Number of inodes that are free. This is only reported if the configuration option `inodes` is set to `true`. - `df_inodes.used` (*gauge*)
(Linux Only) Number of inodes that are used. This is only reported if the configuration option `inodes` is set to `true`. - - `percent_inodes.free` (*gauge*)
(Linux Only) Free inodes on the file system, expressed as a percentage. This is only reported if the configuration option `inodes` is set to `true`. - - `percent_inodes.used` (*gauge*)
(Linux Only) Used inodes on the file system, expressed as a percentage. This is only reported if the configuration option `inodes` is set to `true`. - -#### Group logical -All of the following metrics are part of the `logical` metric group. All of -the non-default metrics below can be turned on by adding `logical` to the + - `df_inodes.free` (*gauge*)
(Linux Only) Number of inodes that are free. + - `df_inodes.used` (*gauge*)
(Linux Only) Number of inodes that are used. + - `percent_inodes.free` (*gauge*)
(Linux Only) Free inodes on the file system, expressed as a percentage. + - `percent_inodes.used` (*gauge*)
(Linux Only) Used inodes on the file system, expressed as a percentage. + +#### Group percentage +All of the following metrics are part of the `percentage` metric group. All of +the non-default metrics below can be turned on by adding `percentage` to the monitor config option `extraGroups`: - `percent_bytes.free` (*gauge*)
Free disk space on the file system, expressed as a percentage. + - `percent_bytes.reserved` (*gauge*)
Measures disk space reserved for the super-user as a percentage of total disk space of this file system. - `percent_bytes.used` (*gauge*)
Used disk space on the file system, expressed as a percentage. ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -91,19 +112,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/gitlab-gitaly.md b/signalfx-agent/agent_docs/monitors/gitlab-gitaly.md index 15d23d133..311da5c46 100644 --- a/signalfx-agent/agent_docs/monitors/gitlab-gitaly.md +++ b/signalfx-agent/agent_docs/monitors/gitlab-gitaly.md @@ -4,7 +4,7 @@ # gitlab-gitaly -Monitor Type: `gitlab-gitaly` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/gitlab)) +Monitor Type: `gitlab-gitaly` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/gitlab)) **Accepts Endpoints**: **Yes** @@ -35,9 +35,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. | +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. 
It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -72,9 +73,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -83,19 +81,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/gitlab-runner.md b/signalfx-agent/agent_docs/monitors/gitlab-runner.md index a16bce920..71ff53be9 100644 --- a/signalfx-agent/agent_docs/monitors/gitlab-runner.md +++ b/signalfx-agent/agent_docs/monitors/gitlab-runner.md @@ -4,7 +4,7 @@ # gitlab-runner -Monitor Type: `gitlab-runner` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/gitlab)) +Monitor Type: `gitlab-runner` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/gitlab)) **Accepts Endpoints**: **Yes** @@ -21,7 +21,7 @@ monitors: port: 9252 ``` -For more information on configuring monitoring within Gitlab runner itself, see https://docs.gitlab.com/runner/monitoring/README.html. +For more information on configuring monitoring within Gitlab runner itself, see https://docs.gitlab.com/runner/monitoring/index.html. See the [Gitlab monitor](gitlab.md) for more information. @@ -46,9 +46,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. | +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. 
| | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -81,9 +82,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -92,19 +90,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/gitlab-sidekiq.md b/signalfx-agent/agent_docs/monitors/gitlab-sidekiq.md index 6efa87c62..9201fd345 100644 --- a/signalfx-agent/agent_docs/monitors/gitlab-sidekiq.md +++ b/signalfx-agent/agent_docs/monitors/gitlab-sidekiq.md @@ -4,7 +4,7 @@ # gitlab-sidekiq -Monitor Type: `gitlab-sidekiq` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/gitlab)) +Monitor Type: `gitlab-sidekiq` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/gitlab)) **Accepts Endpoints**: **Yes** @@ -35,9 +35,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. 
| +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -78,9 +79,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -89,19 +87,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/gitlab-unicorn.md b/signalfx-agent/agent_docs/monitors/gitlab-unicorn.md index 3d79bf0dc..a89f4b0de 100644 --- a/signalfx-agent/agent_docs/monitors/gitlab-unicorn.md +++ b/signalfx-agent/agent_docs/monitors/gitlab-unicorn.md @@ -4,7 +4,7 @@ # gitlab-unicorn -Monitor Type: `gitlab-unicorn` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/gitlab)) +Monitor Type: `gitlab-unicorn` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/gitlab)) **Accepts Endpoints**: **Yes** @@ -15,7 +15,7 @@ Monitor Type: `gitlab-unicorn` ([Source](https://github.com/signalfx/signalfx-ag This is a monitor for GitLab's Unicorn server. The Unicorn server comes with a Prometheus exporter that runs by default on port 8080 at the path `/-/metrics`. The IP address of the SignalFx Smart Agent container or -host, **needs to be whitelisted** as described +host, **needs to be allowed** as described [here](https://docs.gitlab.com/ee/administration/monitoring/ip_whitelist.html) in order for the agent to access the endpoint. @@ -54,9 +54,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. 
This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. | +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. 
| | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -108,9 +109,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -119,19 +117,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/gitlab-workhorse.md b/signalfx-agent/agent_docs/monitors/gitlab-workhorse.md index a34aa97ee..87ec92d5b 100644 --- a/signalfx-agent/agent_docs/monitors/gitlab-workhorse.md +++ b/signalfx-agent/agent_docs/monitors/gitlab-workhorse.md @@ -4,7 +4,7 @@ # gitlab-workhorse -Monitor Type: `gitlab-workhorse` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/gitlab)) +Monitor Type: `gitlab-workhorse` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/gitlab)) **Accepts Endpoints**: **Yes** @@ -49,9 +49,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. | +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. 
It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -99,9 +100,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -110,19 +108,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/gitlab.md b/signalfx-agent/agent_docs/monitors/gitlab.md index f418aca53..0470bd4dc 100644 --- a/signalfx-agent/agent_docs/monitors/gitlab.md +++ b/signalfx-agent/agent_docs/monitors/gitlab.md @@ -4,7 +4,7 @@ # gitlab -Monitor Type: `gitlab` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/gitlab)) +Monitor Type: `gitlab` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/gitlab)) **Accepts Endpoints**: **Yes** @@ -35,7 +35,7 @@ Follow the instructions [here](https://docs.gitlab.com/ee/administration/monitoring/prometheus/index.html) to configure the GitLab's Prometheus exporters to expose metric endpoint targets. For GitLab Runner monitoring configuration go -[here](https://docs.gitlab.com/runner/monitoring/README.html). +[here](https://docs.gitlab.com/runner/monitoring/index.html). Note that configuring GitLab by editing `/etc/gitlab/gitlab.rb` should be accompanied by running the command `gitlab-ctl reconfigure` in order for @@ -173,9 +173,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. | +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. 
(**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -249,9 +250,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -260,19 +258,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/haproxy.md b/signalfx-agent/agent_docs/monitors/haproxy.md index 0b3390103..76d582f7f 100644 --- a/signalfx-agent/agent_docs/monitors/haproxy.md +++ b/signalfx-agent/agent_docs/monitors/haproxy.md @@ -4,7 +4,7 @@ # haproxy -Monitor Type: `haproxy` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/haproxy)) +Monitor Type: `haproxy` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/haproxy)) **Accepts Endpoints**: **Yes** @@ -154,6 +154,8 @@ Metrics that are categorized as - `haproxy_server_aborts` (*cumulative*)
Number of data transfers aborted by the server (inc. in eresp). Values reported for backends and servers. - ***`haproxy_server_selected_total`*** (*cumulative*)
Total number of times a server was selected, either for new sessions, or when re-dispatching. The server counter is the number of times that server was selected. Values reported for backends and servers. - ***`haproxy_session_current`*** (*gauge*)
Number of current sessions. Values reported for listeners, frontends, backends, and servers. + - `haproxy_session_limit` (*gauge*)
The maximum number of connections allowed, configured with `maxconn`. Values reported for listeners, frontends, backends, and servers. + - `haproxy_session_max` (*gauge*)
The maximum observed value of `scur` (the number of current sessions). Values reported for listeners, frontends, backends, and servers. - ***`haproxy_session_rate`*** (*gauge*)
Number of sessions per second over the last elapsed second. Values reported for frontends, backends, and servers. - ***`haproxy_session_rate_all`*** (*gauge*)
Corresponds to the HAProxy process `SessRate` value given by the `show info` command issued over a UNIX socket. - `haproxy_session_rate_limit` (*gauge*)
Configured limit on new sessions per second. Values reported for frontends. @@ -174,9 +176,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -185,20 +184,6 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - ## Dimensions The following dimensions may occur on metrics emitted by this monitor. 
Some diff --git a/signalfx-agent/agent_docs/monitors/heroku-metadata.md b/signalfx-agent/agent_docs/monitors/heroku-metadata.md index 8a37b17a8..873747b35 100644 --- a/signalfx-agent/agent_docs/monitors/heroku-metadata.md +++ b/signalfx-agent/agent_docs/monitors/heroku-metadata.md @@ -4,7 +4,7 @@ # heroku-metadata -Monitor Type: `heroku-metadata` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/heroku)) +Monitor Type: `heroku-metadata` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/heroku)) **Accepts Endpoints**: **Yes** diff --git a/signalfx-agent/agent_docs/monitors/host-metadata.md b/signalfx-agent/agent_docs/monitors/host-metadata.md index d387155fb..16ec4fd0d 100644 --- a/signalfx-agent/agent_docs/monitors/host-metadata.md +++ b/signalfx-agent/agent_docs/monitors/host-metadata.md @@ -4,7 +4,7 @@ # host-metadata -Monitor Type: `host-metadata` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/metadata/hostmetadata)) +Monitor Type: `host-metadata` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/metadata/hostmetadata)) **Accepts Endpoints**: No @@ -65,9 +65,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -76,20 +73,6 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. 
If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - ## Dimensions The following dimensions may occur on metrics emitted by this monitor. Some diff --git a/signalfx-agent/agent_docs/monitors/internal-metrics.md b/signalfx-agent/agent_docs/monitors/internal-metrics.md index b5d180009..5ba956bf4 100644 --- a/signalfx-agent/agent_docs/monitors/internal-metrics.md +++ b/signalfx-agent/agent_docs/monitors/internal-metrics.md @@ -4,7 +4,7 @@ # internal-metrics -Monitor Type: `internal-metrics` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/internalmetrics)) +Monitor Type: `internal-metrics` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/internalmetrics)) **Accepts Endpoints**: **Yes** @@ -60,6 +60,9 @@ This monitor emits all metrics by default; however, **none are categorized as - ***`sfxagent.active_monitors`*** (*gauge*)
The total number of monitor instances actively working - ***`sfxagent.active_observers`*** (*gauge*)
The number of observers configured and running - ***`sfxagent.configured_monitors`*** (*gauge*)
The total number of monitor configurations + - ***`sfxagent.correlation_updates_client_errors`*** (*cumulative*)
The number of HTTP status code 4xx responses received while updating trace host correlations + - ***`sfxagent.correlation_updates_invalid`*** (*cumulative*)
The number of trace host correlation updates attempted against invalid dimensions + - ***`sfxagent.correlation_updates_retries`*** (*cumulative*)
The total number of times trace host correlation requests have been retried - ***`sfxagent.datapoint_channel_len`*** (*gauge*)
The total number of datapoints that have been emitted by monitors but have yet to be accepted by the writer. This number should be 0 most of the time. This will max out at 3000, at which point no datapoints will be generated by monitors. If it does max out, it indicates a bug or extreme CPU starvation of the agent. - ***`sfxagent.datapoint_requests_active`*** (*gauge*)
The total number of outstanding requests to ingest currently active. If this is consistently hovering around the `writer.maxRequests` setting, that setting should probably be increased to give the agent more bandwidth to send datapoints. - ***`sfxagent.datapoints_failed`*** (*cumulative*)
The total number of datapoints that tried to be sent but could not be @@ -90,36 +93,10 @@ This monitor emits all metrics by default; however, **none are categorized as - ***`sfxagent.go_mallocs`*** (*cumulative*)
Total number of heap objects allocated throughout the lifetime of the agent - ***`sfxagent.go_next_gc`*** (*gauge*)
The target heap size -- GC tries to keep the heap smaller than this - ***`sfxagent.go_num_gc`*** (*gauge*)
The number of GC cycles that have happened in the agent since it started + - ***`sfxagent.go_num_goroutine`*** (*gauge*)
Number of goroutines in the agent - ***`sfxagent.go_stack_inuse`*** (*gauge*)
Size in bytes of spans that have at least one goroutine stack in them - ***`sfxagent.go_total_alloc`*** (*cumulative*)
Total number of bytes allocated to the heap throughout the lifetime of the agent - - ***`sfxgent.go_num_goroutine`*** (*gauge*)
Number of goroutines in the agent - -### Non-default metrics (version 4.7.0+) - -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - -To emit metrics that are not _default_, you can add those metrics in the -generic monitor-level `extraMetrics` config option. Metrics that are derived -from specific configuration options that do not appear in the above list of -metrics do not need to be added to `extraMetrics`. - -To see a list of metrics that will be emitted you can run `agent-status -monitors` after configuring this monitor in a running agent instance. - -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - +The agent does not do any built-in filtering of metrics coming out of this +monitor. 
diff --git a/signalfx-agent/agent_docs/monitors/jaeger-grpc.md b/signalfx-agent/agent_docs/monitors/jaeger-grpc.md index 77b9cfbd4..d8b4f8472 100644 --- a/signalfx-agent/agent_docs/monitors/jaeger-grpc.md +++ b/signalfx-agent/agent_docs/monitors/jaeger-grpc.md @@ -4,7 +4,7 @@ # jaeger-grpc -Monitor Type: `jaeger-grpc` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/jaegergrpc)) +Monitor Type: `jaeger-grpc` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/jaegergrpc)) **Accepts Endpoints**: No diff --git a/signalfx-agent/agent_docs/monitors/java-monitor.md b/signalfx-agent/agent_docs/monitors/java-monitor.md index ad975976f..03cddd2ce 100644 --- a/signalfx-agent/agent_docs/monitors/java-monitor.md +++ b/signalfx-agent/agent_docs/monitors/java-monitor.md @@ -4,7 +4,7 @@ # java-monitor -Monitor Type: `java-monitor` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/subproc/signalfx/java)) +Monitor Type: `java-monitor` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/subproc/signalfx/java)) **Accepts Endpoints**: **Yes** diff --git a/signalfx-agent/agent_docs/monitors/jmx.md b/signalfx-agent/agent_docs/monitors/jmx.md index 9af314845..9af023d18 100644 --- a/signalfx-agent/agent_docs/monitors/jmx.md +++ b/signalfx-agent/agent_docs/monitors/jmx.md @@ -4,7 +4,7 @@ # jmx -Monitor Type: `jmx` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/jmx)) +Monitor Type: `jmx` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/jmx)) **Accepts Endpoints**: **Yes** @@ -145,8 +145,18 @@ Configuration](../monitor-config.html#common-configuration).** | `serviceURL` | no | `string` | The service URL for the JMX RMI/JMXMP endpoint. If empty it will be filled in with values from `host` and `port` using a standard JMX RMI template: `service:jmx:rmi:///jndi/rmi://:/jmxrmi`. If overridden, `host` and `port` will have no effect. 
For JMXMP endpoint the service URL must be specified. The JMXMP endpoint URL format is `service:jmx:jmxmp://:`. | | `groovyScript` | **yes** | `string` | A literal Groovy script that generates datapoints from JMX MBeans. See the top-level `jmx` monitor doc for more information on how to write this script. You can put the Groovy script in a separate file and refer to it here with the [remote config reference](https://docs.splunk.com/observability/gdi/smart-agent/smart-agent-resources.html#configure-the-smart-agent) `{"#from": "/path/to/file.groovy", raw: true}`, or you can put it straight in YAML by using the `|` heredoc syntax. | | `username` | no | `string` | Username for JMX authentication, if applicable. | -| `password` | no | `string` | Password for JMX autentication, if applicable. | +| `password` | no | `string` | Password for JMX authentication, if applicable. | +| `keyStorePath` | no | `string` | The key store path is required if client authentication is enabled on the target JVM. | +| `keyStorePassword` | no | `string` | The key store file password if required. | +| `keyStoreType` | no | `string` | The key store type. (**default:** `jks`) | +| `trustStorePath` | no | `string` | The trusted store path if the TLS profile is required. | +| `trustStorePassword` | no | `string` | The trust store file password if required. | +| `jmxRemoteProfiles` | no | `string` | Supported JMX remote profiles are TLS in combination with SASL profiles: SASL/PLAIN, SASL/DIGEST-MD5 and SASL/CRAM-MD5. Thus valid `jmxRemoteProfiles` values are: `SASL/PLAIN`, `SASL/DIGEST-MD5`, `SASL/CRAM-MD5`, `TLS SASL/PLAIN`, `TLS SASL/DIGEST-MD5` and `TLS SASL/CRAM-MD5`. | +| `realm` | no | `string` | The realm is required by profile SASL/DIGEST-MD5. | +The agent does not do any built-in filtering of metrics coming out of this +monitor. 
+ diff --git a/signalfx-agent/agent_docs/monitors/kube-controller-manager.md b/signalfx-agent/agent_docs/monitors/kube-controller-manager.md index 843c69330..ddfff8f58 100644 --- a/signalfx-agent/agent_docs/monitors/kube-controller-manager.md +++ b/signalfx-agent/agent_docs/monitors/kube-controller-manager.md @@ -4,7 +4,7 @@ # kube-controller-manager -Monitor Type: `kube-controller-manager` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/kubernetes/controllermanager)) +Monitor Type: `kube-controller-manager` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/kubernetes/controllermanager)) **Accepts Endpoints**: **Yes** @@ -52,9 +52,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. | +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. 
(**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -562,9 +563,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -573,19 +571,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/kubelet-stats.md b/signalfx-agent/agent_docs/monitors/kubelet-stats.md index e7cce0c4e..2162395c6 100644 --- a/signalfx-agent/agent_docs/monitors/kubelet-stats.md +++ b/signalfx-agent/agent_docs/monitors/kubelet-stats.md @@ -4,14 +4,19 @@ # kubelet-stats -Monitor Type: `kubelet-stats` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/cadvisor)) +Monitor Type: `kubelet-stats` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/cadvisor)) -**Accepts Endpoints**: No +**Accepts Endpoints**: **Yes** **Multiple Instances Allowed**: Yes ## Overview +**As of Kubernetes 1.18 the `/spec` and `/stats/containers` endpoints that +this monitor uses have been deprecated. Therefore, this monitor is +deprecated in favor of the `kubelet-metrics` monitor, which uses the +non-deprecated `/stats/summary` endpoint.** + This monitor pulls cadvisor metrics through a Kubernetes kubelet instance via the `/stats/container` endpoint. @@ -145,9 +150,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. 
Metrics that are derived from specific configuration options that do not appear in the above list of @@ -156,20 +158,6 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - ## Dimensions The following dimensions may occur on metrics emitted by this monitor. 
Some diff --git a/signalfx-agent/agent_docs/monitors/kubernetes-apiserver.md b/signalfx-agent/agent_docs/monitors/kubernetes-apiserver.md index 13e1bb71c..5533d1c33 100644 --- a/signalfx-agent/agent_docs/monitors/kubernetes-apiserver.md +++ b/signalfx-agent/agent_docs/monitors/kubernetes-apiserver.md @@ -4,7 +4,7 @@ # kubernetes-apiserver -Monitor Type: `kubernetes-apiserver` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/kubernetes/apiserver)) +Monitor Type: `kubernetes-apiserver` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/kubernetes/apiserver)) **Accepts Endpoints**: **Yes** @@ -47,9 +47,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. | +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. 
(**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -171,7 +172,7 @@ the non-default metrics below can be turned on by adding `apiserver_request` to monitor config option `extraGroups`: - ***`apiserver_request_count`*** (*cumulative*)
(Deprecated) Counter of apiserver requests broken out for each verb, group, version, resource, scope, component, client, and HTTP response contentType and code. - `apiserver_request_duration_seconds` (*cumulative*)
Response latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope and component. (sum) - - ***`apiserver_request_duration_seconds_bucket`*** (*cumulative*)
Response latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope and component. (bucket) + - `apiserver_request_duration_seconds_bucket` (*cumulative*)
Response latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope and component. (bucket) - `apiserver_request_duration_seconds_count` (*cumulative*)
Response latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope and component. (count) - `apiserver_request_latencies` (*cumulative*)
(Deprecated) Response latency distribution in microseconds for each verb, group, version, resource, subresource, scope and component. (sum) - `apiserver_request_latencies_bucket` (*cumulative*)
(Deprecated) Response latency distribution in microseconds for each verb, group, version, resource, subresource, scope and component. (bucket) @@ -179,7 +180,7 @@ monitor config option `extraGroups`: - `apiserver_request_latencies_summary` (*cumulative*)
(Deprecated) Response latency summary in microseconds for each verb, group, version, resource, subresource, scope and component. (sum) - `apiserver_request_latencies_summary_count` (*cumulative*)
(Deprecated) Response latency summary in microseconds for each verb, group, version, resource, subresource, scope and component. (count) - `apiserver_request_latencies_summary_quantile` (*gauge*)
(Deprecated) Response latency summary in microseconds for each verb, group, version, resource, subresource, scope and component. (quantized) - - `apiserver_request_total` (*cumulative*)
Counter of apiserver requests broken out for each verb, dry run value, group, version, resource, scope, component, client, and HTTP response contentType and code. + - ***`apiserver_request_total`*** (*cumulative*)
Counter of apiserver requests broken out for each verb, dry run value, group, version, resource, scope, component, client, and HTTP response contentType and code. #### Group apiserver_response All of the following metrics are part of the `apiserver_response` metric group. All of @@ -457,9 +458,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -468,19 +466,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/kubernetes-cluster.md b/signalfx-agent/agent_docs/monitors/kubernetes-cluster.md index 159d7b3b6..ffc8a67b1 100644 --- a/signalfx-agent/agent_docs/monitors/kubernetes-cluster.md +++ b/signalfx-agent/agent_docs/monitors/kubernetes-cluster.md @@ -4,7 +4,7 @@ # kubernetes-cluster -Monitor Type: `kubernetes-cluster` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/kubernetes/cluster)) +Monitor Type: `kubernetes-cluster` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/kubernetes/cluster)) **Accepts Endpoints**: No @@ -55,7 +55,6 @@ Configuration](../monitor-config.html#common-configuration).** | --- | --- | --- | --- | | `alwaysClusterReporter` | no | `bool` | If `true`, leader election is skipped and metrics are always reported. (**default:** `false`) | | `namespace` | no | `string` | If specified, only resources within the given namespace will be monitored. If omitted (blank) all supported resources across all namespaces will be monitored. | -| `useNodeName` | no | `bool` | If set to true, the Kubernetes node name will be used as the dimension to which to sync properties about each respective node. This is necessary if your cluster's machines do not have unique machine-id values, as can happen when machine images are improperly cloned. (**default:** `false`) | | `kubernetesAPI` | no | `object (see below)` | Config for the K8s API client | | `nodeConditionTypesToReport` | no | `list of strings` | A list of node status condition types to report as metrics. The metrics will be reported as datapoints of the form `kubernetes.node_` with a value of `0` corresponding to "False", `1` to "True", and `-1` to "Unknown". 
(**default:** `[Ready]`) | @@ -78,28 +77,36 @@ Metrics that are categorized as [container/host](https://docs.splunk.com/observability/admin/subscription-usage/monitor-imm-billing-usage.html#about-custom-bundled-and-high-resolution-metrics) (*default*) are ***in bold and italics*** in the list below. +This monitor will also emit by default any metrics that are not listed below. - - `kubernetes.container_cpu_limit` (*gauge*)
Maximum CPU limit set for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. - - `kubernetes.container_cpu_request` (*gauge*)
CPU requested for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. - - `kubernetes.container_ephemeral_storage_limit` (*gauge*)
Maximum ephemeral storage set for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. See https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#local-ephemeral-storage for details. - - `kubernetes.container_ephemeral_storage_request` (*gauge*)
Ephemeral storage requested for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. See https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#local-ephemeral-storage for details - - `kubernetes.container_memory_limit` (*gauge*)
Maximum memory limit set for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. - - `kubernetes.container_memory_request` (*gauge*)
Memory requested for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. + + - ***`kubernetes.container_cpu_limit`*** (*gauge*)
Maximum CPU limit set for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. + - `kubernetes.container_cpu_request` (*gauge*)
CPU requested for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. + - `kubernetes.container_ephemeral_storage_limit` (*gauge*)
Maximum ephemeral storage set for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. See https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#local-ephemeral-storage for details. + - `kubernetes.container_ephemeral_storage_request` (*gauge*)
Ephemeral storage requested for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. See https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#local-ephemeral-storage for details + - ***`kubernetes.container_memory_limit`*** (*gauge*)
Maximum memory limit set for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. + - `kubernetes.container_memory_request` (*gauge*)
Memory requested for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. - ***`kubernetes.container_ready`*** (*gauge*)
Whether a container has passed its readiness probe (0 for no, 1 for yes) - - ***`kubernetes.container_restart_count`*** (*gauge*)
How many times the container has restarted in the recent past. This value is pulled directly from [the K8s API](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.11/#containerstatus-v1-core) and the value can go indefinitely high and be reset to 0 at any time depending on how your [kubelet is configured to prune dead containers](https://kubernetes.io/docs/concepts/cluster-administration/kubelet-garbage-collection/). It is best to not depend too much on the exact value but rather look at it as either `== 0`, in which case you can conclude there were no restarts in the recent past, or `> 0`, in which case you can conclude there were restarts in the recent past, and not try and analyze the value beyond that. + - ***`kubernetes.container_restart_count`*** (*gauge*)
How many times the container has restarted in the recent past. This value is pulled directly from [the K8s API](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#containerstatus-v1-core) and the value can go indefinitely high and be reset to 0 at any time depending on how your [kubelet is configured to prune dead containers](https://kubernetes.io/docs/concepts/cluster-administration/kubelet-garbage-collection/). It is best to not depend too much on the exact value but rather look at it as either `== 0`, in which case you can conclude there were no restarts in the recent past, or `> 0`, in which case you can conclude there were restarts in the recent past, and not try and analyze the value beyond that. - `kubernetes.cronjob.active` (*gauge*)
The number of actively running jobs for a cronjob. - ***`kubernetes.daemon_set.current_scheduled`*** (*gauge*)
The number of nodes that are running at least 1 daemon pod and are supposed to run the daemon pod - ***`kubernetes.daemon_set.desired_scheduled`*** (*gauge*)
The total number of nodes that should be running the daemon pod (including nodes currently running the daemon pod) - ***`kubernetes.daemon_set.misscheduled`*** (*gauge*)
The number of nodes that are running the daemon pod, but are not supposed to run the daemon pod - ***`kubernetes.daemon_set.ready`*** (*gauge*)
The number of nodes that should be running the daemon pod and have one or more of the daemon pod running and ready + - `kubernetes.daemon_set.updated` (*gauge*)
The total number of nodes that are running the updated daemon pod - ***`kubernetes.deployment.available`*** (*gauge*)
Total number of available pods (ready for at least minReadySeconds) targeted by this deployment. - ***`kubernetes.deployment.desired`*** (*gauge*)
Number of desired pods in this deployment + - `kubernetes.deployment.updated` (*gauge*)
Total number of non-terminated pods targeted by this deployment that have the desired template spec - `kubernetes.job.active` (*gauge*)
The number of actively running pods for a job. - `kubernetes.job.completions` (*gauge*)
The desired number of successfully finished pods the job should be run with. - - `kubernetes.job.failed` (*counter*)
The number of pods which reached phase Failed for a job. + - `kubernetes.job.failed` (*cumulative*)
The number of pods which reached phase Failed for a job. - `kubernetes.job.parallelism` (*gauge*)
The max desired number of pods the job should run at any given time. - - `kubernetes.job.succeeded` (*counter*)
The number of pods which reached phase Succeeded for a job. + - `kubernetes.job.succeeded` (*cumulative*)
The number of pods which reached phase Succeeded for a job. - ***`kubernetes.namespace_phase`*** (*gauge*)
The current phase of namespaces (`1` for _active_ and `0` for _terminating_) + - `kubernetes.node_allocatable_cpu` (*gauge*)
The number of CPU cores remaining that the node can allocate to pods + - `kubernetes.node_allocatable_ephemeral_storage` (*gauge*)
The number of bytes of ephemeral storage remaining that the node can allocate to pods + - `kubernetes.node_allocatable_memory` (*gauge*)
The number of bytes of RAM remaining that the node can allocate to pods + - `kubernetes.node_allocatable_storage` (*gauge*)
The number of bytes of storage remaining that the node can allocate to pods - ***`kubernetes.node_ready`*** (*gauge*)
Whether this node is ready (1), not ready (0) or in an unknown state (-1) - ***`kubernetes.pod_phase`*** (*gauge*)
Current phase of the pod (1 - Pending, 2 - Running, 3 - Succeeded, 4 - Failed, 5 - Unknown) - ***`kubernetes.replica_set.available`*** (*gauge*)
Total number of available pods (ready for at least minReadySeconds) targeted by this replica set @@ -133,9 +140,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -144,20 +148,6 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - ## Dimensions The following dimensions may occur on metrics emitted by this monitor. Some @@ -167,7 +157,7 @@ dimensions may be specific to certain metrics. 
| --- | --- | | `kubernetes_name` | The name of the resource that the metric describes | | `kubernetes_namespace` | The namespace of the resource that the metric describes | -| `kubernetes_node` | The name of the node, as defined by the `name` field of the node resource. | +| `kubernetes_node_uid` | The UID of the node, as defined by the `uid` field of the node resource. | | `kubernetes_pod_uid` | The UID of the pod that the metric describes | | `machine_id` | The machine ID from /etc/machine-id. This should be unique across all nodes in your cluster, but some cluster deployment tools don't guarantee this. This will not be sent if the `useNodeName` config option is set to true. | | `metric_source` | This is always set to `kubernetes` | @@ -182,7 +172,7 @@ are set on the dimension values of the dimension specified. | Name | Dimension | Description | | --- | --- | --- | -| `` | `machine_id/kubernetes_node` | All non-blank labels on a given node will be synced as properties to the `machine_id` or `kubernetes_node` dimension value for that node. Which dimension gets the properties is determined by the `useNodeName` config option. Any blank values will be synced as tags on that same dimension. | +| `` | `kubernetes_node_uid` | All non-blank labels on a given node will be synced as properties to the `kubernetes_node_uid` dimension value for that node. Any blank values will be synced as tags on that same dimension. | | `` | `kubernetes_pod_uid` | Any labels with non-blank values on the pod will be synced as properties to the `kubernetes_pod_uid` dimension. Any blank labels will be synced as tags on that same dimension. | | `container_status` | `container_id` | Status of the container such as `running`, `waiting` or `terminated` are synced to the `container_id` dimension. | | `container_status_reason` | `container_id` | Reason why a container is in a particular state. 
This property is synced to `container_id` only if the value of `container_status` is either `waiting` or `terminated`. | @@ -190,6 +180,7 @@ are set on the dimension values of the dimension specified. | `daemonset_creation_timestamp` | `kubernetes_uid` | Timestamp (in RFC3339 format) representing the server time when the daemon set was created and is in UTC. This property is synced onto `kubernetes_uid`. | | `deployment_creation_timestamp` | `kubernetes_uid` | Timestamp (in RFC3339 format) representing the server time when the deployment was created and is in UTC. This property is synced onto `kubernetes_uid`. | | `job_creation_timestamp` | `kubernetes_uid` | Timestamp (in RFC3339 format) representing the server time when the job was created and is in UTC. This property is synced onto `kubernetes_uid`. | +| `node_creation_timestamp` | `kubernetes_node_uid` | CreationTimestamp is a timestamp representing the server time when the node was created and is in UTC. This property is synced onto `kubernetes_node_uid`. | | `pod_creation_timestamp` | `kubernetes_pod_uid` | Timestamp (in RFC3339 format) representing the server time when the pod was created and is in UTC. This property is synced onto `kubernetes_pod_uid`. | | `replicaset_creation_timestamp` | `kubernetes_uid` | Timestamp (in RFC3339 format) representing the server time when the replica set was created and is in UTC. This property is synced onto `kubernetes_uid`. | | `statefulset_creation_timestamp` | `kubernetes_uid` | Timestamp (in RFC3339 format) representing the server time when the stateful set was created and is in UTC. This property is synced onto `kubernetes_uid`. 
| diff --git a/signalfx-agent/agent_docs/monitors/kubernetes-events.md b/signalfx-agent/agent_docs/monitors/kubernetes-events.md index 82e461784..976e8d790 100644 --- a/signalfx-agent/agent_docs/monitors/kubernetes-events.md +++ b/signalfx-agent/agent_docs/monitors/kubernetes-events.md @@ -4,7 +4,7 @@ # kubernetes-events -Monitor Type: `kubernetes-events` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/kubernetes/events)) +Monitor Type: `kubernetes-events` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/kubernetes/events)) **Accepts Endpoints**: No diff --git a/signalfx-agent/agent_docs/monitors/kubernetes-proxy.md b/signalfx-agent/agent_docs/monitors/kubernetes-proxy.md index 72ade2579..99753e4cd 100644 --- a/signalfx-agent/agent_docs/monitors/kubernetes-proxy.md +++ b/signalfx-agent/agent_docs/monitors/kubernetes-proxy.md @@ -4,7 +4,7 @@ # kubernetes-proxy -Monitor Type: `kubernetes-proxy` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/kubernetes/proxy)) +Monitor Type: `kubernetes-proxy` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/kubernetes/proxy)) **Accepts Endpoints**: **Yes** @@ -50,9 +50,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. 
| +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -144,9 +145,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -155,19 +153,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/kubernetes-scheduler.md b/signalfx-agent/agent_docs/monitors/kubernetes-scheduler.md index 2030281dc..99ee95ec0 100644 --- a/signalfx-agent/agent_docs/monitors/kubernetes-scheduler.md +++ b/signalfx-agent/agent_docs/monitors/kubernetes-scheduler.md @@ -4,7 +4,7 @@ # kubernetes-scheduler -Monitor Type: `kubernetes-scheduler` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/kubernetes/scheduler)) +Monitor Type: `kubernetes-scheduler` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/kubernetes/scheduler)) **Accepts Endpoints**: **Yes** @@ -35,9 +35,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. 
| +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -177,9 +178,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -188,19 +186,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/kubernetes-volumes.md b/signalfx-agent/agent_docs/monitors/kubernetes-volumes.md index ca89ff270..cdb3fdd2d 100644 --- a/signalfx-agent/agent_docs/monitors/kubernetes-volumes.md +++ b/signalfx-agent/agent_docs/monitors/kubernetes-volumes.md @@ -4,7 +4,7 @@ # kubernetes-volumes -Monitor Type: `kubernetes-volumes` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/kubernetes/volumes)) +Monitor Type: `kubernetes-volumes` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/kubernetes/volumes)) **Accepts Endpoints**: No @@ -14,15 +14,30 @@ Monitor Type: `kubernetes-volumes` ([Source](https://github.com/signalfx/signalf This monitor sends usage stats about volumes mounted to Kubernetes pods (e.g. free space/inodes). This information is -gotten from the Kubelet /stats/summary endpoint. The normal `collectd/df` +gotten from the Kubelet /stats/summary endpoint. The normal `filesystems` monitor generally will not report Persistent Volume usage metrics because those volumes are not seen by the agent since they can be mounted dynamically and older versions of K8s don't support mount propagation of those mounts to the agent container. Dimensions that identify the underlying volume source will be added for -`awsElasticBlockStore` and `glusterfs` volumes. Support for more can be -easily added as needed. 
+`awsElasticBlockStore`, `gcePersistentDisk` and `glusterfs` persistent +volumes, and for `configMap`, `downwardAPI`, `emptyDir` and `secret` +non-persistent volumes. Support for more can be easily added as needed. + +To collect metrics from Persistent Volumes and Persistent Volume Claims +in an RBAC-enabled cluster, the following permissions must be granted +to the Agent. + +```yaml +- apiGroups: + - "" + resources: + - persistentvolumes + - persistentvolumeclaims + verbs: + - get +``` ## Configuration @@ -80,8 +95,20 @@ Metrics that are categorized as - ***`kubernetes.volume_available_bytes`*** (*gauge*)
The number of available bytes in the volume - ***`kubernetes.volume_capacity_bytes`*** (*gauge*)
The total capacity in bytes of the volume -The agent does not do any built-in filtering of metrics coming out of this -monitor. + - `kubernetes.volume_inodes` (*gauge*)
The total inodes in the filesystem + - `kubernetes.volume_inodes_free` (*gauge*)
The free inodes in the filesystem + - `kubernetes.volume_inodes_used` (*gauge*)
The inodes used by the filesystem. This may not equal `inodes - free` because filesystem may share inodes with other filesystems. + +### Non-default metrics (version 4.7.0+) + +To emit metrics that are not _default_, you can add those metrics in the +generic monitor-level `extraMetrics` config option. Metrics that are derived +from specific configuration options that do not appear in the above list of +metrics do not need to be added to `extraMetrics`. + +To see a list of metrics that will be emitted you can run `agent-status +monitors` after configuring this monitor in a running agent instance. + ## Dimensions The following dimensions may occur on metrics emitted by this monitor. Some @@ -91,13 +118,15 @@ dimensions may be specific to certain metrics. | --- | --- | | `VolumeId` | (*EBS volumes only*) The EBS volume id of the underlying volume source | | `endpoints_name` | (*GlusterFS volumes only*) The endpoint name used for the GlusterFS volume | +| `fs_type` | (*EBS volumes and GCE persistent disks only*) The filesystem type of the underlying EBS volume or GCE persistent disk | | `glusterfs_path` | (*GlusterFS volumes only*) The GlusterFS volume path | | `kubernetes_namespace` | The namespace of the pod that has this volume | | `kubernetes_pod_name` | The name of the pod that has this volume | | `kubernetes_pod_uid` | The UID of the pod that has this volume | -| `partition` | (*EBS volumes only*) The partition number of the underlying EBS volume (`0` indicates the entire disk) | +| `partition` | (*EBS volumes and GCE persistent disks only*) The partition number of the underlying EBS volume or GCE persistent disk (`0` indicates the entire disk) | +| `pd_name` | (*GCE persistent disks only*) The GCE persistent disk name of the underlying volume source | | `volume` | The volume name as given in the pod spec under `volumes` | -| `volume_type` | The type of the underlying volume -- this will be the key used in the k8s volume config spec (e.g. 
awsElasticBlockStore, etc.) | +| `volume_type` | The type of the underlying volume -- this will be the key used in the k8s volume config spec (e.g. `awsElasticBlockStore`, `gcePersistentDisk`, `configMap`, `secret`, etc.) | diff --git a/signalfx-agent/agent_docs/monitors/load.md b/signalfx-agent/agent_docs/monitors/load.md index 9c509c43d..c0a8ce758 100644 --- a/signalfx-agent/agent_docs/monitors/load.md +++ b/signalfx-agent/agent_docs/monitors/load.md @@ -4,7 +4,7 @@ # load -Monitor Type: `load` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/load)) +Monitor Type: `load` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/load)) **Accepts Endpoints**: No @@ -56,9 +56,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -67,19 +64,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/logstash-tcp.md b/signalfx-agent/agent_docs/monitors/logstash-tcp.md index a96a7c340..53fe00446 100644 --- a/signalfx-agent/agent_docs/monitors/logstash-tcp.md +++ b/signalfx-agent/agent_docs/monitors/logstash-tcp.md @@ -4,7 +4,7 @@ # logstash-tcp -Monitor Type: `logstash-tcp` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/logstash/tcp)) +Monitor Type: `logstash-tcp` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/logstash/tcp)) **Accepts Endpoints**: **Yes** @@ -126,4 +126,7 @@ Configuration](../monitor-config.html#common-configuration).** +The agent does not do any built-in filtering of metrics coming out of this +monitor. 
+ diff --git a/signalfx-agent/agent_docs/monitors/logstash.md b/signalfx-agent/agent_docs/monitors/logstash.md index 52a4213c4..13da55b37 100644 --- a/signalfx-agent/agent_docs/monitors/logstash.md +++ b/signalfx-agent/agent_docs/monitors/logstash.md @@ -4,7 +4,7 @@ # logstash -Monitor Type: `logstash` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/logstash/logstash)) +Monitor Type: `logstash` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/logstash/logstash)) **Accepts Endpoints**: **Yes** @@ -174,9 +174,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -185,19 +182,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). 
Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/memory.md b/signalfx-agent/agent_docs/monitors/memory.md index 2c7f5a297..82e0f5ac2 100644 --- a/signalfx-agent/agent_docs/monitors/memory.md +++ b/signalfx-agent/agent_docs/monitors/memory.md @@ -4,7 +4,7 @@ # memory -Monitor Type: `memory` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/memory)) +Monitor Type: `memory` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/memory)) **Accepts Endpoints**: No @@ -55,14 +55,15 @@ Metrics that are categorized as - ***`memory.free`*** (*gauge*)
(Linux Only) Bytes of memory available for use. - ***`memory.slab_recl`*** (*gauge*)
(Linux Only) Bytes of memory, used for SLAB-allocation of kernel objects, that can be reclaimed. - ***`memory.slab_unrecl`*** (*gauge*)
(Linux Only) Bytes of memory, used for SLAB-allocation of kernel objects, that can't be reclaimed. + - `memory.swap_free` (*gauge*)
Bytes of swap memory available for use. + - `memory.swap_total` (*gauge*)
Total bytes of swap memory on the system. + - `memory.swap_used` (*gauge*)
Bytes of swap memory in use by the system. + - ***`memory.total`*** (*gauge*)
Total bytes of system memory on the system. - ***`memory.used`*** (*gauge*)
Bytes of memory in use by the system. - - ***`memory.utilization`*** (*gauge*)
Percent of memory in use on this host. This metric reports with plugin dimension set to "signalfx-metadata". + - ***`memory.utilization`*** (*gauge*)
Percent of memory in use on this host. This does NOT include buffer or cache memory on Linux. ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -71,19 +72,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/net-io.md b/signalfx-agent/agent_docs/monitors/net-io.md index 5267b3199..3a1ecf953 100644 --- a/signalfx-agent/agent_docs/monitors/net-io.md +++ b/signalfx-agent/agent_docs/monitors/net-io.md @@ -4,7 +4,7 @@ # net-io -Monitor Type: `net-io` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/netio)) +Monitor Type: `net-io` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/netio)) **Accepts Endpoints**: No @@ -53,19 +53,18 @@ Metrics that are categorized as (*default*) are ***in bold and italics*** in the list below. + - `if_dropped.rx` (*cumulative*)
Count of received packets dropped by the interface + - `if_dropped.tx` (*cumulative*)
Count of transmitted packets dropped by the interface - ***`if_errors.rx`*** (*cumulative*)
Count of receive errors on the interface - ***`if_errors.tx`*** (*cumulative*)
Count of transmit errors on the interface - ***`if_octets.rx`*** (*cumulative*)
Count of bytes (octets) received on the interface - ***`if_octets.tx`*** (*cumulative*)
Count of bytes (octets) transmitted by the interface - `if_packets.rx` (*cumulative*)
Count of packets received on the interface - `if_packets.tx` (*cumulative*)
Count of packets transmitted by the interface - - ***`network.total`*** (*cumulative*)
Total amount of inbound and outbound network traffic on this host, in bytes. This metric reports with plugin dimension set to "signalfx-metadata". + - ***`network.total`*** (*cumulative*)
Total amount of inbound and outbound network traffic on this host, in bytes. ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -74,19 +73,14 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) +## Dimensions -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** +The following dimensions may occur on metrics emitted by this monitor. Some +dimensions may be specific to certain metrics. -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. +| Name | Description | +| --- | --- | +| `interface` | The name of the network interface (e.g. 
`eth0`) | diff --git a/signalfx-agent/agent_docs/monitors/openshift-cluster.md b/signalfx-agent/agent_docs/monitors/openshift-cluster.md index a6d740620..1aa3db7c1 100644 --- a/signalfx-agent/agent_docs/monitors/openshift-cluster.md +++ b/signalfx-agent/agent_docs/monitors/openshift-cluster.md @@ -4,7 +4,7 @@ # openshift-cluster -Monitor Type: `openshift-cluster` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/kubernetes/cluster)) +Monitor Type: `openshift-cluster` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/kubernetes/cluster)) **Accepts Endpoints**: No @@ -56,7 +56,6 @@ Configuration](../monitor-config.html#common-configuration).** | --- | --- | --- | --- | | `alwaysClusterReporter` | no | `bool` | If `true`, leader election is skipped and metrics are always reported. (**default:** `false`) | | `namespace` | no | `string` | If specified, only resources within the given namespace will be monitored. If omitted (blank) all supported resources across all namespaces will be monitored. | -| `useNodeName` | no | `bool` | If set to true, the Kubernetes node name will be used as the dimension to which to sync properties about each respective node. This is necessary if your cluster's machines do not have unique machine-id values, as can happen when machine images are improperly cloned. (**default:** `false`) | | `kubernetesAPI` | no | `object (see below)` | Config for the K8s API client | | `nodeConditionTypesToReport` | no | `list of strings` | A list of node status condition types to report as metrics. The metrics will be reported as datapoints of the form `kubernetes.node_` with a value of `0` corresponding to "False", `1` to "True", and `-1` to "Unknown". 
(**default:** `[Ready]`) | @@ -79,28 +78,36 @@ Metrics that are categorized as [container/host](https://docs.splunk.com/observability/admin/subscription-usage/monitor-imm-billing-usage.html#about-custom-bundled-and-high-resolution-metrics) (*default*) are ***in bold and italics*** in the list below. +This monitor will also emit by default any metrics that are not listed below. - - `kubernetes.container_cpu_limit` (*gauge*)
Maximum CPU limit set for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. - - `kubernetes.container_cpu_request` (*gauge*)
CPU requested for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. - - `kubernetes.container_ephemeral_storage_limit` (*gauge*)
Maximum ephemeral storage set for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. See https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#local-ephemeral-storage for details. - - `kubernetes.container_ephemeral_storage_request` (*gauge*)
Ephemeral storage requested for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. See https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#local-ephemeral-storage for details - - `kubernetes.container_memory_limit` (*gauge*)
Maximum memory limit set for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. - - `kubernetes.container_memory_request` (*gauge*)
Memory requested for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.16/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. + + - ***`kubernetes.container_cpu_limit`*** (*gauge*)
Maximum CPU limit set for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. + - `kubernetes.container_cpu_request` (*gauge*)
CPU requested for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. + - `kubernetes.container_ephemeral_storage_limit` (*gauge*)
Maximum ephemeral storage set for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. See https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#local-ephemeral-storage for details. + - `kubernetes.container_ephemeral_storage_request` (*gauge*)
Ephemeral storage requested for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. See https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#local-ephemeral-storage for details + - ***`kubernetes.container_memory_limit`*** (*gauge*)
Maximum memory limit set for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. + - `kubernetes.container_memory_request` (*gauge*)
Memory requested for the container. This value is derived from https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#resourcerequirements-v1-core which comes from the pod spec and is reported only if a non null value is available. - ***`kubernetes.container_ready`*** (*gauge*)
Whether a container has passed its readiness probe (0 for no, 1 for yes) - - ***`kubernetes.container_restart_count`*** (*gauge*)
How many times the container has restarted in the recent past. This value is pulled directly from [the K8s API](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.11/#containerstatus-v1-core) and the value can go indefinitely high and be reset to 0 at any time depending on how your [kubelet is configured to prune dead containers](https://kubernetes.io/docs/concepts/cluster-administration/kubelet-garbage-collection/). It is best to not depend too much on the exact value but rather look at it as either `== 0`, in which case you can conclude there were no restarts in the recent past, or `> 0`, in which case you can conclude there were restarts in the recent past, and not try and analyze the value beyond that. + - ***`kubernetes.container_restart_count`*** (*gauge*)
How many times the container has restarted in the recent past. This value is pulled directly from [the K8s API](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#containerstatus-v1-core) and the value can go indefinitely high and be reset to 0 at any time depending on how your [kubelet is configured to prune dead containers](https://kubernetes.io/docs/concepts/cluster-administration/kubelet-garbage-collection/). It is best to not depend too much on the exact value but rather look at it as either `== 0`, in which case you can conclude there were no restarts in the recent past, or `> 0`, in which case you can conclude there were restarts in the recent past, and not try and analyze the value beyond that. - `kubernetes.cronjob.active` (*gauge*)
The number of actively running jobs for a cronjob. - ***`kubernetes.daemon_set.current_scheduled`*** (*gauge*)
The number of nodes that are running at least 1 daemon pod and are supposed to run the daemon pod - ***`kubernetes.daemon_set.desired_scheduled`*** (*gauge*)
The total number of nodes that should be running the daemon pod (including nodes currently running the daemon pod) - ***`kubernetes.daemon_set.misscheduled`*** (*gauge*)
The number of nodes that are running the daemon pod, but are not supposed to run the daemon pod - ***`kubernetes.daemon_set.ready`*** (*gauge*)
The number of nodes that should be running the daemon pod and have one or more of the daemon pod running and ready + - `kubernetes.daemon_set.updated` (*gauge*)
The total number of nodes that are running the updated daemon pod - ***`kubernetes.deployment.available`*** (*gauge*)
Total number of available pods (ready for at least minReadySeconds) targeted by this deployment. - ***`kubernetes.deployment.desired`*** (*gauge*)
Number of desired pods in this deployment + - `kubernetes.deployment.updated` (*gauge*)
Total number of non-terminated pods targeted by this deployment that have the desired template spec - `kubernetes.job.active` (*gauge*)
The number of actively running pods for a job. - `kubernetes.job.completions` (*gauge*)
The desired number of successfully finished pods the job should be run with. - - `kubernetes.job.failed` (*counter*)
The number of pods which reased phase Failed for a job. + - `kubernetes.job.failed` (*cumulative*)
The number of pods which reached phase Failed for a job. - `kubernetes.job.parallelism` (*gauge*)
The max desired number of pods the job should run at any given time. - - `kubernetes.job.succeeded` (*counter*)
The number of pods which reached phase Succeeded for a job. + - `kubernetes.job.succeeded` (*cumulative*)
The number of pods which reached phase Succeeded for a job. - ***`kubernetes.namespace_phase`*** (*gauge*)
The current phase of namespaces (`1` for _active_ and `0` for _terminating_) + - `kubernetes.node_allocatable_cpu` (*gauge*)
Number of CPU cores remaining that the node can allocate to pods + - `kubernetes.node_allocatable_ephemeral_storage` (*gauge*)
Number of bytes of ephemeral storage remaining that the node can allocate to pods + - `kubernetes.node_allocatable_memory` (*gauge*)
Number of bytes of memory remaining that the node can allocate to pods + - `kubernetes.node_allocatable_storage` (*gauge*)
Number of bytes of storage remaining that the node can allocate to pods - ***`kubernetes.node_ready`*** (*gauge*)
Whether this node is ready (1), not ready (0) or in an unknown state (-1) - ***`kubernetes.pod_phase`*** (*gauge*)
Current phase of the pod (1 - Pending, 2 - Running, 3 - Succeeded, 4 - Failed, 5 - Unknown) - ***`kubernetes.replica_set.available`*** (*gauge*)
Total number of available pods (ready for at least minReadySeconds) targeted by this replica set @@ -162,9 +169,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -173,20 +177,6 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - ## Dimensions The following dimensions may occur on metrics emitted by this monitor. Some @@ -196,7 +186,7 @@ dimensions may be specific to certain metrics. 
| --- | --- | | `kubernetes_name` | The name of the resource that the metric describes | | `kubernetes_namespace` | The namespace of the resource that the metric describes | -| `kubernetes_node` | The name of the node, as defined by the `name` field of the node resource. | +| `kubernetes_node_uid` | The UID of the node, as defined by the `uid` field of the node resource. | | `kubernetes_pod_uid` | The UID of the pod that the metric describes | | `machine_id` | The machine ID from /etc/machine-id. This should be unique across all nodes in your cluster, but some cluster deployment tools don't guarantee this. This will not be sent if the `useNodeName` config option is set to true. | | `metric_source` | This is always set to `openshift` | @@ -211,7 +201,7 @@ are set on the dimension values of the dimension specified. | Name | Dimension | Description | | --- | --- | --- | -| `` | `machine_id/kubernetes_node` | All non-blank labels on a given node will be synced as properties to the `machine_id` or `kubernetes_node` dimension value for that node. Which dimension gets the properties is determined by the `useNodeName` config option. Any blank values will be synced as tags on that same dimension. | +| `` | `kubernetes_node_uid` | All non-blank labels on a given node will be synced as properties to the `kubernetes_node_uid` dimension value for that node. Any blank values will be synced as tags on that same dimension. | | `` | `kubernetes_pod_uid` | Any labels with non-blank values on the pod will be synced as properties to the `kubernetes_pod_uid` dimension. Any blank labels will be synced as tags on that same dimension. | | `container_status` | `container_id` | Status of the container such as `running`, `waiting` or `terminated` are synced to the `container_id` dimension. | | `container_status_reason` | `container_id` | Reason why a container is in a particular state. 
This property is synced to `container_id` only if the value of `container_status` is either `waiting` or `terminated`. | @@ -219,6 +209,7 @@ are set on the dimension values of the dimension specified. | `daemonset_creation_timestamp` | `kubernetes_uid` | Timestamp (in RFC3339 format) representing the server time when the daemon set was created and is in UTC. This property is synced onto `kubernetes_uid`. | | `deployment_creation_timestamp` | `kubernetes_uid` | Timestamp (in RFC3339 format) representing the server time when the deployment was created and is in UTC. This property is synced onto `kubernetes_uid`. | | `job_creation_timestamp` | `kubernetes_uid` | Timestamp (in RFC3339 format) representing the server time when the job was created and is in UTC. This property is synced onto `kubernetes_uid`. | +| `node_creation_timestamp` | `kubernetes_node_uid` | CreationTimestamp is a timestamp representing the server time when the node was created and is in UTC. This property is synced onto `kubernetes_node_uid`. | | `pod_creation_timestamp` | `kubernetes_pod_uid` | Timestamp (in RFC3339 format) representing the server time when the pod was created and is in UTC. This property is synced onto `kubernetes_pod_uid`. | | `replicaset_creation_timestamp` | `kubernetes_uid` | Timestamp (in RFC3339 format) representing the server time when the replica set was created and is in UTC. This property is synced onto `kubernetes_uid`. | | `statefulset_creation_timestamp` | `kubernetes_uid` | Timestamp (in RFC3339 format) representing the server time when the stateful set was created and is in UTC. This property is synced onto `kubernetes_uid`. 
| diff --git a/signalfx-agent/agent_docs/monitors/postgresql.md b/signalfx-agent/agent_docs/monitors/postgresql.md index 4c560d4a7..f8eeef593 100644 --- a/signalfx-agent/agent_docs/monitors/postgresql.md +++ b/signalfx-agent/agent_docs/monitors/postgresql.md @@ -4,7 +4,7 @@ # postgresql -Monitor Type: `postgresql` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/postgresql)) +Monitor Type: `postgresql` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/postgresql)) **Accepts Endpoints**: **Yes** @@ -37,11 +37,21 @@ Tested with PostgreSQL `9.2+`. If you want to collect additional metrics about PostgreSQL, use the [sql monitor](./sql.md). +## Metrics about Replication + +Replication metrics might not be available on some PostgreSQL servers. For now, this monitor +automatically disables the `replication` metrics group if it detects Aurora, to avoid the following error: + +> Function pg_last_xlog_receive_location() is currently not supported for Aurora + +The metric `postgres_replication_state` will only be reported for `master` and +`postgres_replication_lag` only for `standby` role (replica). + ## Example Configuration This example uses the [Vault remote config -source](https://github.com/signalfx/signalfx-agent/blob/master/docs/remote-config.html#nested-values-vault-only) +source](https://github.com/signalfx/signalfx-agent/blob/main/docs/remote-config.html#nested-values-vault-only) to connect to PostgreSQL using the `params` map that allows you to pull out the username and password individually from Vault and interpolate them into the `connectionString` config option. @@ -95,6 +105,7 @@ Configuration](../monitor-config.html#common-configuration).** | `params` | no | `map of strings` | Parameters to the connection string that can be templated into the connection string with the syntax `{{.key}}`. | | `databases` | no | `list of strings` | List of databases to send database-specific metrics about. 
If omitted, metrics about all databases will be sent. This is an [overridable set](https://docs.splunk.com/observability/gdi/smart-agent/smart-agent-resources.html#filtering-data-using-the-smart-agent). (**default:** `[*]`) | | `databasePollIntervalSeconds` | no | `integer` | How frequently to poll for new/deleted databases in the DB server. Defaults to the same as `intervalSeconds` if not set. (**default:** `0`) | +| `logQueries` | no | `bool` | If true, queries will be logged at the info level. (**default:** `false`) | | `topQueryLimit` | no | `integer` | The number of top queries to consider when publishing query-related metrics (**default:** `10`) | @@ -107,10 +118,13 @@ Metrics that are categorized as - ***`postgres_block_hit_ratio`*** (*gauge*)
The proportion (between 0 and 1, inclusive) of block reads that used the cache and did not have to go to the disk. Is sent for `table`, `index`, and the `database` as a whole. + - `postgres_conflicts` (*cumulative*)
The number of conflicts. - ***`postgres_database_size`*** (*gauge*)
Size in bytes of the database on disk - ***`postgres_deadlocks`*** (*cumulative*)
Total number of deadlocks detected by the system - ***`postgres_index_scans`*** (*cumulative*)
Total number of index scans on the `table`. - ***`postgres_live_rows`*** (*gauge*)
Number of rows live (not deleted) in the `table`. + - `postgres_locks` (*gauge*)
The number of locks active. + - `postgres_pct_connections` (*gauge*)
The number of connections to this database as a fraction of the maximum number of allowed connections. - ***`postgres_query_count`*** (*cumulative*)
Total number of queries executed on the `database`, broken down by `user`. Note that the accuracy of this metric depends on the PostgreSQL [pg_stat_statements.max config option](https://www.postgresql.org/docs/9.3/pgstatstatements.html#AEN160631) being large enough to hold all queries. - ***`postgres_query_time`*** (*cumulative*)
Total time taken to execute queries on the `database`, broken down by `user`. Measured in ms unless otherwise indicated. @@ -118,9 +132,11 @@ Metrics that are categorized as - ***`postgres_rows_inserted`*** (*cumulative*)
Number of rows inserted into the `table`. - ***`postgres_rows_updated`*** (*cumulative*)
Number of rows updated in the `table`. - ***`postgres_sequential_scans`*** (*cumulative*)
Total number of sequential scans on the `table`. - - ***`postgres_sessions`*** (*gauge*)
Number of sessions currently on the server instance. The `state` dimension will specify which which type of session (see `state` row of [pg_stat_activity](https://www.postgresql.org/docs/9.2/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW)). + - ***`postgres_sessions`*** (*gauge*)
Number of sessions currently on the server instance. The `state` dimension will specify which type of session (see `state` row of [pg_stat_activity](https://www.postgresql.org/docs/9.2/monitoring-stats.html#PG-STAT-ACTIVITY-VIEW)). - ***`postgres_table_size`*** (*gauge*)
The size in bytes of the `table` on disk. + - `postgres_xact_commits` (*cumulative*)
The number of transactions that have been committed in this database. + - `postgres_xact_rollbacks` (*cumulative*)
The number of transactions that have been rolled back in this database. #### Group queries All of the following metrics are part of the `queries` metric group. All of @@ -130,10 +146,14 @@ monitor config option `extraGroups`: - `postgres_queries_calls` (*cumulative*)
Top N most frequently executed queries broken down by `database` - `postgres_queries_total_time` (*cumulative*)
Top N queries based on the total execution time broken down by `database` -### Non-default metrics (version 4.7.0+) +#### Group replication +All of the following metrics are part of the `replication` metric group. All of +the non-default metrics below can be turned on by adding `replication` to the +monitor config option `extraGroups`: + - `postgres_replication_lag` (*gauge*)
The current replication delay in seconds. Always 0 on master. + - `postgres_replication_state` (*gauge*)
The current replication state. -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** +### Non-default metrics (version 4.7.0+) To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived @@ -143,20 +163,6 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - ## Dimensions The following dimensions may occur on metrics emitted by this monitor. Some @@ -166,7 +172,10 @@ dimensions may be specific to certain metrics. | --- | --- | | `database` | The name of the database within a PostgreSQL server to which the metric pertains. | | `index` | For index metrics, the name of the index | +| `replication_role` | For "replication_lag" metric only, could be "master" or "standby". | | `schemaname` | The name of the schema within which the object being monitored resides (e.g. `public`). 
| +| `slot_name` | For "replication_state" metric only, the name of replication slot. | +| `slot_type` | For "replication_state" metric only, the type of replication. | | `table` | The name of the table to which the metric pertains. | | `tablespace` | For table metrics, the tablespace in which the table belongs, if not null. | | `type` | Whether the object (table, index, function, etc.) belongs to the `system` or `user`. | diff --git a/signalfx-agent/agent_docs/monitors/processlist.md b/signalfx-agent/agent_docs/monitors/processlist.md index 6f0412f6f..a12e5091f 100644 --- a/signalfx-agent/agent_docs/monitors/processlist.md +++ b/signalfx-agent/agent_docs/monitors/processlist.md @@ -4,7 +4,7 @@ # processlist -Monitor Type: `processlist` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/processlist)) +Monitor Type: `processlist` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/processlist)) **Accepts Endpoints**: No diff --git a/signalfx-agent/agent_docs/monitors/prometheus-exporter.md b/signalfx-agent/agent_docs/monitors/prometheus-exporter.md index 175f40359..b982695ee 100644 --- a/signalfx-agent/agent_docs/monitors/prometheus-exporter.md +++ b/signalfx-agent/agent_docs/monitors/prometheus-exporter.md @@ -4,7 +4,7 @@ # prometheus-exporter -Monitor Type: `prometheus-exporter` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/prometheusexporter)) +Monitor Type: `prometheus-exporter` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/prometheusexporter)) **Accepts Endpoints**: **Yes** @@ -16,8 +16,9 @@ This monitor reads metrics from a [Prometheus exporter](https://prometheus.io/docs/instrumenting/exporters/) endpoint. All metric types are supported. See -https://prometheus.io/docs/concepts/metric_types/ for a description of the -Prometheus metric types. 
The conversion happens as follows: +[Metric Types](https://prometheus.io/docs/concepts/metric_types/) for a +description of the Prometheus metric types. The conversion happens as +follows: - Gauges are converted directly to SignalFx gauges - Counters are converted directly to SignalFx cumulative counters @@ -101,9 +102,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. | +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. 
| | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | diff --git a/signalfx-agent/agent_docs/monitors/prometheus-go.md b/signalfx-agent/agent_docs/monitors/prometheus-go.md index d39fe2e2c..4650090aa 100644 --- a/signalfx-agent/agent_docs/monitors/prometheus-go.md +++ b/signalfx-agent/agent_docs/monitors/prometheus-go.md @@ -4,7 +4,7 @@ # prometheus/go -Monitor Type: `prometheus/go` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/prometheus/go)) +Monitor Type: `prometheus/go` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/prometheus/go)) **Accepts Endpoints**: **Yes** @@ -12,7 +12,7 @@ Monitor Type: `prometheus/go` ([Source](https://github.com/signalfx/signalfx-age ## Overview -This monitor scrapes [Prmoetheus Go +This monitor scrapes [Prometheus Go collector](https://godoc.org/github.com/prometheus/client_golang/prometheus#NewGoCollector) and [Prometheus process collector](https://godoc.org/github.com/prometheus/client_golang/prometheus#NewProcessCollector) @@ -41,9 +41,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. 
| +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -62,70 +63,44 @@ This monitor emits all metrics by default; however, **none are categorized as -- they are all custom**. - - `go_gc_duration_seconds` (*cumulative*)
A summary of the GC invocation durations - - `go_gc_duration_seconds_bucket` (*cumulative*)
A summary of the GC invocation durations - - `go_gc_duration_seconds_count` (*cumulative*)
A summary of the GC invocation durations - - `go_goroutines` (*gauge*)
Number of goroutines that currently exist - - `go_info` (*gauge*)
Information about the Go environment - - `go_memstats_alloc_bytes` (*gauge*)
Number of bytes allocated and still in use - - `go_memstats_alloc_bytes_total` (*cumulative*)
Total number of bytes allocated, even if freed - - `go_memstats_buck_hash_sys_bytes` (*gauge*)
Number of bytes used by the profiling bucket hash table - - `go_memstats_frees_total` (*cumulative*)
Total number of frees - - `go_memstats_gc_cpu_fraction` (*gauge*)
The fraction of this program's available CPU time used by the GC since the program started - - `go_memstats_gc_sys_bytes` (*gauge*)
Number of bytes used for garbage collection system metadata - - `go_memstats_heap_alloc_bytes` (*gauge*)
Number of heap bytes allocated and still in use - - `go_memstats_heap_idle_bytes` (*gauge*)
Number of heap bytes waiting to be used - - `go_memstats_heap_inuse_bytes` (*gauge*)
Number of heap bytes that are in use - - `go_memstats_heap_objects` (*gauge*)
Number of allocated objects - - `go_memstats_heap_released_bytes` (*gauge*)
Number of heap bytes released to OS - - `go_memstats_heap_sys_bytes` (*gauge*)
Number of heap bytes obtained from system - - `go_memstats_last_gc_time_seconds` (*gauge*)
Number of seconds since 1970 of last garbage collection - - `go_memstats_lookups_total` (*cumulative*)
Total number of pointer lookups - - `go_memstats_mallocs_total` (*cumulative*)
Total number of mallocs - - `go_memstats_mcache_inuse_bytes` (*gauge*)
Number of bytes in use by mcache structures - - `go_memstats_mcache_sys_bytes` (*gauge*)
Number of bytes used for mcache structures obtained from system - - `go_memstats_mspan_inuse_bytes` (*gauge*)
Number of bytes in use by mspan structures - - `go_memstats_mspan_sys_bytes` (*gauge*)
Number of bytes used for mspan structures obtained from system - - `go_memstats_next_gc_bytes` (*gauge*)
Number of heap bytes when next garbage collection will take place - - `go_memstats_other_sys_bytes` (*gauge*)
Number of bytes used for other system allocations - - `go_memstats_stack_inuse_bytes` (*gauge*)
Number of bytes in use by the stack allocator - - `go_memstats_stack_sys_bytes` (*gauge*)
Number of bytes obtained from system for stack allocator - - `go_memstats_sys_bytes` (*gauge*)
Number of bytes obtained from system - - `go_threads` (*gauge*)
Number of OS threads created - - `process_cpu_seconds_total` (*cumulative*)
Total user and system CPU time spent in seconds - - `process_max_fds` (*gauge*)
Maximum number of open file descriptors - - `process_open_fds` (*gauge*)
Number of open file descriptors - - `process_resident_memory_bytes` (*gauge*)
Resident memory size in bytes + - ***`go_gc_duration_seconds`*** (*cumulative*)
A summary of the GC invocation durations + - ***`go_gc_duration_seconds_bucket`*** (*cumulative*)
A summary of the GC invocation durations + - ***`go_gc_duration_seconds_count`*** (*cumulative*)
A summary of the GC invocation durations + - ***`go_goroutines`*** (*gauge*)
Number of goroutines that currently exist + - ***`go_info`*** (*gauge*)
Information about the Go environment + - ***`go_memstats_alloc_bytes`*** (*gauge*)
Number of bytes allocated and still in use + - ***`go_memstats_alloc_bytes_total`*** (*cumulative*)
Total number of bytes allocated, even if freed + - ***`go_memstats_buck_hash_sys_bytes`*** (*gauge*)
Number of bytes used by the profiling bucket hash table + - ***`go_memstats_frees_total`*** (*cumulative*)
Total number of frees + - ***`go_memstats_gc_cpu_fraction`*** (*gauge*)
The fraction of this program's available CPU time used by the GC since the program started + - ***`go_memstats_gc_sys_bytes`*** (*gauge*)
Number of bytes used for garbage collection system metadata + - ***`go_memstats_heap_alloc_bytes`*** (*gauge*)
Number of heap bytes allocated and still in use + - ***`go_memstats_heap_idle_bytes`*** (*gauge*)
Number of heap bytes waiting to be used + - ***`go_memstats_heap_inuse_bytes`*** (*gauge*)
Number of heap bytes that are in use + - ***`go_memstats_heap_objects`*** (*gauge*)
Number of allocated objects + - ***`go_memstats_heap_released_bytes`*** (*gauge*)
Number of heap bytes released to OS + - ***`go_memstats_heap_sys_bytes`*** (*gauge*)
Number of heap bytes obtained from system + - ***`go_memstats_last_gc_time_seconds`*** (*gauge*)
Number of seconds since 1970 of last garbage collection + - ***`go_memstats_lookups_total`*** (*cumulative*)
Total number of pointer lookups + - ***`go_memstats_mallocs_total`*** (*cumulative*)
Total number of mallocs + - ***`go_memstats_mcache_inuse_bytes`*** (*gauge*)
Number of bytes in use by mcache structures + - ***`go_memstats_mcache_sys_bytes`*** (*gauge*)
Number of bytes used for mcache structures obtained from system + - ***`go_memstats_mspan_inuse_bytes`*** (*gauge*)
Number of bytes in use by mspan structures + - ***`go_memstats_mspan_sys_bytes`*** (*gauge*)
Number of bytes used for mspan structures obtained from system + - ***`go_memstats_next_gc_bytes`*** (*gauge*)
Number of heap bytes when next garbage collection will take place + - ***`go_memstats_other_sys_bytes`*** (*gauge*)
Number of bytes used for other system allocations + - ***`go_memstats_stack_inuse_bytes`*** (*gauge*)
Number of bytes in use by the stack allocator + - ***`go_memstats_stack_sys_bytes`*** (*gauge*)
Number of bytes obtained from system for stack allocator + - ***`go_memstats_sys_bytes`*** (*gauge*)
Number of bytes obtained from system + - ***`go_threads`*** (*gauge*)
Number of OS threads created + - ***`process_cpu_seconds_total`*** (*cumulative*)
Total user and system CPU time spent in seconds + - ***`process_max_fds`*** (*gauge*)
Maximum number of open file descriptors + - ***`process_open_fds`*** (*gauge*)
Number of open file descriptors + - ***`process_resident_memory_bytes`*** (*gauge*)
Resident memory size in bytes - ***`process_start_time_seconds`*** (*gauge*)
Start time of the process since unix epoch in seconds - - `process_virtual_memory_bytes` (*gauge*)
Virtual memory size in bytes - - `process_virtual_memory_max_bytes` (*gauge*)
Maximum amount of virtual memory available in bytes - -### Non-default metrics (version 4.7.0+) - -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - -To emit metrics that are not _default_, you can add those metrics in the -generic monitor-level `extraMetrics` config option. Metrics that are derived -from specific configuration options that do not appear in the above list of -metrics do not need to be added to `extraMetrics`. - -To see a list of metrics that will be emitted you can run `agent-status -monitors` after configuring this monitor in a running agent instance. - -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - + - ***`process_virtual_memory_bytes`*** (*gauge*)
Virtual memory size in bytes + - ***`process_virtual_memory_max_bytes`*** (*gauge*)
Maximum amount of virtual memory available in bytes +The agent does not do any built-in filtering of metrics coming out of this +monitor. diff --git a/signalfx-agent/agent_docs/monitors/prometheus-nginx-vts.md b/signalfx-agent/agent_docs/monitors/prometheus-nginx-vts.md index b9e22c77f..6a8ab12a3 100644 --- a/signalfx-agent/agent_docs/monitors/prometheus-nginx-vts.md +++ b/signalfx-agent/agent_docs/monitors/prometheus-nginx-vts.md @@ -4,7 +4,7 @@ # prometheus/nginx-vts -Monitor Type: `prometheus/nginx-vts` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/prometheus/nginxvts)) +Monitor Type: `prometheus/nginx-vts` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/prometheus/nginxvts)) **Accepts Endpoints**: **Yes** @@ -12,7 +12,7 @@ Monitor Type: `prometheus/nginx-vts` ([Source](https://github.com/signalfx/signa ## Overview -This monitor scrapes [Prmoetheus Nginx VTS +This monitor scrapes [Prometheus Nginx VTS exporter](https://github.com/hnlq715/nginx-vts-exporter) metrics from a Prometheus exporter and sends them to SignalFx. It is a wrapper around the [prometheus-exporter](./prometheus-exporter.md) monitor that provides a @@ -39,9 +39,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. 
| +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -81,9 +82,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -92,19 +90,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/prometheus-node.md b/signalfx-agent/agent_docs/monitors/prometheus-node.md index c85232505..72b391bf5 100644 --- a/signalfx-agent/agent_docs/monitors/prometheus-node.md +++ b/signalfx-agent/agent_docs/monitors/prometheus-node.md @@ -4,7 +4,7 @@ # prometheus/node -Monitor Type: `prometheus/node` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/prometheus/node)) +Monitor Type: `prometheus/node` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/prometheus/node)) **Accepts Endpoints**: **Yes** @@ -12,7 +12,7 @@ Monitor Type: `prometheus/node` ([Source](https://github.com/signalfx/signalfx-a ## Overview -This monitor scrapes [Prmoetheus Node +This monitor scrapes [Prometheus Node Exporter](https://github.com/prometheus/node_exporter) metrics and sends them to SignalFx. It is a wrapper around the [prometheus-exporter](./prometheus-exporter.md) monitor that provides a @@ -39,9 +39,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. 
| -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. | +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -60,7 +61,6 @@ This monitor emits all metrics by default; however, **none are categorized as -- they are all custom**. - - ***`node_arp_entries`*** (*gauge*)
ARP entries by device - ***`node_boot_time_seconds`*** (*gauge*)
Node boot time, in unixtime - ***`node_context_switches_total`*** (*cumulative*)
Total number of context switches diff --git a/signalfx-agent/agent_docs/monitors/prometheus-postgres.md b/signalfx-agent/agent_docs/monitors/prometheus-postgres.md index 6bc7a90be..7a31bfcc3 100644 --- a/signalfx-agent/agent_docs/monitors/prometheus-postgres.md +++ b/signalfx-agent/agent_docs/monitors/prometheus-postgres.md @@ -4,7 +4,7 @@ # prometheus/postgres -Monitor Type: `prometheus/postgres` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/prometheus/postgres)) +Monitor Type: `prometheus/postgres` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/prometheus/postgres)) **Accepts Endpoints**: **Yes** @@ -12,7 +12,7 @@ Monitor Type: `prometheus/postgres` ([Source](https://github.com/signalfx/signal ## Overview -This monitor scrapes [Prmoetheus PostgreSQL Server +This monitor scrapes [Prometheus PostgreSQL Server Exporter](https://github.com/wrouesnel/postgres_exporter) metrics and sends them to SignalFx. It is a wrapper around the [prometheus-exporter](./prometheus-exporter.md) monitor that provides a @@ -39,9 +39,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. | +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. 
(**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -304,9 +305,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -315,19 +313,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/prometheus-prometheus.md b/signalfx-agent/agent_docs/monitors/prometheus-prometheus.md index 6ab51fc8a..56c45fde9 100644 --- a/signalfx-agent/agent_docs/monitors/prometheus-prometheus.md +++ b/signalfx-agent/agent_docs/monitors/prometheus-prometheus.md @@ -4,7 +4,7 @@ # prometheus/prometheus -Monitor Type: `prometheus/prometheus` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/prometheus/prometheus)) +Monitor Type: `prometheus/prometheus` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/prometheus/prometheus)) **Accepts Endpoints**: **Yes** @@ -12,7 +12,7 @@ Monitor Type: `prometheus/prometheus` ([Source](https://github.com/signalfx/sign ## Overview -This monitor scrapes [Prmoetheus server's own internal +This monitor scrapes [Prometheus server's own internal collector](https://prometheus.io/docs/prometheus/latest/getting_started/#configuring-prometheus-to-monitor-itself) metrics from a Prometheus exporter and sends them to SignalFx. It is a wrapper around the [prometheus-exporter](./prometheus-exporter.md) monitor @@ -39,9 +39,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. 
This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. | +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -60,7 +61,6 @@ This monitor emits all metrics by default; however, **none are categorized as -- they are all custom**. - - ***`net_conntrack_dialer_conn_attempted_total`*** (*cumulative*)
Total number of connections attempted by the given dialer a given name - ***`net_conntrack_dialer_conn_closed_total`*** (*cumulative*)
Total number of connections closed which originated from the dialer of a given name - ***`net_conntrack_dialer_conn_established_total`*** (*cumulative*)
Total number of connections successfully established by the given dialer a given name diff --git a/signalfx-agent/agent_docs/monitors/prometheus-redis.md b/signalfx-agent/agent_docs/monitors/prometheus-redis.md index 3638a5032..770bfafea 100644 --- a/signalfx-agent/agent_docs/monitors/prometheus-redis.md +++ b/signalfx-agent/agent_docs/monitors/prometheus-redis.md @@ -4,7 +4,7 @@ # prometheus/redis -Monitor Type: `prometheus/redis` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/prometheus/redis)) +Monitor Type: `prometheus/redis` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/prometheus/redis)) **Accepts Endpoints**: **Yes** @@ -12,7 +12,7 @@ Monitor Type: `prometheus/redis` ([Source](https://github.com/signalfx/signalfx- ## Overview -This monitor scrapes [Prmoetheus Redis +This monitor scrapes [Prometheus Redis Exporter](https://github.com/oliver006/redis_exporter) metrics and sends them to SignalFx. It is a wrapper around the [prometheus-exporter](./prometheus-exporter.md) monitor that provides a @@ -39,9 +39,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. | +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. 
(**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -60,7 +61,6 @@ This monitor emits all metrics by default; however, **none are categorized as -- they are all custom**. - - ***`redis_aof_current_rewrite_duration_sec`*** (*gauge*)
aof_current_rewrite_duration_sec metric - ***`redis_aof_enabled`*** (*gauge*)
aof_enabled metric - ***`redis_aof_last_rewrite_duration_sec`*** (*gauge*)
aof_last_rewrite_duration_sec metric diff --git a/signalfx-agent/agent_docs/monitors/python-monitor.md b/signalfx-agent/agent_docs/monitors/python-monitor.md index faea0c942..0b0f5d43d 100644 --- a/signalfx-agent/agent_docs/monitors/python-monitor.md +++ b/signalfx-agent/agent_docs/monitors/python-monitor.md @@ -4,7 +4,7 @@ # python-monitor -Monitor Type: `python-monitor` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/subproc/signalfx/python)) +Monitor Type: `python-monitor` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/subproc/signalfx/python)) **Accepts Endpoints**: **Yes** @@ -44,14 +44,14 @@ function should accept two parameters: `config` and `output`. The `run` function will be called on a regular interval, specified by the common `intervalSeconds` config option on the monitor config. -Here is [an example of a simple monitor](https://github.com/signalfx/signalfx-agent/tree/master/python/sample/monitor_simple.py). +Here is [an example of a simple monitor](https://github.com/signalfx/signalfx-agent/tree/main/python/sample/monitor_simple.py). ## Complex monitor If you need more power and flexibility in defining your monitor, you can use the complex monitor format. With this, you define a class called `Monitor` in a Python module. Here is [a documented example of a complex -monitor](https://github.com/signalfx/signalfx-agent/tree/master/python/sample/monitor_complex.py). +monitor](https://github.com/signalfx/signalfx-agent/tree/main/python/sample/monitor_complex.py). 
## Auto-discovery diff --git a/signalfx-agent/agent_docs/monitors/signalfx-forwarder.md b/signalfx-agent/agent_docs/monitors/signalfx-forwarder.md index 3cc0b9c5e..4d3c16561 100644 --- a/signalfx-agent/agent_docs/monitors/signalfx-forwarder.md +++ b/signalfx-agent/agent_docs/monitors/signalfx-forwarder.md @@ -4,7 +4,7 @@ # signalfx-forwarder -Monitor Type: `signalfx-forwarder` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/forwarder)) +Monitor Type: `signalfx-forwarder` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/forwarder)) **Accepts Endpoints**: No @@ -19,6 +19,9 @@ for datapoints (v2) and spans (v1) that our ingest server supports and at the same path (`/v2/datapoint`, `/v1/trace`). By default, the server listens on localhost port 9080 but can be configured to anything. +The `defaultSpanTagsFromEndpoint` and `extraSpanTagsFromEndpoint` config +options are not compatible with the `signalfx-forwarder` monitor. + ## Configuration diff --git a/signalfx-agent/agent_docs/monitors/sql.md b/signalfx-agent/agent_docs/monitors/sql.md index e81f40f7b..67b57b5e1 100644 --- a/signalfx-agent/agent_docs/monitors/sql.md +++ b/signalfx-agent/agent_docs/monitors/sql.md @@ -4,7 +4,7 @@ # sql -Monitor Type: `sql` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/sql)) +Monitor Type: `sql` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/sql)) **Accepts Endpoints**: **Yes** @@ -53,6 +53,60 @@ is the number of customers that belong to that combination of `country` and `status`. You could also specify multiple `metrics` items to generate more than one metric from a single query. +## Metric Expressions + +**Metric Expressions are a beta feature and may break in subsequent +non-major releases. 
The example documented will be maintained for backwards +compatibility, however.** + +If you need to do more complex logic than simply mapping columns to metric +values and dimensions, you can use the `datapointExpressions` option to the +individual metric configurations. This allows you to use the +[expr](https://github.com/antonmedv/expr/blob/master/docs/Language-Definition.md) +expression language to derive datapoints from individual rows using more +sophisticated logic. These expressions should evaluate to datapoints +created by the `GAUGE` or `CUMULATIVE` helper functions available in the +expression's context. You can also have the expression evaluate to `nil` +if no datapoint should be generated for a particular row. + +The signature for both the `GAUGE` and `CUMULATIVE` functions is +`(metricName, dimensions, value)`, where `metricName` should be a string +value, `dimensions` should be a map of string keys and values, and `value` +should be any numeric value. + +Each of the columns in the row is mapped to a variable in the context of +the expression with the same name. So if there was a column called `name` +in your SQL query result, there will be a variable called `name` that you +can use in the expression. Note that literal string values used in your +expressions must be surrounded by `"`. + +For example, the MySQL [SHOW SLAVE STATUS +query](https://dev.mysql.com/doc/refman/8.0/en/show-slave-status.html) +does not let you pre-process columns using SQL but let us say +you wanted to convert the `Slave_SQL_Running` column, which is a +string `Yes`/`No` value, to a gauge datapoint that has a value +of 0 or 1.
You can do that with the following configuration: + +```yaml + - type: sql + # Example discovery rule, your environment will probably be different + discoveryRule: container_labels["mysql.slave"] == "true" && port == 3306 + dbDriver: mysql + params: + user: root + password: password + connectionString: '{{.user}}:{{.password}}@tcp({{.host}})/mysql' + queries: + - query: 'SHOW SLAVE STATUS' + datapointExpressions: + - 'GAUGE("mysql.slave_sql_running", {master_uuid: Master_UUID, channel: Channel_name}, Slave_SQL_Running == "Yes" ? 1 : 0)' +``` + +This would generate a single gauge datapoint for each row in the slave +status output, with two dimensions, `master_uuid` and `channel`, and with a +value of 0 or 1 depending on whether the slave's SQL thread is running. + + ## Supported Drivers The `dbDriver` config option must specify the database driver to use. @@ -64,6 +118,7 @@ currently support and documentation on the connection string: - `postgres`: https://godoc.org/github.com/lib/pq#hdr-Connection_String_Parameters - `mysql`: https://github.com/go-sql-driver/mysql#dsn-data-source-name - `sqlserver`: https://github.com/denisenkom/go-mssqldb#connection-parameters-and-dsn + - `snowflake`: https://pkg.go.dev/github.com/snowflakedb/gosnowflake#hdr-Connection_Parameters ## Parameterized Connection String @@ -73,6 +128,30 @@ the `params` config option map. You interpolate variables into it with the Go template syntax `{{.varname}}` (see example config above). +## Snowflake Performance and Usage Metrics + +To configure the agent to collect Snowflake performance and usage metrics: +- Copy pkg/sql/snowflake-metrics.yaml from this repo into the same location as your agent.yaml file (for example, /etc/signalfx).
+- Configure the sql monitor as follows: +``` +monitors: + - type: sql + intervalSeconds: 3600 + dbDriver: snowflake + params: + account: "account.region" + database: "SNOWFLAKE" + schema: "ACCOUNT_USAGE" + role: "ACCOUNTADMIN" + user: "user" + password: "password" + connectionString: "{{.user}}:{{.password}}@{{.account}}/{{.database}}/{{.schema}}?role={{.role}}" + queries: + {"#from": "/etc/signalfx/snowflake-metrics.yaml"} +``` + +You can also cut/paste the contents of snowflake-metrics.yaml into agent.yaml under "queries" if needed or preferred. And you can edit snowflake-metrics.yaml to only include metrics you care about. + ## Configuration @@ -94,8 +173,8 @@ Configuration](../monitor-config.html#common-configuration).** | `host` | no | `string` | | | `port` | no | `integer` | (**default:** `0`) | | `params` | no | `map of strings` | Parameters to the connectionString that can be templated into that option using Go template syntax (e.g. `{{.key}}`). | -| `dbDriver` | no | `string` | The database driver to use, valid values are `postgres`, `mysql` and `sqlserver`. | -| `connectionString` | no | `string` | A URL or simple option string used to connect to the database. If using PostgreSQL, [see the list of connection string params](https://godoc.org/github.com/lib/pq#hdr-Connection_String_Parameters). | +| `dbDriver` | no | `string` | The database driver to use, valid values are `postgres`, `mysql`, `sqlserver`, and `snowflake`. | +| `connectionString` | no | `string` | A URL or simple option string used to connect to the database. For example, if using PostgreSQL, [see the list of connection string params](https://godoc.org/github.com/lib/pq#hdr-Connection_String_Parameters). | | `queries` | **yes** | `list of objects (see below)` | A list of queries to make against the database that are used to generate datapoints. | | `logQueries` | no | `bool` | If true, query results will be logged at the info level. 
(**default:** `false`) | @@ -107,6 +186,7 @@ The **nested** `queries` config object has the following fields: | `query` | **yes** | `string` | A SQL query text that selects one or more rows from a database | | `params` | no | `list of any` | Optional parameters that will replace placeholders in the query string. | | `metrics` | no | `list of objects (see below)` | Metrics that should be generated from the query. | +| `datapointExpressions` | no | `list of strings` | A set of [expr] expressions that will be used to convert each row to a set of metrics. Each of these will be run for each row in the query result set, allowing you to generate multiple datapoints per row. Each expression should evaluate to a single datapoint or nil. | The **nested** `metrics` config object has the following fields: diff --git a/signalfx-agent/agent_docs/monitors/statsd.md b/signalfx-agent/agent_docs/monitors/statsd.md index 7bd92a65b..01d43c93d 100644 --- a/signalfx-agent/agent_docs/monitors/statsd.md +++ b/signalfx-agent/agent_docs/monitors/statsd.md @@ -4,7 +4,7 @@ # statsd -Monitor Type: `statsd` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/statsd)) +Monitor Type: `statsd` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/statsd)) **Accepts Endpoints**: No @@ -13,19 +13,19 @@ Monitor Type: `statsd` ([Source](https://github.com/signalfx/signalfx-agent/tree ## Overview This monitor will receive and aggergate Statsd metrics and convert them to -datapoints. It listens on a configured address and port in order to +data points. It listens on a configured address and port in order to receive the statsd metrics. Note that this monitor does not support statsd extensions such as tags. 
-The monitor supports the `Counter`, `Timer`, `Gauge` and `Set` types which -are dispatched as the SignalFx types `counter`, `gauge`, `gauge` and +The monitor supports the `Counter`, `Timer`, `Gauge`, and `Set` types, which +are dispatched as the SignalFx types `counter`, `gauge`, `gauge`, and `gauge` respectively. -**Note that datapoints will get a `host` dimension of the current host that +**Note:** Data points get a `host` dimension of the current host that the agent is running on, not the host from which the statsd metric was sent. For this reason, it is recommended to send statsd metrics to a local agent instance. If you don't want the `host` dimension, you can set -`disableHostDimensions: true` on the monitor configuration** +`disableHostDimensions: true` on the monitor configuration. #### Verifying installation @@ -37,11 +37,14 @@ in SignalFx that the metric arrived (assuming the default config). $ echo "statsd.test:1|g" | nc -w 1 -u 127.0.0.1 8125 ``` +For Kubernetes environments, use the `status.hostIP` environment variable to verify the installation. This environment variable +is the IP address of the node where the pod is running. See [Expose Pod Information to Containers Through Files](https://kubernetes.io/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information/). + #### Adding dimensions to StatsD metrics The StatsD monitor can parse keywords from a statsd metric name by a set of -converters that was configured by user. +converters previously configured by the user. ``` converters: @@ -49,11 +52,11 @@ converters: ... ``` -This converter will parse `traffic`, `mesh`, `service` and `action` as dimensions +This parses `traffic`, `mesh`, `service`, and `action` as dimensions from a metric name `cluster.cds_egress_ecommerce-demo-mesh_gateway-vn_tcp_8080.update_success`. -If a section has only a pair of brackets without a name, it will not capture a dimension. 
+If a section has only a pair of brackets without a name, it does not capture a dimension. -When multiple converters were provided, a metric will be converted by the first converter with a +If multiple converters are provided, a metric is converted by the first converter with a matching pattern to the metric name. @@ -67,11 +70,11 @@ converters: metricName: "{traffic}.{action}" ``` -The metrics which match to the given pattern will be reported to SignalFx as `{traffic}.{action}`. -For instance, metric `cluster.cds_egress_ecommerce-demo-mesh_gateway-vn_tcp_8080.update_success` -will be reported as `egress.update_success`. +Metrics that match the given pattern are reported to SignalFx as `{traffic}.{action}`. +For instance, the metric name `cluster.cds_egress_ecommerce-demo-mesh_gateway-vn_tcp_8080.update_success` +is reported as `egress.update_success`. -`metricName` is required for a converter configuration. A converter will be +`metricName` is required for a converter configuration. A converter is disabled if `metricName` is not provided.
diff --git a/signalfx-agent/agent_docs/monitors/telegraf-logparser.md b/signalfx-agent/agent_docs/monitors/telegraf-logparser.md index fe9e9d5ab..ddb45dd16 100644 --- a/signalfx-agent/agent_docs/monitors/telegraf-logparser.md +++ b/signalfx-agent/agent_docs/monitors/telegraf-logparser.md @@ -4,7 +4,7 @@ # telegraf/logparser -Monitor Type: `telegraf/logparser` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/telegraf/monitors/telegraflogparser)) +Monitor Type: `telegraf/logparser` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/telegraf/monitors/telegraflogparser)) **Accepts Endpoints**: No diff --git a/signalfx-agent/agent_docs/monitors/telegraf-procstat.md b/signalfx-agent/agent_docs/monitors/telegraf-procstat.md index e97ae4116..3546970b1 100644 --- a/signalfx-agent/agent_docs/monitors/telegraf-procstat.md +++ b/signalfx-agent/agent_docs/monitors/telegraf-procstat.md @@ -4,7 +4,7 @@ # telegraf/procstat -Monitor Type: `telegraf/procstat` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/telegraf/monitors/procstat)) +Monitor Type: `telegraf/procstat` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/telegraf/monitors/procstat)) **Accepts Endpoints**: No @@ -51,12 +51,13 @@ Configuration](../monitor-config.html#common-configuration).** | Config option | Required | Type | Description | | --- | --- | --- | --- | | `exe` | no | `string` | The name of an executable to monitor. (ie: `exe: "signalfx-agent*"`) | -| `pattern` | no | `string` | Pattern to match against. On Windows the pattern should be in the form of a WMI query. (ie: `pattern: "%signalfx-agent%"`) | +| `pattern` | no | `string` | Regular expression pattern to match against. | | `user` | no | `string` | Username to match against | | `pidFile` | no | `string` | Path to Pid file to monitor. 
(ie: `pidFile: "/var/run/signalfx-agent.pid"`) | | `processName` | no | `string` | Used to override the process name dimension | | `prefix` | no | `string` | Prefix to be added to each dimension | | `pidTag` | no | `bool` | Whether to add PID as a dimension instead of part of the metric name (**default:** `false`) | +| `cmdLineTag` | no | `bool` | When true, add the full cmdline as a dimension. (**default:** `false`) | | `cGroup` | no | `string` | The name of the cgroup to monitor. This cgroup name will be appended to the configured `sysPath`. See the agent config schema for more information about the `sysPath` agent configuration. | | `WinService` | no | `string` | The name of a windows service to report procstat information on. | @@ -69,15 +70,15 @@ This monitor emits all metrics by default; however, **none are categorized as -- they are all custom**. - - ***`procstat.cpu_time`*** (*gauge*)
Amount of cpu time consumed by the process. - - ***`procstat.cpu_usage`*** (*gauge*)
CPU used by the process. + - ***`procstat.cpu_usage`*** (*gauge*)
CPU percentage used by the process. - ***`procstat.involuntary_context_switches`*** (*gauge*)
Number of involuntary context switches. - ***`procstat.memory_data`*** (*gauge*)
VMData memory used by the process. - ***`procstat.memory_locked`*** (*gauge*)
VMLocked memory used by the process. - ***`procstat.memory_rss`*** (*gauge*)
VMRSS memory used by the process. - ***`procstat.memory_stack`*** (*gauge*)
VMStack memory used by the process. - ***`procstat.memory_swap`*** (*gauge*)
VMSwap memory used by the process. + - ***`procstat.memory_usage`*** (*gauge*)
Memory percentage used by the process. - ***`procstat.memory_vms`*** (*gauge*)
VMS memory used by the process. - ***`procstat.nice_priority`*** (*gauge*)
Nice priority number of the process. - ***`procstat.num_fds`*** (*gauge*)
Number of file descriptors. This may require the agent to be running as root. diff --git a/signalfx-agent/agent_docs/monitors/telegraf-snmp.md b/signalfx-agent/agent_docs/monitors/telegraf-snmp.md index 07e4e6d7f..25a640ead 100644 --- a/signalfx-agent/agent_docs/monitors/telegraf-snmp.md +++ b/signalfx-agent/agent_docs/monitors/telegraf-snmp.md @@ -4,7 +4,7 @@ # telegraf/snmp -Monitor Type: `telegraf/snmp` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/telegraf/monitors/telegrafsnmp)) +Monitor Type: `telegraf/snmp` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/telegraf/monitors/telegrafsnmp)) **Accepts Endpoints**: **Yes** diff --git a/signalfx-agent/agent_docs/monitors/telegraf-sqlserver.md b/signalfx-agent/agent_docs/monitors/telegraf-sqlserver.md index f554c3b5e..b3868891d 100644 --- a/signalfx-agent/agent_docs/monitors/telegraf-sqlserver.md +++ b/signalfx-agent/agent_docs/monitors/telegraf-sqlserver.md @@ -4,7 +4,7 @@ # telegraf/sqlserver -Monitor Type: `telegraf/sqlserver` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/telegraf/monitors/mssqlserver)) +Monitor Type: `telegraf/sqlserver` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/telegraf/monitors/mssqlserver)) **Accepts Endpoints**: **Yes** @@ -219,9 +219,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -230,19 +227,5 @@ metrics do not need to be added to `extraMetrics`. 
To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/telegraf-statsd.md b/signalfx-agent/agent_docs/monitors/telegraf-statsd.md index 15123be07..5527ee2b1 100644 --- a/signalfx-agent/agent_docs/monitors/telegraf-statsd.md +++ b/signalfx-agent/agent_docs/monitors/telegraf-statsd.md @@ -4,7 +4,7 @@ # telegraf/statsd -Monitor Type: `telegraf/statsd` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/telegraf/monitors/telegrafstatsd)) +Monitor Type: `telegraf/statsd` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/telegraf/monitors/telegrafstatsd)) **Accepts Endpoints**: No diff --git a/signalfx-agent/agent_docs/monitors/telegraf-tail.md b/signalfx-agent/agent_docs/monitors/telegraf-tail.md index 942a05fb7..f75010377 100644 --- a/signalfx-agent/agent_docs/monitors/telegraf-tail.md +++ b/signalfx-agent/agent_docs/monitors/telegraf-tail.md @@ -4,7 +4,7 @@ # telegraf/tail -Monitor Type: `telegraf/tail` 
([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/telegraf/monitors/tail)) +Monitor Type: `telegraf/tail` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/telegraf/monitors/tail)) **Accepts Endpoints**: No diff --git a/signalfx-agent/agent_docs/monitors/telegraf-win_perf_counters.md b/signalfx-agent/agent_docs/monitors/telegraf-win_perf_counters.md index 7f7d809d6..ef1e6f4b1 100644 --- a/signalfx-agent/agent_docs/monitors/telegraf-win_perf_counters.md +++ b/signalfx-agent/agent_docs/monitors/telegraf-win_perf_counters.md @@ -4,7 +4,7 @@ # telegraf/win_perf_counters -Monitor Type: `telegraf/win_perf_counters` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/telegraf/monitors/winperfcounters)) +Monitor Type: `telegraf/win_perf_counters` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/telegraf/monitors/winperfcounters)) **Accepts Endpoints**: No diff --git a/signalfx-agent/agent_docs/monitors/telegraf-win_services.md b/signalfx-agent/agent_docs/monitors/telegraf-win_services.md index 0e99674f0..2c146d81f 100644 --- a/signalfx-agent/agent_docs/monitors/telegraf-win_services.md +++ b/signalfx-agent/agent_docs/monitors/telegraf-win_services.md @@ -4,7 +4,7 @@ # telegraf/win_services -Monitor Type: `telegraf/win_services` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/telegraf/monitors/winservices)) +Monitor Type: `telegraf/win_services` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/telegraf/monitors/winservices)) **Accepts Endpoints**: No @@ -59,7 +59,6 @@ This monitor emits all metrics by default; however, **none are categorized as -- they are all custom**. - - ***`win_services.startup_mode`*** (*gauge*)
The configured start up mode of the window windows service. Possible values are: `0` (Boot Start), `1` (System Start), `2` (Auto Start), `3` (Demand Start), `4` (disabled). - ***`win_services.state`*** (*gauge*)
The state of the windows service. Possible values are: `1` (Stopped), `2` (Start Pending), `3` (Stop Pending), `4` (Running), `5` (Continue Pending), `6` (Pause Pending), and `7` (Paused). The agent does not do any built-in filtering of metrics coming out of this diff --git a/signalfx-agent/agent_docs/monitors/trace-forwarder.md b/signalfx-agent/agent_docs/monitors/trace-forwarder.md index 2e4df8903..f8f5eda3f 100644 --- a/signalfx-agent/agent_docs/monitors/trace-forwarder.md +++ b/signalfx-agent/agent_docs/monitors/trace-forwarder.md @@ -4,7 +4,7 @@ # trace-forwarder -Monitor Type: `trace-forwarder` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/traceforwarder)) +Monitor Type: `trace-forwarder` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/traceforwarder)) **Accepts Endpoints**: No diff --git a/signalfx-agent/agent_docs/monitors/traefik.md b/signalfx-agent/agent_docs/monitors/traefik.md index 1d8ef9a47..461e21e98 100644 --- a/signalfx-agent/agent_docs/monitors/traefik.md +++ b/signalfx-agent/agent_docs/monitors/traefik.md @@ -4,7 +4,7 @@ # traefik -Monitor Type: `traefik` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/traefik)) +Monitor Type: `traefik` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/traefik)) **Accepts Endpoints**: **Yes** @@ -51,7 +51,7 @@ See here for complete Tra SignalFx Smart Agent docs can be found here. Choose deployment specific configuration instruction -here. The +here. The SignalFx Smart Agent must have network access to Traefik. Below is an example configuration that enables the traefik monitor. For the given configuration below, the monitor @@ -87,9 +87,10 @@ Configuration](../monitor-config.html#common-configuration).** | `httpTimeout` | no | `int64` | HTTP timeout duration for both read and writes. 
This should be a duration string that is accepted by https://golang.org/pkg/time/#ParseDuration (**default:** `10s`) | | `username` | no | `string` | Basic Auth username to use on each request, if any. | | `password` | no | `string` | Basic Auth password to use on each request, if any. | -| `useHTTPS` | no | `bool` | If true, the agent will connect to the exporter using HTTPS instead of plain HTTP. (**default:** `false`) | -| `httpHeaders` | no | `map of strings` | A map of key=message-header and value=header-value. Comma separated multiple values for the same message-header is supported. | +| `useHTTPS` | no | `bool` | If true, the agent will connect to the server using HTTPS instead of plain HTTP. (**default:** `false`) | +| `httpHeaders` | no | `map of strings` | A map of HTTP header names to values. Comma separated multiple values for the same message-header is supported. | | `skipVerify` | no | `bool` | If useHTTPS is true and this option is also true, the exporter's TLS cert will not be verified. (**default:** `false`) | +| `sniServerName` | no | `string` | If useHTTPS is true and skipVerify is true, the sniServerName is used to verify the hostname on the returned certificates. It is also included in the client's handshake to support virtual hosting unless it is an IP address. | | `caCertPath` | no | `string` | Path to the CA cert that has signed the TLS cert, unnecessary if `skipVerify` is set to false. | | `clientCertPath` | no | `string` | Path to the client TLS cert to use for TLS required connections | | `clientKeyPath` | no | `string` | Path to the client TLS key to use for TLS required connections | @@ -141,7 +142,7 @@ Metrics that are categorized as - `process_max_fds` (*gauge*)
Maximum number of open file descriptors. - `process_open_fds` (*gauge*)
Number of open file descriptors. - `process_resident_memory_bytes` (*gauge*)
Resident memory size in bytes. - - ***`process_start_time_seconds`*** (*gauge*)
Start time of the process since unix epoch in seconds. + - `process_start_time_seconds` (*gauge*)
Start time of the process since unix epoch in seconds. - `process_virtual_memory_bytes` (*gauge*)
Virtual memory size in bytes. - ***`traefik_backend_open_connections`*** (*gauge*)
How many open connections exist on a backend, partitioned by method and protocol. - `traefik_backend_request_duration_seconds_bucket` (*cumulative*)
The sum of request durations that are within a configured time interval. The request durations are measured at a backend in seconds. This value is partitioned by status code, protocol, and method. @@ -161,9 +162,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -172,19 +170,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
- diff --git a/signalfx-agent/agent_docs/monitors/vmem.md b/signalfx-agent/agent_docs/monitors/vmem.md index b9fc8ac7f..c6226176a 100644 --- a/signalfx-agent/agent_docs/monitors/vmem.md +++ b/signalfx-agent/agent_docs/monitors/vmem.md @@ -4,7 +4,7 @@ # vmem -Monitor Type: `vmem` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/vmem)) +Monitor Type: `vmem` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/vmem)) **Accepts Endpoints**: No @@ -12,8 +12,8 @@ Monitor Type: `vmem` ([Source](https://github.com/signalfx/signalfx-agent/tree/m ## Overview -Collects information about the virtual memory -subsystem of the kernel. +Collects information specific to the virtual memory subsystem of the +kernel. For general memory statistics, see the [memory monitor](./memory.md). On Linux hosts, this monitor relies on the `/proc` filesystem. If the underlying host's `/proc` file system is mounted somewhere other than @@ -70,9 +70,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -81,19 +78,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/vsphere.md b/signalfx-agent/agent_docs/monitors/vsphere.md index 198c26b8c..5b133e02b 100644 --- a/signalfx-agent/agent_docs/monitors/vsphere.md +++ b/signalfx-agent/agent_docs/monitors/vsphere.md @@ -4,7 +4,7 @@ # vsphere -Monitor Type: `vsphere` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/vsphere)) +Monitor Type: `vsphere` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/vsphere)) **Accepts Endpoints**: **Yes** @@ -29,8 +29,8 @@ By default, this refresh takes place every 60 seconds; however, this interval ca `InventoryRefreshInterval`. Compatibility: -This monitor uses VMware's govmomi SDK, which officially supports vCenter 6.0, 6.5 and 6.7. -While this monitor may work with vCenter 5.5 and 5.1, these versions are not officially supported. +This monitor uses VMware's govmomi SDK, which officially supports vCenter 6.5, 6.7, and 7.0. +While this monitor may work with vCenter 5.1, 5.5, and 6.0, these versions are not officially supported. 
Sample YAML configuration: ```yaml @@ -343,9 +343,6 @@ monitor config option `extraGroups`: ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -354,20 +351,6 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - ## Dimensions The following dimensions may occur on metrics emitted by this monitor. 
Some diff --git a/signalfx-agent/agent_docs/monitors/windows-iis.md b/signalfx-agent/agent_docs/monitors/windows-iis.md index e25c6baec..d9fcafe5a 100644 --- a/signalfx-agent/agent_docs/monitors/windows-iis.md +++ b/signalfx-agent/agent_docs/monitors/windows-iis.md @@ -4,7 +4,7 @@ # windows-iis -Monitor Type: `windows-iis` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/windowsiis)) +Monitor Type: `windows-iis` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/windowsiis)) **Accepts Endpoints**: No @@ -86,9 +86,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -97,19 +94,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). 
Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/monitors/windows-legacy.md b/signalfx-agent/agent_docs/monitors/windows-legacy.md index da0f9ba4d..0d1fc4c81 100644 --- a/signalfx-agent/agent_docs/monitors/windows-legacy.md +++ b/signalfx-agent/agent_docs/monitors/windows-legacy.md @@ -4,7 +4,7 @@ # windows-legacy -Monitor Type: `windows-legacy` ([Source](https://github.com/signalfx/signalfx-agent/tree/master/pkg/monitors/windowslegacy)) +Monitor Type: `windows-legacy` ([Source](https://github.com/signalfx/signalfx-agent/tree/main/pkg/monitors/windowslegacy)) **Accepts Endpoints**: No @@ -94,9 +94,6 @@ Metrics that are categorized as ### Non-default metrics (version 4.7.0+) -**The following information applies to the agent version 4.7.0+ that has -`enableBuiltInFiltering: true` set on the top level of the agent config.** - To emit metrics that are not _default_, you can add those metrics in the generic monitor-level `extraMetrics` config option. Metrics that are derived from specific configuration options that do not appear in the above list of @@ -105,19 +102,5 @@ metrics do not need to be added to `extraMetrics`. To see a list of metrics that will be emitted you can run `agent-status monitors` after configuring this monitor in a running agent instance. -### Legacy non-default metrics (version < 4.7.0) - -**The following information only applies to agent version older than 4.7.0. If -you have a newer agent and have set `enableBuiltInFiltering: true` at the top -level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](../legacy-filtering.html#old-style-whitelist-filtering).** - -If you have a reference to the `whitelist.json` in your agent's top-level -`metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](../legacy-filtering.html#inclusion-filtering). Or you can just -copy the whitelist.json, modify it, and reference that in `metricsToExclude`. - diff --git a/signalfx-agent/agent_docs/observers/docker.md b/signalfx-agent/agent_docs/observers/docker.md index 09260c781..f81ed4058 100644 --- a/signalfx-agent/agent_docs/observers/docker.md +++ b/signalfx-agent/agent_docs/observers/docker.md @@ -107,13 +107,14 @@ Would result in the `app` endpoint getting an interval of 1 second and the Observer Type: `docker` -[Observer Source Code](https://github.com/signalfx/signalfx-agent/tree/master/pkg/observers/docker) +[Observer Source Code](https://github.com/signalfx/signalfx-agent/tree/main/pkg/observers/docker) ## Configuration | Config option | Required | Type | Description | | --- | --- | --- | --- | | `dockerURL` | no | `string` | (**default:** `unix:///var/run/docker.sock`) | +| `cacheSyncInterval` | no | `int64` | The time to wait before resyncing the list of containers the monitor maintains through the docker event listener example: cacheSyncInterval: "20m" (**default:** `60m`) | | `labelsToDimensions` | no | `map of strings` | A mapping of container label names to dimension names that will get applied to the metrics of all discovered services. The corresponding label values will become the dimension values for the mapped name. E.g. `io.kubernetes.container.name: container_spec_name` would result in a dimension called `container_spec_name` that has the value of the `io.kubernetes.container.name` container label. 
| | `useHostnameIfPresent` | no | `bool` | If true, the "Config.Hostname" field (if present) of the docker container will be used as the discovered host that is used to configure monitors. If false or if no hostname is configured, the field `NetworkSettings.IPAddress` is used instead. (**default:** `false`) | | `useHostBindings` | no | `bool` | If true, the observer will configure monitors for matching container endpoints using the host bound ip and port. This is useful if containers exist that are not accessible to an instance of the agent running outside of the docker network stack. (**default:** `false`) | @@ -122,9 +123,9 @@ Observer Type: `docker` -## Endpoint Variables +## Target Variables -The following fields are available on endpoints generated by this observer and +The following fields are available on targets generated by this observer and can be used in discovery rules. | Name | Type | Description | @@ -148,14 +149,14 @@ can be used in discovery rules. | `name` | `string` | A observer assigned name of the endpoint. For example, if using the `k8s-api` observer, `name` will be the port name in the pod spec, if any. | | `orchestrator` | `integer` | | | `port` | `integer` | The TCP/UDP port number of the endpoint | -| `port_labels` | `map of string` | A map of labels on the container port. You can use the `Contains` and `Get` helper functions in discovery rules to make use of this. See [Endpoint Discovery](../auto-discovery.html#additional-functions). | +| `port_labels` | `map of string` | A map of labels on the container port | | `port_type` | `string` | TCP or UDP | | `target` | `string` | The type of the thing that this endpoint directly refers to. If the endpoint has a host and port associated with it (most common), the value will be `hostport`. Other possible values are: `pod`, `container`, `host`. See the docs for the specific observer you are using for more details on what types that observer emits. 
| ## Dimensions -These dimensions are added to all metrics that are emitted for this service -endpoint. These variables are also available to use as variables in discovery +These dimensions are added to all metrics that are emitted for this discovery +target. These variables are also available to use as variables in discovery rules. | Name | Description | diff --git a/signalfx-agent/agent_docs/observers/ecs.md b/signalfx-agent/agent_docs/observers/ecs.md index 40f76ed9b..1bab2c015 100644 --- a/signalfx-agent/agent_docs/observers/ecs.md +++ b/signalfx-agent/agent_docs/observers/ecs.md @@ -102,7 +102,7 @@ Would result in the `app` endpoint getting an interval of 1 second and the Observer Type: `ecs` -[Observer Source Code](https://github.com/signalfx/signalfx-agent/tree/master/pkg/observers/ecs) +[Observer Source Code](https://github.com/signalfx/signalfx-agent/tree/main/pkg/observers/ecs) ## Configuration @@ -114,9 +114,9 @@ Observer Type: `ecs` -## Endpoint Variables +## Target Variables -The following fields are available on endpoints generated by this observer and +The following fields are available on targets generated by this observer and can be used in discovery rules. | Name | Type | Description | @@ -140,14 +140,14 @@ can be used in discovery rules. | `name` | `string` | A observer assigned name of the endpoint. For example, if using the `k8s-api` observer, `name` will be the port name in the pod spec, if any. | | `orchestrator` | `integer` | | | `port` | `integer` | The TCP/UDP port number of the endpoint | -| `port_labels` | `map of string` | A map of labels on the container port. You can use the `Contains` and `Get` helper functions in discovery rules to make use of this. See [Endpoint Discovery](../auto-discovery.html#additional-functions). | +| `port_labels` | `map of string` | A map of labels on the container port | | `port_type` | `string` | TCP or UDP | | `target` | `string` | The type of the thing that this endpoint directly refers to. 
If the endpoint has a host and port associated with it (most common), the value will be `hostport`. Other possible values are: `pod`, `container`, `host`. See the docs for the specific observer you are using for more details on what types that observer emits. | ## Dimensions -These dimensions are added to all metrics that are emitted for this service -endpoint. These variables are also available to use as variables in discovery +These dimensions are added to all metrics that are emitted for this discovery +target. These variables are also available to use as variables in discovery rules. | Name | Description | diff --git a/signalfx-agent/agent_docs/observers/host.md b/signalfx-agent/agent_docs/observers/host.md index 4b06874a9..55d4bdf18 100644 --- a/signalfx-agent/agent_docs/observers/host.md +++ b/signalfx-agent/agent_docs/observers/host.md @@ -14,20 +14,21 @@ It will look for all listening sockets on TCP and UDP over IPv4 and IPv6. Observer Type: `host` -[Observer Source Code](https://github.com/signalfx/signalfx-agent/tree/master/pkg/observers/host) +[Observer Source Code](https://github.com/signalfx/signalfx-agent/tree/main/pkg/observers/host) ## Configuration | Config option | Required | Type | Description | | --- | --- | --- | --- | +| `omitPIDDimension` | no | `bool` | If `true`, the `pid` dimension will be omitted from the generated endpoints, which means it will not appear on datapoints emitted by monitors instantiated from discovery rules matching this endpoint. (**default:** `false`) | | `pollIntervalSeconds` | no | `integer` | (**default:** `10`) | -## Endpoint Variables +## Target Variables -The following fields are available on endpoints generated by this observer and +The following fields are available on targets generated by this observer and can be used in discovery rules. | Name | Type | Description | @@ -35,7 +36,7 @@ can be used in discovery rules. 
| `command` | `string` | The full command used to invoke this process, including the executable itself at the beginning. | | `has_port` | `string` | Set to `true` if the endpoint has a port assigned to it. This will be `false` for endpoints that represent a host/container as a whole. | | `ip_address` | `string` | The IP address of the endpoint if the `host` is in the form of an IPv4 address | -| `is_ipv6` | `string` | Will be `true` if the endpoint is IPv6. | +| `is_ipv6` | `bool` | Will be `true` if the endpoint is IPv6. | | `network_port` | `string` | An alias for `port` | | `discovered_by` | `string` | The observer that discovered this endpoint | | `host` | `string` | The hostname/IP address of the endpoint. If this is an IPv6 address, it will be surrounded by `[` and `]`. | @@ -47,8 +48,8 @@ can be used in discovery rules. ## Dimensions -These dimensions are added to all metrics that are emitted for this service -endpoint. These variables are also available to use as variables in discovery +These dimensions are added to all metrics that are emitted for this discovery +target. These variables are also available to use as variables in discovery rules. | Name | Description | diff --git a/signalfx-agent/agent_docs/observers/k8s-api.md b/signalfx-agent/agent_docs/observers/k8s-api.md index d84a5c3ab..0251a6ad7 100644 --- a/signalfx-agent/agent_docs/observers/k8s-api.md +++ b/signalfx-agent/agent_docs/observers/k8s-api.md @@ -4,14 +4,23 @@ # k8s-api - Discovers services running in a Kubernetes cluster by -querying the Kubernetes API server. This observer is designed to only -discover pod endpoints exposed on the same node that the agent is running, -so that the monitoring of services does not generate cross-node traffic. To -know which node the agent is running on, you should set an environment -variable called `MY_NODE_NAME` using the downward API `spec.nodeName` value -in the pod spec. Our provided K8s DaemonSet resource does this already and -provides an example.
+ Discovers pod endpoints and nodes running in a Kubernetes +cluster by querying the Kubernetes API server. This observer by default +will only discover pod endpoints exposed on the same node that the agent is +running, so that the monitoring of services does not generate cross-node +traffic. To know which node the agent is running on, you should set an +environment variable called `MY_NODE_NAME` using the downward API +`spec.nodeName` value in the pod spec. Our provided K8s DaemonSet resource +does this already and provides an example. + +This observer also emits high-level `pod` targets that only contain a `host` +field but no `port`. This allows monitoring ports on a pod that are not +explicitly specified in the pod spec, which would result in no normal +`hostport` target being emitted for that particular endpoint. + +If `discoverAllPods` is set to `true`, then the observer will discover pods on all +nodes in the cluster (or namespace if specified). By default, only pods on +the same node as the agent are discovered. Note that this observer discovers exposed ports on pod containers, not K8s Endpoint resources, so don't let the terminology of agent "endpoints" @@ -20,7 +29,7 @@ confuse you. Observer Type: `k8s-api` -[Observer Source Code](https://github.com/signalfx/signalfx-agent/tree/master/pkg/observers/kubernetes) +[Observer Source Code](https://github.com/signalfx/signalfx-agent/tree/main/pkg/observers/kubernetes) ## Configuration @@ -28,7 +37,9 @@ Observer Type: `k8s-api` | --- | --- | --- | --- | | `namespace` | no | `string` | If specified, only pods within the given namespace on the same node as the agent will be discovered. If blank, all pods on the same node as the agent will be discovered. | | `kubernetesAPI` | no | `object (see below)` | Configuration for the K8s API client | -| `additionalPortAnnotations` | no | `list of strings` | A list of annotation names that should be used to infer additional ports to be discovered on a particular pod. 
The pod's annotation value should be a port number. This is useful for annotations like `prometheus.io/port: 9230`. If you don't already have preexisting annotations like this, we recommend using the [SignalFx-specific annotations](https://docs.signalfx.com/en/latest/kubernetes/k8s-monitors-observers.html#config-via-k8s-annotations). | +| `additionalPortAnnotations` | no | `list of strings` | A list of annotation names that should be used to infer additional ports to be discovered on a particular pod. The pod's annotation value should be a port number. This is useful for annotations like `prometheus.io/port: 9230`. | +| `discoverAllPods` | no | `bool` | If true, this observer will watch all Kubernetes pods and discover endpoints/services from each of them. The default behavior (when `false`) is to only watch the pods on the current node that this agent is running on (it knows the current node via the `MY_NODE_NAME` envvar provided by the downward API). (**default:** `false`) | +| `discoverNodes` | no | `bool` | If `true`, the observer will discover nodes as a special type of endpoint. You can match these endpoints in your discovery rules with the condition `target == "k8s-node"`. (**default:** `false`) | The **nested** `kubernetesAPI` config object has the following fields: @@ -44,9 +55,9 @@ The **nested** `kubernetesAPI` config object has the following fields: -## Endpoint Variables +## Target Variables -The following fields are available on endpoints generated by this observer and +The following fields are available on targets generated by this observer and can be used in discovery rules. | Name | Type | Description | @@ -54,10 +65,14 @@ can be used in discovery rules. | `container_name` | `string` | The first and primary name of the container as it is known to the container runtime (e.g. Docker). | | `has_port` | `string` | Set to `true` if the endpoint has a port assigned to it. This will be `false` for endpoints that represent a host/container as a whole. 
| | `ip_address` | `string` | The IP address of the endpoint if the `host` is in the form of an IPv4 address | -| `kubernetes_annotations` | `string` | The set of annotations on the discovered pod. | +| `kubernetes_annotations` | `map of strings` | The set of annotations on the discovered pod or node. | | `network_port` | `string` | An alias for `port` | -| `pod_metadata` | `string` | The full pod metadata object, as represented by the Go K8s client library (client-go): https://godoc.org/k8s.io/apimachinery/pkg/apis/meta/v1#ObjectMeta. | -| `pod_spec` | `string` | The full pod spec object, as represented by the Go K8s client library (client-go): https://godoc.org/k8s.io/api/core/v1#PodSpec. | +| `node_addresses` | `map of strings` | A map of the different Node addresses specified in the Node status object. The key of the map is the address type and the value is the address string. The address types are `Hostname`, `ExternalIP`, `InternalIP`, `ExternalDNS`, `InternalDNS`. Most likely not all of these address types will be present for a given Node. | +| `node_metadata` | `node_metadata` | The metadata about the Node, for `k8s-node` targets, with fields in TitleCase. See [ObjectMeta v1 meta reference](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#objectmeta-v1-meta). | +| `node_spec` | `node_spec` | The Node spec object, for `k8s-node` targets. See [the K8s reference on this resource](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#nodespec-v1-core), but keep in mind that fields will be in TitleCase due to passing through Go. | +| `node_status` | `node_status` | The Node status object, for `k8s-node` targets. See [the K8s reference on Node Status](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.19/#nodestatus-v1-core) but keep in mind that fields will be in TitleCase due to passing through Go.
| +| `pod_metadata` | `pod metadata` | The full pod metadata object, as represented by the Go K8s client library (client-go): https://godoc.org/k8s.io/apimachinery/pkg/apis/meta/v1#ObjectMeta. | +| `pod_spec` | `pod spec` | The full pod spec object, as represented by the Go K8s client library (client-go): https://godoc.org/k8s.io/api/core/v1#PodSpec. | | `private_port` | `string` | The port that the service endpoint runs on inside the container | | `public_port` | `string` | The port exposed outside the container | | `alternate_port` | `integer` | Used for services that are accessed through some kind of NAT redirection as Docker does. This could be either the public port or the private one. | @@ -73,14 +88,14 @@ can be used in discovery rules. | `name` | `string` | A observer assigned name of the endpoint. For example, if using the `k8s-api` observer, `name` will be the port name in the pod spec, if any. | | `orchestrator` | `integer` | | | `port` | `integer` | The TCP/UDP port number of the endpoint | -| `port_labels` | `map of string` | A map of labels on the container port. You can use the `Contains` and `Get` helper functions in discovery rules to make use of this. See [Endpoint Discovery](../auto-discovery.html#additional-functions). | +| `port_labels` | `map of string` | A map of labels on the container port | | `port_type` | `string` | TCP or UDP | | `target` | `string` | The type of the thing that this endpoint directly refers to. If the endpoint has a host and port associated with it (most common), the value will be `hostport`. Other possible values are: `pod`, `container`, `host`. See the docs for the specific observer you are using for more details on what types that observer emits. | ## Dimensions -These dimensions are added to all metrics that are emitted for this service -endpoint. These variables are also available to use as variables in discovery +These dimensions are added to all metrics that are emitted for this discovery +target. 
These variables are also available to use as variables in discovery rules. | Name | Description | @@ -90,6 +105,8 @@ rules. | `container_name` | The primary name of the running container -- Docker containers can have multiple names but this will be the first name, if any. | | `container_spec_name` | The short name of the container in the pod spec, **NOT** the running container's name in the Docker engine | | `kubernetes_namespace` | The namespace that the discovered service endpoint is running in. | +| `kubernetes_node` | For Node (`k8s-node`) targets, the name of the Node | +| `kubernetes_node_uid` | For Node (`k8s-node`) targets, the UID of the Node | | `kubernetes_pod_name` | The name of the running pod that is exposing the discovered endpoint | | `kubernetes_pod_uid` | The UID of the pod that is exposing the discovered endpoint | diff --git a/signalfx-agent/agent_docs/observers/k8s-kubelet.md b/signalfx-agent/agent_docs/observers/k8s-kubelet.md index a9d19624d..5505dc969 100644 --- a/signalfx-agent/agent_docs/observers/k8s-kubelet.md +++ b/signalfx-agent/agent_docs/observers/k8s-kubelet.md @@ -14,7 +14,7 @@ this observer may break more easily in future K8s versions. Observer Type: `k8s-kubelet` -[Observer Source Code](https://github.com/signalfx/signalfx-agent/tree/master/pkg/observers/kubelet) +[Observer Source Code](https://github.com/signalfx/signalfx-agent/tree/main/pkg/observers/kubelet) ## Configuration @@ -39,9 +39,9 @@ The **nested** `kubeletAPI` config object has the following fields: -## Endpoint Variables +## Target Variables -The following fields are available on endpoints generated by this observer and +The following fields are available on targets generated by this observer and can be used in discovery rules. | Name | Type | Description | @@ -65,14 +65,14 @@ can be used in discovery rules. | `name` | `string` | A observer assigned name of the endpoint. 
For example, if using the `k8s-api` observer, `name` will be the port name in the pod spec, if any. | | `orchestrator` | `integer` | | | `port` | `integer` | The TCP/UDP port number of the endpoint | -| `port_labels` | `map of string` | A map of labels on the container port. You can use the `Contains` and `Get` helper functions in discovery rules to make use of this. See [Endpoint Discovery](../auto-discovery.html#additional-functions). | +| `port_labels` | `map of string` | A map of labels on the container port | | `port_type` | `string` | TCP or UDP | | `target` | `string` | The type of the thing that this endpoint directly refers to. If the endpoint has a host and port associated with it (most common), the value will be `hostport`. Other possible values are: `pod`, `container`, `host`. See the docs for the specific observer you are using for more details on what types that observer emits. | ## Dimensions -These dimensions are added to all metrics that are emitted for this service -endpoint. These variables are also available to use as variables in discovery +These dimensions are added to all metrics that are emitted for this discovery +target. These variables are also available to use as variables in discovery rules. | Name | Description | diff --git a/signalfx-agent/agent_docs/quick-install.md b/signalfx-agent/agent_docs/quick-install.md index ffa7d8c89..9e9c3b73b 100644 --- a/signalfx-agent/agent_docs/quick-install.md +++ b/signalfx-agent/agent_docs/quick-install.md @@ -3,117 +3,203 @@ # Quick Install +SignalFx Smart Agent is deprecated. For details, see the [Deprecation Notice](./smartagent-deprecation-notice.md). -The SignalFx Smart Agent is a metric-based agent written in Go that is used to monitor infrastructure and application services from a variety of environments. +SignalFx Smart Agent Integration installs the Smart Agent application on a single host machine from which you want to collect monitoring data. 
Smart Agent collects infrastructure monitoring, µAPM, and Kubernetes data. For other installation options, including bulk deployments to production, see [Install and Configure the Smart Agent](https://docs.splunk.com/observability/gdi/smart-agent/smart-agent-resources.html#install-the-smart-agent). ## Installation -### Review pre-installation requirements for the Smart Agent +### Prerequisites -Before you download and install the Smart Agent on a **single** host, review the requirements below. +#### General +- Ensure that you've installed the applications and services you want to monitor on a Linux or Windows host. SignalFx doesn't support installing the Smart Agent on macOS or any other OS besides Linux and Windows. +- Uninstall or disable any previously-installed collector agents from your host, such as `collectd`. +- If you have any questions about compatibility between the Smart Agent and your host machine or its applications and services, contact your Splunk support representative. -(For other installation options, including bulk deployments, see [Advanced Installation Options](./advanced-install-options.md).) - -Please note that the Smart Agent does not support Mac OS. +#### Linux +- Ensure that you have access to `terminal` or a similar command line interface application. +- Ensure that your Linux username has permission to run the following commands: + - `curl` + - `sudo` +- Ensure that your machine is running Linux kernel version 3.2 or higher. -**General requirements** -- You must have access to your command line interface. -- You must uninstall or disable any previously installed collector agent from your host, such as collectd. +#### Windows +- Ensure that you have access to Windows PowerShell 6. +- Ensure that your machine is running Windows 8 or higher. +- Ensure that .Net Framework 3.5 or higher is installed. 
+- While SignalFx recommends that you use TLS 1.2, if you use TLS 1.0 and want to continue using TLS 1.0, then: + - Ensure that you support the following ciphers: + - TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA (secp256r1) - A + - TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA (secp256r1) - A + - TLS_RSA_WITH_AES_256_CBC_SHA (rsa 2048) - A + - TLS_RSA_WITH_AES_128_CBC_SHA (rsa 2048) - A + - TLS_RSA_WITH_3DES_EDE_CBC_SHA (rsa 2048) - C + - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (secp256r1) - A + - TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256 (secp256r1) - A + - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (secp256r1) - A + - TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384 (secp256r1) - A + - TLS_RSA_WITH_AES_128_GCM_SHA256 (rsa 2048) - A + - TLS_RSA_WITH_AES_256_GCM_SHA384 (rsa 2048) - A + - TLS_RSA_WITH_AES_128_CBC_SHA256 (rsa 2048) - A + - See [Solving the TLS 1.0 Problem, 2nd Edition](https://docs.microsoft.com/en-us/security/engineering/solving-tls1-problem) for more information. -**Linux requirements** -- You must run kernel version 2.6 or higher for your Linux distribution. +### Steps -**Windows requirements** -- You must run .Net Framework 3.5 on Windows 8 or higher. -- You must run Visual C++ Compiler for Python 2.7. +#### Access the SignalFx UI +This content appears in both the documentation site and in the SignalFx UI. -### Step 1. Install the SignalFx Smart Agent on your host +If you are reading this content from the documentation site, please access the SignalFx UI so that you can paste pre-populated commands. -#### Linux +To access this content from the SignalFx UI: +1. In the SignalFx UI, in the top menu, click **Integrations**. +2. Locate and select **SignalFx SmartAgent**. +3. Click **Setup**, and continue reading the instructions. -Note: This content appears on a SignalFx documentation page and on the **Setup** tab of the Smart Agent tile in the SignalFx UI. The following code to install the current version works only if you are viewing these instructions on the **Setup** tab. 
+#### Install the Smart Agent on Linux -From the **Setup** tab, copy and paste the following code into your command line: +This section lists the steps for installing the Smart Agent on Linux. If you want to install the Smart Agent on Windows, proceed to the next section, **Install SignalFx Smart Agent on Windows**. +Copy and paste the following code into your command line or terminal: ```sh -curl -sSL https://dl.signalfx.com/signalfx-agent.sh > /tmp/signalfx-agent.sh +curl -sSL https://dl.signalfx.com/signalfx-agent.sh > /tmp/signalfx-agent.sh; sudo sh /tmp/signalfx-agent.sh --realm YOUR_SIGNALFX_REALM -- YOUR_SIGNALFX_API_TOKEN ``` +When this command finishes, it displays the following: +``` +The SignalFx Agent has been successfully installed. -#### Windows +Make sure that your system's time is relatively accurate or else datapoints may not be accepted. -Note: This content appears on a SignalFx documentation page and on the **Setup** tab of the Smart Agent tile in the SignalFx UI. The following code to install the current version works only if you are viewing these instructions on the **Setup** tab. +The agent's main configuration file is located at /etc/signalfx/agent.yaml. +``` + +If your installation succeeds, proceed to the section **Verify Your Installation**. Otherwise, see the section **Troubleshoot Your Installation**. -From the **Setup** tab, copy and paste the following code into your command line: +#### Install the Smart Agent on Windows +Copy and paste the following code into your Windows PowerShell terminal: ```sh -& {Set-ExecutionPolicy Bypass -Scope Process -Force; $script = ((New-Object System.Net.WebClient).DownloadString('https://dl.signalfx.com/signalfx-agent.ps1')); $params = @{access_token = "YOUR_SIGNALFX_API_TOKEN"; ingest_url = "https://ingest.YOUR_SIGNALFX_REALM.signalfx.com"; api_url = "https://api.YOUR_SIGNALFX_REALM.signalfx.com"}; Invoke-Command -ScriptBlock ([scriptblock]::Create(". 
{$script} $(&{$args} @params)"))} +& {Set-ExecutionPolicy Bypass -Scope Process -Force; $script = ((New-Object System.Net.WebClient).DownloadString('https://dl.signalfx.com/signalfx-agent.ps1')); $params = @{access_token = "YOUR_SIGNALFX_API_TOKEN"; ingest_url = "https://ingest.YOUR_SIGNALFX_REALM.signalfx.com"; api_url = "https://api.YOUR_SIGNALFX_REALM.signalfx.com"}; Invoke-Command -ScriptBlock ([scriptblock]::Create(". {$script} $(&{$args} @params)"))} ``` -The agent will be installed as a Windows service and will log to the Windows Event Log. +The agent files are installed to `\Program Files\SignalFx\SignalFxAgent`, and the default configuration file is installed at `\ProgramData\SignalFxAgent\agent.yaml` if it does not already exist. +The install script starts the agent as a Windows service that writes messages to the Windows Event Log. -### Step 2. Confirm your Installation +If your installation succeeds, proceed to the section **Verify Your Installation**. Otherwise, see the section **Troubleshoot Your Installation**. +### Verify Your Installation -1. To confirm your installation, enter the following command on the Linux or Windows command line: +1. To verify that you've successfully installed the Smart Agent, copy and paste the following command into your terminal. - ```sh - sudo signalfx-agent status - ``` +**For Linux:** - The return should be similar to the following example: +```sh +sudo signalfx-agent status +``` - ```sh - SignalFx Agent version: 4.7.6 - Agent uptime: 8m44s - Observers active: host - Active Monitors: 16 - Configured Monitors: 33 - Discovered Endpoint Count: 6 - Bad Monitor Config: None - Global Dimensions: {host: my-host-1} - Datapoints sent (last minute): 1614 - Events Sent (last minute): 0 - Trace Spans Sent (last minute): 0 - ``` +**For Windows:** -2.
To confirm your installation, enter the following command on the Linux or Windows command line: +```sh +& "\Program Files\SignalFx\SignalFxAgent\bin\signalfx-agent.exe" status +``` - | Command | Description | - |---|---| - | signalfx-agent status config | This command shows resolved config in use by the Smart Agent. | - | signalfx-agent status endpoints | This command shows discovered endpoints. | - | signalfx-agent status monitors | This command shows active monitors. | - | signalfx-agent status all | This command shows all of the above statuses. | +The command displays output that is similar to the following: +```sh +SignalFx Agent version: 5.1.0 +Agent uptime: 8m44s +Observers active: host +Active Monitors: 16 +Configured Monitors: 33 +Discovered Endpoint Count: 6 +Bad Monitor Config: None +Global Dimensions: {host: my-host-1} +Datapoints sent (last minute): 1614 +Events Sent (last minute): 0 +Trace Spans Sent (last minute): 0 +``` + +2. To perform additional verification, you can run any of the following commands: -### Troubleshoot the Smart Agent installation +- Display the current Smart Agent configuration. -If you are unable to install the Smart Agent, consider reviewing your error logs: +```sh +sudo signalfx-agent status config +``` -For Linux, use the following command to view error logs via Journal: +- Show endpoints discovered by the Smart Agent. ```sh -journalctl -u signalfx-agent | tail -100 +sudo signalfx-agent status endpoints ``` -For Windows, review the event logs. +- Show the Smart Agent's active monitors. These plugins poll apps and services to retrieve data. + +```sh +sudo signalfx-agent status monitors +``` + +### Troubleshoot Smart Agent Installation +If the Smart Agent installation fails, use the following procedures to gather troubleshooting information. + +#### General troubleshooting +To learn how to review signalfx-agent logs, see [Frequently Asked Questions](./faq.md).
+ +#### Linux troubleshooting -For additional installation troubleshooting information, including how to review logs, see [Frequently Asked Questions](./faq.md). +To view recent error logs, run the following command in terminal or a similar application: -### Review additional documentation +- For sysv/upstart hosts, run: -After a successful installation, learn more about the SignalFx agent and the SignalFx UI. +```sh +tail -f /var/log/signalfx-agent.log +``` + +- For systemd hosts, run: + +```sh +sudo journalctl -u signalfx-agent -f +``` + +#### Windows troubleshooting +Open **Administrative Tools > Event Viewer > Windows Logs > Application** to view the `signalfx-agent` error logs. + +### Uninstall the Smart Agent + +#### Debian + +To uninstall the Smart Agent on Debian-based distributions, run the following +command: + +```sh +sudo dpkg --remove signalfx-agent +``` + +**Note:** Configuration files may persist in `/etc/signalfx`. + +#### RPM + +To uninstall the Smart Agent on RPM-based distributions, run the following +command: + +```sh +sudo rpm -e signalfx-agent +``` + +**Note:** Configuration files may persist in `/etc/signalfx`. + +#### Windows -* Review the capabilities of the SignalFx Smart Agent. See [Advanced Installation Options](./advanced-install-options.md). +The Smart Agent can be uninstalled from `Programs and Features` in the Windows +Control Panel. -* Learn how data is displayed in the SignalFx UI. See [View infrastructure status](https://docs.signalfx.com/en/latest/getting-started/quick-start.html#step-3-view-infrastructure-status). +**Note:** Configuration files may persist in `\ProgramData\SignalFxAgent`. 
diff --git a/signalfx-agent/agent_docs/remote-config.md b/signalfx-agent/agent_docs/remote-config.md index 273d74c44..c993404e4 100644 --- a/signalfx-agent/agent_docs/remote-config.md +++ b/signalfx-agent/agent_docs/remote-config.md @@ -119,9 +119,8 @@ that may be empty, such as the following: signalFxAccessToken: abcd monitors: - {"#from": "/etc/signalfx/conf2/*.yaml", flatten: true, optional: true} - - type: collectd/cpu - - type: collectd/cpufreq - - type: collectd/df + - type: cpu + - type: filesystems ``` The key here is the `optional: true` value, which makes it accept globs that diff --git a/signalfx-agent/agent_docs/windows.md b/signalfx-agent/agent_docs/windows.md index d83cc2616..22870dd3b 100644 --- a/signalfx-agent/agent_docs/windows.md +++ b/signalfx-agent/agent_docs/windows.md @@ -2,7 +2,7 @@ # Windows Setup -The agent supports standalone installations on Windows Server 2008 and above. +The agent supports standalone installations on Windows Server 2012 and above. ## Installation @@ -18,10 +18,8 @@ The following monitors are available on Windows. - [aspdotnet](./monitors/aspdotnet.md) - [collectd/consul](./monitors/collectd-consul.md) - [collectd/couchbase](./monitors/collectd-couchbase.md) -- [collectd/elasticsearch](./monitors/collectd-elasticsearch.md) - [collectd/etcd](./monitors/collectd-etcd.md) - [collectd/hadoop](./monitors/collectd-hadoop.md) -- [collectd/haproxy](./monitors/collectd-haproxy.md) - [collectd/health-checker](./monitors/collectd-health-checker.md) - [collectd/jenkins](./monitors/collectd-jenkins.md) - [collectd/kong](./monitors/collectd-kong.md) @@ -38,7 +36,9 @@ The following monitors are available on Windows. 
- [disk-io](./monitors/disk-io.md) - [docker-container-stats](./monitors/docker-container-stats.md) - [dotnet](./monitors/dotnet.md) +- [elasticsearch](./monitors/elasticsearch.md) - [filesystems](./monitors/filesystems.md) +- [haproxy](./monitors/haproxy.md) - [host-metadata](./monitors/host-metadata.md) - [internal-metrics](./monitors/internal-metrics.md) - [memory](./monitors/memory.md) @@ -85,4 +85,4 @@ with a few command line flags in powershell. - Uninstall Service - PS> SignalFx\SignalFxAgent\bin\signalfx-agent.exe -service "uninstall" \ No newline at end of file + PS> SignalFx\SignalFxAgent\bin\signalfx-agent.exe -service "uninstall" diff --git a/signalfx-agent/metrics.yaml b/signalfx-agent/metrics.yaml index a96ed3b3d..36f0ec500 100644 --- a/signalfx-agent/metrics.yaml +++ b/signalfx-agent/metrics.yaml @@ -2,7 +2,7 @@ sfxagent.active_monitors: brief: The total number of monitor instances actively working - custom: false + custom: true description: The total number of monitor instances actively working metric_type: gauge monitor: internal-metrics @@ -10,7 +10,7 @@ sfxagent.active_monitors: sfxagent.active_observers: brief: The number of observers configured and running - custom: false + custom: true description: The number of observers configured and running metric_type: gauge monitor: internal-metrics @@ -18,16 +18,44 @@ sfxagent.active_observers: sfxagent.configured_monitors: brief: The total number of monitor configurations - custom: false + custom: true description: The total number of monitor configurations metric_type: gauge monitor: internal-metrics title: sfxagent.configured_monitors +sfxagent.correlation_updates_client_errors: + brief: The number of HTTP status code 4xx responses received while updating trace + host correlations + custom: true + description: The number of HTTP status code 4xx responses received while updating + trace host correlations + metric_type: cumulative + monitor: internal-metrics + title: 
sfxagent.correlation_updates_client_errors + +sfxagent.correlation_updates_invalid: + brief: The number of trace host correlation updates attempted against invalid dimensions + custom: true + description: The number of trace host correlation updates attempted against invalid + dimensions + metric_type: cumulative + monitor: internal-metrics + title: sfxagent.correlation_updates_invalid + +sfxagent.correlation_updates_retries: + brief: The total number of times trace host correlation requests have been retried + custom: true + description: The total number of times trace host correlation requests have been + retried + metric_type: cumulative + monitor: internal-metrics + title: sfxagent.correlation_updates_retries + sfxagent.datapoint_channel_len: brief: The total number of datapoints that have been emitted by monitors but have yet to be accepted by the writer - custom: false + custom: true description: The total number of datapoints that have been emitted by monitors but have yet to be accepted by the writer. This number should be 0 most of the time. This will max out at 3000, at which point no datapoints will be generated by monitors. If @@ -38,7 +66,7 @@ sfxagent.datapoint_channel_len: sfxagent.datapoint_requests_active: brief: The total number of outstanding requests to ingest currently active - custom: false + custom: true description: The total number of outstanding requests to ingest currently active. If this is consistently hovering around the `writer.maxRequests` setting, that setting should probably be increased to give the agent more bandwidth to send datapoints. 
@@ -50,7 +78,7 @@ sfxagent.datapoints_failed: brief: 'The total number of datapoints that tried to be sent but could not be by the agent writer since it last started' - custom: false + custom: true description: 'The total number of datapoints that tried to be sent but could not be @@ -63,7 +91,7 @@ sfxagent.datapoints_failed: sfxagent.datapoints_filtered: brief: The total number of datapoints that were filtered out in the writer - custom: false + custom: true description: The total number of datapoints that were filtered out in the writer. This does not include datapoints filtered by monitor-specific filters. metric_type: cumulative @@ -74,7 +102,7 @@ sfxagent.datapoints_in_flight: brief: The total number of datapoints that have been sent out in a request to ingest but have yet to receive confirmation from ingest that they have been received (i.e - custom: false + custom: true description: The total number of datapoints that have been sent out in a request to ingest but have yet to receive confirmation from ingest that they have been received (i.e. the HTTP response hasn't been gotten). @@ -85,7 +113,7 @@ sfxagent.datapoints_in_flight: sfxagent.datapoints_received: brief: The total number of non-filtered datapoints received by the agent writer since it last started - custom: false + custom: true description: The total number of non-filtered datapoints received by the agent writer since it last started. 
This number should generally equal `sfxagent.datapoints_sent + sfxagent.datapoints_waiting + sfxagent.datapoints_in_flight`, although sampling @@ -96,7 +124,7 @@ sfxagent.datapoints_received: sfxagent.datapoints_sent: brief: The total number of datapoints sent by the agent writer since it last started - custom: false + custom: true description: The total number of datapoints sent by the agent writer since it last started metric_type: cumulative @@ -106,7 +134,7 @@ sfxagent.datapoints_sent: sfxagent.datapoints_waiting: brief: The total number of datapoints that have been accepted by the writer but have yet to be sent out to ingest over HTTP - custom: false + custom: true description: The total number of datapoints that have been accepted by the writer but have yet to be sent out to ingest over HTTP. If this continues to grow it indicates that datapoints are not being sent out fast enough and the `writer.maxRequests` @@ -117,7 +145,7 @@ sfxagent.datapoints_waiting: sfxagent.dim_request_senders: brief: Current number of worker goroutines active that can send dimension updates - custom: false + custom: true description: Current number of worker goroutines active that can send dimension updates. metric_type: gauge @@ -126,7 +154,7 @@ sfxagent.dim_request_senders: sfxagent.dim_updates_completed: brief: Total number of dimension property updates successfully completed - custom: false + custom: true description: Total number of dimension property updates successfully completed metric_type: cumulative monitor: internal-metrics @@ -135,7 +163,7 @@ sfxagent.dim_updates_completed: sfxagent.dim_updates_currently_delayed: brief: Current number of dimension updates that are being delayed to avoid sending spurious updates due to flappy dimension property sets - custom: false + custom: true description: Current number of dimension updates that are being delayed to avoid sending spurious updates due to flappy dimension property sets. 
metric_type: gauge @@ -145,7 +173,7 @@ sfxagent.dim_updates_currently_delayed: sfxagent.dim_updates_dropped: brief: Total number of dimension property updates that were dropped, due to an overfull buffer of dimension updates pending - custom: false + custom: true description: Total number of dimension property updates that were dropped, due to an overfull buffer of dimension updates pending. metric_type: cumulative @@ -154,7 +182,7 @@ sfxagent.dim_updates_dropped: sfxagent.dim_updates_failed: brief: Total number of dimension property updates that failed for some reason - custom: false + custom: true description: Total number of dimension property updates that failed for some reason. The failures should be logged. metric_type: cumulative @@ -164,7 +192,7 @@ sfxagent.dim_updates_failed: sfxagent.dim_updates_flappy_total: brief: Total number of dimension property updates that ended up replacing a dimension property set that was being delayed - custom: false + custom: true description: Total number of dimension property updates that ended up replacing a dimension property set that was being delayed. metric_type: cumulative @@ -174,7 +202,7 @@ sfxagent.dim_updates_flappy_total: sfxagent.dim_updates_started: brief: Total number of dimension property updates requests started, but not necessarily completed or failed - custom: false + custom: true description: Total number of dimension property updates requests started, but not necessarily completed or failed. metric_type: cumulative @@ -183,7 +211,7 @@ sfxagent.dim_updates_started: sfxagent.discovered_endpoints: brief: The number of discovered service endpoints - custom: false + custom: true description: The number of discovered service endpoints. This includes endpoints that do not have any matching monitor configuration discovery rule. 
metric_type: gauge @@ -193,7 +221,7 @@ sfxagent.discovered_endpoints: sfxagent.events_buffered: brief: The total number of events that have been emitted by monitors but have yet to be sent to SignalFx - custom: false + custom: true description: The total number of events that have been emitted by monitors but have yet to be sent to SignalFx metric_type: gauge @@ -202,7 +230,7 @@ sfxagent.events_buffered: sfxagent.events_sent: brief: The total number of events sent by the agent since it last started - custom: false + custom: true description: The total number of events sent by the agent since it last started metric_type: cumulative monitor: internal-metrics @@ -210,7 +238,7 @@ sfxagent.events_sent: sfxagent.go_frees: brief: Total number of heap objects freed throughout the lifetime of the agent - custom: false + custom: true description: Total number of heap objects freed throughout the lifetime of the agent metric_type: cumulative monitor: internal-metrics @@ -218,7 +246,7 @@ sfxagent.go_frees: sfxagent.go_heap_alloc: brief: Bytes of live heap memory (memory that has been allocated but not freed) - custom: false + custom: true description: Bytes of live heap memory (memory that has been allocated but not freed) metric_type: gauge monitor: internal-metrics @@ -227,7 +255,7 @@ sfxagent.go_heap_alloc: sfxagent.go_heap_idle: brief: Bytes of memory that consist of idle spans (that is, completely empty spans of memory) - custom: false + custom: true description: Bytes of memory that consist of idle spans (that is, completely empty spans of memory) metric_type: gauge @@ -236,7 +264,7 @@ sfxagent.go_heap_idle: sfxagent.go_heap_inuse: brief: Size in bytes of in use spans - custom: false + custom: true description: Size in bytes of in use spans metric_type: gauge monitor: internal-metrics @@ -244,7 +272,7 @@ sfxagent.go_heap_inuse: sfxagent.go_heap_released: brief: Bytes of memory that have been returned to the OS - custom: false + custom: true description: Bytes of memory 
that have been returned to the OS. This is quite often 0. `sfxagent.go_heap_idle - sfxagent.go_heap_release` is the memory that Go is retaining for future heap allocations. @@ -254,7 +282,7 @@ sfxagent.go_heap_released: sfxagent.go_heap_sys: brief: Virtual memory size in bytes of the agent - custom: false + custom: true description: Virtual memory size in bytes of the agent. This will generally reflect the largest heap size the agent has ever had in its lifetime. metric_type: gauge @@ -263,7 +291,7 @@ sfxagent.go_heap_sys: sfxagent.go_mallocs: brief: Total number of heap objects allocated throughout the lifetime of the agent - custom: false + custom: true description: Total number of heap objects allocated throughout the lifetime of the agent metric_type: cumulative @@ -272,7 +300,7 @@ sfxagent.go_mallocs: sfxagent.go_next_gc: brief: The target heap size -- GC tries to keep the heap smaller than this - custom: false + custom: true description: The target heap size -- GC tries to keep the heap smaller than this metric_type: gauge monitor: internal-metrics @@ -280,15 +308,23 @@ sfxagent.go_next_gc: sfxagent.go_num_gc: brief: The number of GC cycles that have happened in the agent since it started - custom: false + custom: true description: The number of GC cycles that have happened in the agent since it started metric_type: gauge monitor: internal-metrics title: sfxagent.go_num_gc +sfxagent.go_num_goroutine: + brief: Number of goroutines in the agent + custom: true + description: Number of goroutines in the agent + metric_type: gauge + monitor: internal-metrics + title: sfxagent.go_num_goroutine + sfxagent.go_stack_inuse: brief: Size in bytes of spans that have at least one goroutine stack in them - custom: false + custom: true description: Size in bytes of spans that have at least one goroutine stack in them metric_type: gauge monitor: internal-metrics @@ -297,18 +333,10 @@ sfxagent.go_stack_inuse: sfxagent.go_total_alloc: brief: Total number of bytes allocated to 
the heap throughout the lifetime of the agent - custom: false + custom: true description: Total number of bytes allocated to the heap throughout the lifetime of the agent metric_type: cumulative monitor: internal-metrics title: sfxagent.go_total_alloc -sfxgent.go_num_goroutine: - brief: Number of goroutines in the agent - custom: false - description: Number of goroutines in the agent - metric_type: gauge - monitor: internal-metrics - title: sfxgent.go_num_goroutine - diff --git a/signalfx-org-metrics/metrics.yaml b/signalfx-org-metrics/metrics.yaml index 082366ded..13b2f1ef6 100644 --- a/signalfx-org-metrics/metrics.yaml +++ b/signalfx-org-metrics/metrics.yaml @@ -5,8 +5,8 @@ sf.org.apm.grossContentBytesReceived: The volume of bytes Splunk APM receives off the wire before filtering and throttling. This content could be compressed. - * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.grossContentBytesReceived @@ -17,8 +17,8 @@ sf.org.apm.grossSpanBytesReceived: The number of bytes Splunk APM receives from spans after decompression but before filtering and throttling. - * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.grossSpanBytesReceived @@ -30,8 +30,8 @@ sf.org.apm.grossSpanBytesReceivedByToken: The number of bytes Splunk APM receives for a specific access token from ingested span data after decompression but before filtering and throttling. - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.grossSpanBytesReceivedByToken @@ -40,8 +40,8 @@ sf.org.apm.grossSpansReceived: description: | The number of spans Splunk APM received before filtering or throttling. 
- * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.grossSpansReceived @@ -50,8 +50,8 @@ sf.org.apm.ingestLatency.duration.ns.min: description: | The minimum duration of the ingest latency in Splunk APM. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: gauge title: sf.org.apm.ingestLatency.duration.ns.min @@ -72,8 +72,8 @@ sf.org.apm.numAddSpansCalls: description: | The number of calls to the `/v2/trace` endpoint. - * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.numAddSpansCalls @@ -82,8 +82,8 @@ sf.org.apm.numAddSpansCallsByToken: description: | The number of calls to the `/v2/trace` endpoint for a specific access token. - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.numAddSpansCallsByToken @@ -92,8 +92,8 @@ sf.org.apm.numContainers: description: | The number of containers actively sending data to Splunk APM. - * Dimension(s): `orgId` - * Data resolution: 1 minute + * Dimension(s): `orgId` + * Data resolution: 1 minute metric_type: gauge title: sf.org.apm.numContainers @@ -101,8 +101,8 @@ sf.org.apm.numHosts: brief: The number of hosts that are actively sending data to Splunk APM. description: | The number of hosts that are actively sending data to Splunk APM. - * Dimension(s): `orgId` - * Data resolution: 1 minute + * Dimension(s): `orgId` + * Data resolution: 1 minute metric_type: gauge title: sf.org.apm.numHosts @@ -113,8 +113,8 @@ sf.org.apm.numSpanBytesReceived: The number of bytes Splunk APM accepts from ingested span data after decompression after filtering and throttling. 
- * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.numSpanBytesReceived @@ -125,8 +125,8 @@ sf.org.apm.numSpanBytesReceivedByToken: The number of bytes Splunk APM accepts for a specific access token for a span after decompression after filtering and throttling. - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.numSpanBytesReceivedByToken @@ -137,8 +137,8 @@ sf.org.apm.numSpansDroppedInvalid: invalid if there is no start time, trace ID, or service associated with it, or if there are too many spans in the trace. - * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.numSpansDroppedInvalid @@ -150,8 +150,8 @@ sf.org.apm.numSpansDroppedInvalidByToken: access token. A span can be invalid if there is no start time, trace ID, or service associated with it, or if there are too many spans in the trace. - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.numSpansDroppedInvalidByToken @@ -160,8 +160,8 @@ sf.org.apm.numSpansDroppedOversize: description: | The number of spans Splunk APM receives that are too large to process. - * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.numSpansDroppedOversize @@ -172,8 +172,8 @@ sf.org.apm.numSpansDroppedOversizeByToken: The number of spans Splunk APM receives that are too large to process for a specific access token. 
- * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.numSpansDroppedOversizeByToken @@ -183,8 +183,8 @@ sf.org.apm.numSpansDroppedThrottle: The number of spans Splunk APM dropped after you exceeded the allowed ingest volume. Splunk APM drops spans it receives after the ingestion volume limit is reached. - * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.numSpansDroppedThrottle @@ -196,8 +196,8 @@ sf.org.apm.numSpansDroppedThrottleByToken: beyond the allowed ingest volume. Splunk APM drops spans it receives after the ingestion volume limit is reached. - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.numSpansDroppedThrottleByToken @@ -206,8 +206,8 @@ sf.org.apm.numSpansReceived: description: | The number of spans Splunk APM accepts after filtering and throttling. - * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.numSpansReceived @@ -218,8 +218,8 @@ sf.org.apm.numSpansReceivedByToken: The number of spans Splunk APM received for a specific access token after filtering and throttling. - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.numSpansReceivedByToken @@ -230,8 +230,8 @@ sf.org.apm.grossSpansReceivedByToken: The number of spans Splunk APM receives for a specific access token before filtering or throttling. 
- * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.grossSpansReceivedByToken @@ -240,8 +240,8 @@ sf.org.apm.subscription.containers: description: | The entitlement for the number of containers for your subscription plan. - * Dimension(s): `orgId` - * Data resolution: 2 minutes + * Dimension(s): `orgId` + * Data resolution: 2 minutes metric_type: gauge title: sf.org.apm.subscription.containers @@ -250,8 +250,8 @@ sf.org.apm.subscription.hosts: description: | The entitlement for the number of hosts for your subscription plan. - * Dimension(s): `orgId` - * Data resolution: 2 minutes + * Dimension(s): `orgId` + * Data resolution: 2 minutes metric_type: gauge title: sf.org.apm.subscription.hosts @@ -260,8 +260,8 @@ sf.org.apm.subscription.monitoringMetricSets: description: | The entitlement for the number of Monitoring MetricSets as part of your subscription plan. - * Dimension(s): `orgId` - * Data resolution: 2 minutes + * Dimension(s): `orgId` + * Data resolution: 2 minutes metric_type: gauge title: sf.org.apm.subscription.monitoringMetricSets @@ -270,8 +270,8 @@ sf.org.apm.subscription.spanBytes: description: | The entitlement for the number of bytes per minutes for your subscription plan. - * Dimension(s): `orgId` - * Data resolution: 2 minutes + * Dimension(s): `orgId` + * Data resolution: 2 minutes metric_type: gauge title: sf.org.apm.subscription.spanBytes @@ -280,8 +280,8 @@ sf.org.apm.subscription.traces: description: | The entitlement for the number of traces analyzed per minute (TAPM) as part of your subscription plan. 
- * Dimension(s): `orgId` - * Data resolution: 2 minutes + * Dimension(s): `orgId` + * Data resolution: 2 minutes metric_type: gauge title: sf.org.apm.subscription.traces @@ -290,8 +290,8 @@ sf.org.apm.subscription.troubleshootingMetricSets: description: | The entitlement for the number of Troubleshooting MetricSets as part of your subscription plan. - * Dimension(s): `orgId` - * Data resolution: 2 minutes + * Dimension(s): `orgId` + * Data resolution: 2 minutes metric_type: gauge title: sf.org.apm.subscription.troubleshootingMetricSets @@ -300,8 +300,8 @@ sf.org.apm.numTracesReceived: description: | The number of traces Splunk APM receives and processes. - * Dimension(s): `orgId` - * Data resolution: 1 minute + * Dimension(s): `orgId` + * Data resolution: 1 minute metric_type: counter title: sf.org.apm.numTracesReceived @@ -310,8 +310,8 @@ sf.org.apm.numTroubleshootingMetricSets10s: description: | The cardinality of Troubleshooting MetricSets for each 10-second window. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: gauge title: sf.org.apm.numTroubleshootingMetricSets10s @@ -320,8 +320,8 @@ sf.org.apm.numTroubleshootingMetricSets: description: | The cardinality of Troubleshooting MetricSets for each 1-minute time window. - * Dimension(s): `orgId` - * Data resolution: 1 minute + * Dimension(s): `orgId` + * Data resolution: 1 minute metric_type: counter title: sf.org.apm.numTroubleshootingMetricSets @@ -332,8 +332,8 @@ sf.org.apm.numContentBytesReceived: The volume of bytes Splunk APM accepts off the wire after filtering and throttling. This content could be compressed. 
- * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.numContentBytesReceived @@ -344,8 +344,8 @@ sf.org.apm.grossContentBytesReceivedByToken: The volume of bytes Splunk APM receives off the wire for a specific access token before filtering and throttling. This content could be compressed. - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.grossContentBytesReceivedByToken @@ -356,8 +356,8 @@ sf.org.apm.numContentBytesReceivedByToken: The volume of bytes Splunk APM accepts off the wire for a specific access token after filtering and throttling. This content could be compressed. - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.apm.numContentBytesReceivedByToken @@ -368,24 +368,14 @@ sf.org.abortedDetectors: or data points, whichever caused the detector to stop metric_type: cumulative title: sf.org.abortedDetectors - -sf.org.cloud.grossDpmContentBytesReceived: - brief: Number of content bytes received, but not necessarily admitted, for Cloud services - description: | - Number of content bytes received, but not necessarily admitted, for Cloud services - - * Dimension(s): `orgId` - * Data resolution: 10 seconds - metric_type: counter - title: sf.org.cloud.grossDpmContentBytesReceived sf.org.datapointsTotalCountByToken: brief: Total datapoints by token description: | Total datapoints by token - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 10 seconds + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.datapointsTotalCountByToken @@ -394,8 +384,8 @@ sf.org.datainventory.datapointsAdded: description: | Number of data points retrieved from the AWS integration - * Dimension(s): 
`type: AWS`, `service`, `region`, `integrationId` - * Data resolution: 10 seconds + * Dimension(s): `type: AWS`, `service`, `region`, `integrationId` + * Data resolution: 10 seconds metric_type: count title: sf.org.datainventory.datapointsAdded @@ -404,8 +394,8 @@ sf.org.datainventory.latestTimestamp: description: | Timestamp of the last data point retrieved from an integration - * Dimension(s): `type: AWS`, `service`, `region`, `integrationId` - * Data resolution: 10 seconds + * Dimension(s): `type: AWS`, `service`, `region`, `integrationId` + * Data resolution: 10 seconds metric_type: gauge title: sf.org.datainventory.latestTimestamp @@ -414,8 +404,8 @@ sf.org.datainventory.mtses: description: | Number of Metric Time Series retrieved from an AWS integration. It’s only generated when AWS metrics are polled, and is not available with Metric Streams - * Dimension(s): `type: AWS`, `service`, `region`, `integrationId` - * Data resolution: 10 seconds + * Dimension(s): `type: AWS`, `service`, `region`, `integrationId` + * Data resolution: 10 seconds metric_type: gauge title: sf.org.datainventory.mtses @@ -424,26 +414,26 @@ sf.org.datainventory.resources: description: | Number of AWS resources polled by an AWS integration. It's only generated when AWS metrics are polled, and is not available with Metric Streams - * Dimension(s): `type: AWS`, `service`, `region`, `integrationId` - * Data resolution: 10 seconds + * Dimension(s): `type: AWS`, `service`, `region`, `integrationId` + * Data resolution: 10 seconds metric_type: gauge title: sf.org.datainventory.resources sf.org.grossDatapointsReceived: - brief: Number of data points received, but not necessarily admitted - description: Number of data points received, but not necessarily admitted. + brief: Infrastructure Monitoring internal metric + description: sf.org.grossDatapointsReceived is reserved for Infrastructure Monitoring internal use only. 
metric_type: cumulative title: sf.org.grossDatapointsReceived sf.org.grossDatapointsReceivedByToken: - brief: Number of data points received, but not necessarily admitted, per token - description: Number of data points receuved, but not necessarily admitted, per token. + brief: Infrastructure Monitoring internal metric + description: sf.org.grossDatapointsReceivedByToken is reserved for Infrastructure Monitoring internal use only. metric_type: cumulative title: sf.org.grossDatapointsReceivedByToken sf.org.grossDpmContentBytesReceived: - brief: Number of content bytes received, but not necessarily admitted - description: Number of content bytes received, but not necessarily admitted. + brief: Infrastructure Monitoring internal metric + description: sf.org.grossDpmContentBytesReceived is reserved for Infrastructure Monitoring internal use only. metric_type: counter title: sf.org.grossDpmContentBytesReceived @@ -519,8 +509,8 @@ sf.org.log.grossContentBytesReceived: description: | The volume of bytes Splunk Log Observer receives from ingesting logs off the wire before filtering and throttling. This content can be compressed. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.grossContentBytesReceived @@ -529,8 +519,8 @@ sf.org.log.grossContentBytesReceivedByToken: description: | The volume of bytes Splunk Log Observer receives from ingesting logs off the wire for a specific access token after filtering and throttling. This content can be compressed. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.grossContentBytesReceivedByToken @@ -539,8 +529,8 @@ sf.org.log.grossMessageBytesReceived: description: | The number of bytes Splunk Log Observer receives from ingested logs after decompression but before filtering and throttling are complete. 
- * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.grossMessageBytesReceived @@ -549,8 +539,8 @@ sf.org.log.grossMessageBytesReceivedByToken: description: | The number of bytes received by Splunk Log Observer for a specific access token from ingested logs after decompression but before filtering and throttling are complete. - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 10 seconds + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.grossMessageBytesReceivedByToken @@ -559,8 +549,8 @@ sf.org.log.grossMessagesReceived: description: | The total number of log messages Splunk Log Observer receives before filtering and throttling. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.grossMessagesReceived @@ -569,8 +559,8 @@ sf.org.log.grossMessagesReceivedByToken: description: | The total number of log messages Splunk Log Observer receives for a specific access token before filtering and throttling. - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 10 seconds + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.grossMessagesReceivedByToken @@ -579,8 +569,8 @@ sf.org.log.numContentBytesReceived: description: | The volume of bytes Splunk Log Observer receives from ingesting logs off the wire after filtering and throttling. This content can be compressed. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.numContentBytesReceived @@ -589,8 +579,8 @@ sf.org.log.numContentBytesReceivedByToken: description: | The volume of bytes Splunk Log Observer receives from ingesting logs off the wire for a specific access token after filtering and throttling. 
This content can be compressed. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.numContentBytesReceivedByToken @@ -599,8 +589,8 @@ sf.org.log.numLogsDroppedIndexThrottle: description: | The number of logs Splunk Log Observer drops after the organization's allowed logs index limit threshold is met. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.numLogsDroppedIndexThrottle @@ -609,8 +599,8 @@ sf.org.log.numMessageBytesReceived: description: | The number of bytes Splunk Log Observer receives from ingested logs after decompression, filtering, and throttling are complete. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.numMessageBytesReceived @@ -619,8 +609,8 @@ sf.org.log.numMessageBytesReceivedByToken: description: | The number of bytes Splunk Log Observer receives for a specific access token from ingested logs after decompression, filtering, and throttling are complete. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.numMessageBytesReceivedByToken @@ -629,8 +619,8 @@ sf.org.log.numMessagesDroppedThrottle: description: | The number of log messages Splunk Log Observer drops after the allowed ingest volume is exceeded. Splunk Log Observer drops messages it receives after the ingestion volume is reached. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.numMessagesDroppedThrottle @@ -639,8 +629,8 @@ sf.org.log.numMessagesDroppedOversize: description: | The number of log messages Splunk Log Observer receives that are too large to process. 
- * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.numMessagesDroppedOversize @@ -649,8 +639,8 @@ sf.org.log.numMessagesDroppedOversizeByToken: description: | The number of log messages Splunk Log Observer receives that are too large to process for a specific access token. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.numMessagesDroppedOversizeByToken @@ -659,8 +649,8 @@ sf.org.log.numMessagesDroppedThrottleByToken: description: | The number of log messages Splunk Log Observer drops for a specific access token after the allowed ingest volume is exceeded. Splunk Log Observer drops messages it receives after the ingestion volume is reached. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.numMessagesDroppedThrottleByToken @@ -669,8 +659,8 @@ sf.org.log.numMessagesReceived: description: | The total number of log messages Splunk Log Observer accepts after filtering and throttling. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.numMessagesReceived @@ -679,8 +669,8 @@ sf.org.log.numMessagesReceivedByToken: description: | The total number of log messages Splunk Log Observer accepts for a specific access token after filtering and throttling. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.numMessagesReceivedByToken @@ -689,8 +679,8 @@ sf.org.log.numMessageBytesIndexed: description: | Bytes of data stored in the Log Observer index after applying pipeline management rules. 
- * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.log.numMessageBytesIndexed @@ -700,8 +690,8 @@ sf.org.num.alertmuting: Total number of alert muting rules; includes rules currently in effect and rules not currently in effect (e.g. scheduled muting rules). - * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.alertmuting @@ -709,9 +699,9 @@ sf.org.num.awsServiceAuthErrorCount: brief: Authentication errors thrown by AWS services description: | Total number authentication errors thrown by AWS services. - - * Dimension(s): `integrationId`, `orgId`, `namespace` (AWS Cloudwatch namespace),`clientInterface` (AWS SDK Interface), `method` (AWS SDK method) - * Data resolution: 1 second + + * Dimension(s): `integrationId`,`orgId`,`namespace` (AWS Cloudwatch namespace),`clientInterface` (AWS SDK Interface),`method` (AWS SDK method) + * Data resolution: 1 second metric_type: gauge title: sf.org.num.awsServiceAuthErrorCount @@ -720,9 +710,9 @@ sf.org.num.awsServiceCallCount: description: | Total number of calls made to the Amazon API. - * Dimension(s): `orgId`, `namespace` (AWS service, such as `AWS/Cloudwatch`), `method` - (the API being called, such as `getMetricStatistics`) - * Data resolution: 5 seconds + * Dimension(s): `orgId`, `namespace` (AWS service, such as `AWS/Cloudwatch`), `method` + (the API being called, such as `getMetricStatistics`) + * Data resolution: 5 seconds metric_type: gauge title: sf.org.num.awsServiceCallCount @@ -731,10 +721,10 @@ sf.org.num.awsServiceCallCountExceptions: description: | Number of calls made to the Amazon API that threw exceptions. 
- * Dimension(s): `orgId`, `namespace` (AWS service, such as AWS/Cloudwatch), `method` - (the API being called, such as `getMetricStatistics`), `location` (where the exception - occurred, either `client` or `server`) - * Data resolution: 5 seconds + * Dimension(s): `orgId`, `namespace` (AWS service, such as AWS/Cloudwatch), `method` + (the API being called, such as `getMetricStatistics`), `location` (where the exception + occurred, either `client` or `server`) + * Data resolution: 5 seconds metric_type: gauge title: sf.org.num.awsServiceCallCountExceptions @@ -744,19 +734,58 @@ sf.org.num.awsServiceCallThrottles: Number of calls made to the Amazon API that are being throttled by AWS because you have exceeded your AWS API Call limits. - * Dimension(s): `orgId`, `namespace` (AWS service, such as AWS/Cloudwatch), `method` - (the API being called, such as `getMetricStatistics`) - * Data resolution: 5 seconds + * Dimension(s): `orgId`, `namespace` (AWS service, such as AWS/Cloudwatch), `method` + (the API being called, such as `getMetricStatistics`) + * Data resolution: 5 seconds metric_type: gauge title: sf.org.num.awsServiceCallThrottles + +sf.org.num.azureMonitorClientCallCount (DEPRECATED): + brief: Number of calls made to the Azure Monitor API + description: | + THIS METRIC IS DEPRECATED AND WILL BE REMOVED. + PLEASE USE `sf.org.num.azureServiceClientCallCount` INSTEAD + Total number of calls made to the Azure Monitor API. + + * Dimension(s): `orgId`, `subscription_id`, `type` (`metric_data_sync` for metric + syncing and `metric_metadata_sync` for property syncing) + * Data resolution: 5 seconds + metric_type: counter + title: sf.org.num.azureMonitorClientCallCount + +sf.org.num.azureMonitorClientCallCountErrors (DEPRECATED): + brief: Number of calls to Azure Monitor API method that threw errors + description: | + THIS METRIC IS DEPRECATED AND WILL BE REMOVED. 
+ PLEASE USE `sf.org.num.azureServiceClientCallCountErrors` INSTEAD + Number of calls to Azure Monitor API method that threw errors. + + * Dimension(s): `orgId`, `subscription_id`, `type` (`metric_data_sync` for metric + syncing and `metric_metadata_sync` for property syncing) + * Data resolution: 1 second + metric_type: counter + title: sf.org.num.azureMonitorClientCallCountErrors + +sf.org.num.azureMonitorClientCallCountThrottles (DEPRECATED): + brief: Number of calls to Azure Monitor method that were throttled + description: | + THIS METRIC IS DEPRECATED AND WILL BE REMOVED. + PLEASE USE `sf.org.num.azureServiceClientCallCountThrottles` INSTEAD + Number of calls to Azure Monitor method that were throttled. + + * Dimension(s): `orgId`, `subscription_id`, `type` (`metric_data_sync` for metric + syncing and `metric_metadata_sync` for property syncing) + * Data resolution: 1 second + metric_type: counter + title: sf.org.num.azureMonitorClientCallCountThrottles sf.org.num.azureServiceClientCallCount: brief: Number of calls made to the Azure API description: | Total number of calls made to the Azure API. 
- * Dimension(s): `orgId`, `integrationId`, `subscriptionId`, `api` (`AzureMonitor` or `AzureResourceManager`), `method` (only for `api`=`AzureMonitor`, the API being called, such as `getTimeseries`), `resource` (only for `api`=`AzureResourceMonitor`, resource asked for, such as `microsoft.compute/virtualmachines`) - * Data resolution: 5 seconds + * Dimension(s): `orgId`, `integrationId`, `subscriptionId`, `api` (`AzureMonitor` or `AzureResourceManager`), `method` (only for `api`=`AzureMonitor`, the API being called, such as `getTimeseries`), `resource` (only for `api`=`AzureResourceMonitor`, resource asked for, such as `microsoft.compute/virtualmachines`) + * Data resolution: 5 seconds metric_type: counter title: sf.org.num.azureServiceClientCallCount @@ -765,8 +794,8 @@ sf.org.num.azureServiceClientCallCountErrors: description: | Number of calls to Azure API that threw errors. - * Dimension(s): `orgId`, `integrationId`, `subscriptionId`, `api` (`AzureMonitor` or `AzureResourceManager`), `method` (only for `api`=`AzureMonitor`, the API being called, such as `getTimeseries`), `resource` (only for `api`=`AzureResourceMonitor`, resource asked for, such as `microsoft.compute/virtualmachines`) - * Data resolution: 1 second + * Dimension(s): `orgId`, `integrationId`, `subscriptionId`, `api` (`AzureMonitor` or `AzureResourceManager`), `method` (only for `api`=`AzureMonitor`, the API being called, such as `getTimeseries`), `resource` (only for `api`=`AzureResourceMonitor`, resource asked for, such as `microsoft.compute/virtualmachines`) + * Data resolution: 1 second metric_type: counter title: sf.org.num.azureServiceClientCallCountErrors @@ -775,8 +804,8 @@ sf.org.num.azureServiceClientCallCountThrottles: description: | Number of calls to Azure API that were throttled. 
- * Dimension(s): `orgId`, `integrationId`, `subscriptionId`, `api` (`AzureMonitor` or `AzureResourceManager`), `method` (only for `api`=`AzureMonitor`, the API being called, such as `getTimeseries`), `resource` (only for `api`=`AzureResourceMonitor`, resource asked for, such as `microsoft.compute/virtualmachines`) - * Data resolution: 1 second + * Dimension(s): `orgId`, `integrationId`, `subscriptionId`, `api` (`AzureMonitor` or `AzureResourceManager`), `method` (only for `api`=`AzureMonitor`, the API being called, such as `getTimeseries`), `resource` (only for `api`=`AzureResourceMonitor`, resource asked for, such as `microsoft.compute/virtualmachines`) + * Data resolution: 1 second metric_type: counter title: sf.org.num.azureServiceClientCallCountThrottles @@ -786,8 +815,8 @@ sf.org.num.chart: Total number of charts; includes any charts created using the Infrastructure Monitoring API but not associated with a dashboard. - * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.chart @@ -798,8 +827,8 @@ sf.org.num.credentials: in Infrastructure Monitoring, for the purpose of retrieving metrics (example: AWS Cloudwatch) or sending alerts (example: PagerDuty). - * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.credentials @@ -808,8 +837,8 @@ sf.org.num.crosslink: description: | Total number of crosslinks. 
- * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.crosslink @@ -819,8 +848,8 @@ sf.org.num.dashboard: Total number of dashboards; includes all user, custom, and built-in dashboards - * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.dashboard @@ -829,8 +858,8 @@ sf.org.num.detector: description: | Total number of detectors; includes detectors that are muted. - * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.detector @@ -839,14 +868,14 @@ sf.org.numDetectorsAborted: description: | Number of detector jobs stopped because they reached a resource limit (usually MTS limit). - * Dimension(s): `orgId` - * Data resolution: 5 seconds + * Dimension(s): `orgId` + * Data resolution: 5 seconds metric_type: counter title: sf.org.numDetectorsAborted sf.org.num.detectortemplate: - brief: Number of detector templates - description: Number of detector templates. + brief: Infrastructure Monitoring internal metric + description: sf.org.num.detectortemplate is reserved for Infrastructure Monitoring internal use only. metric_type: cumulative title: sf.org.num.detectortemplate @@ -855,8 +884,8 @@ sf.org.num.dimension: description: | Number of unique dimensions across all time series. - * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.dimension @@ -867,8 +896,8 @@ sf.org.num.eventtimeseries: Number of event time series (ETS) available to be visualized in charts and detectors. 
- * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.eventtimeseries @@ -877,18 +906,54 @@ sf.org.num.eventtype: description: | Number of unique event types across all ETS. - * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.eventtype +sf.org.num.gcpStackdriverClientCallCount (DEPRECATED): + brief: Number of calls to each Stackdriver client method + description: | + THIS METRIC IS DEPRECATED AND WILL BE REMOVED. + PLEASE USE `sf.org.num.gcpServiceClientCallCount` INSTEAD + Number of calls to each Stackdriver client method. + + * Dimension(s): `orgId`, `project_id`, `method` (the API being called, such as `getTimeSeries`) + * Data resolution: 1 second + metric_type: counter + title: sf.org.num.gcpStackdriverClientCallCount + +sf.org.num.gcpStackdriverClientCallCountErrors (DEPRECATED): + brief: Number of calls to each Stackdriver client method that threw errors + description: | + THIS METRIC IS DEPRECATED AND WILL BE REMOVED. + PLEASE USE `sf.org.num.gcpServiceClientCallCountErrors` INSTEAD + Number of calls to each Stackdriver client method that threw errors. + + * Dimension(s): `orgId`, `project_id`, `method` (the API being called, such as `getTimeSeries`) + * Data resolution: 1 second + metric_type: counter + title: sf.org.num.gcpStackdriverClientCallCountErrors + +sf.org.num.gcpStackdriverClientCallCountThrottles (DEPRECATED): + brief: Number of calls to each Stackdriver client method that were throttled + description: | + THIS METRIC IS DEPRECATED AND WILL BE REMOVED. + PLEASE USE `sf.org.num.gcpServiceClientCallCountThrottles` INSTEAD + Number of calls to each Stackdriver client method that were throttled. 
+ + * Dimension(s): `orgId`, `project_id`, `method` (the API being called, such as `getTimeSeries`) + * Data resolution: 1 second + metric_type: counter + title: sf.org.num.gcpStackdriverClientCallCountThrottles + sf.org.num.gcpServiceClientCallCount: brief: Number of calls to each GCP API client method description: | Number of calls to each GCP API client method. - * Dimension(s): `orgId`, `project_id`, `method` (the API being called, such as `getTimeSeries`) - * Data resolution: 1 second + * Dimension(s): `orgId`, `project_id`, `method` (the API being called, such as `getTimeSeries`) + * Data resolution: 1 second metric_type: counter title: sf.org.num.gcpSeviceClientCallCount @@ -897,8 +962,8 @@ sf.org.num.gcpServiceClientCallCountErrors: description: | Number of calls to each GCP API client method that threw errors. - * Dimension(s): `orgId`, `project_id`, `method` (the API being called, such as `getTimeSeries`) - * Data resolution: 1 second + * Dimension(s): `orgId`, `project_id`, `method` (the API being called, such as `getTimeSeries`) + * Data resolution: 1 second metric_type: counter title: sf.org.num.gcpServiceClientCallCountErrors @@ -907,8 +972,8 @@ sf.org.num.gcpServiceClientCallCountThrottles: description: | Number of calls to each GCP API client method that were throttled. - * Dimension(s): `orgId`, `project_id`, `method` (the API being called, such as `getTimeSeries`) - * Data resolution: 1 second + * Dimension(s): `orgId`, `project_id`, `method` (the API being called, such as `getTimeSeries`) + * Data resolution: 1 second metric_type: counter title: sf.org.num.gcpServiceClientCallCountThrottles @@ -917,8 +982,8 @@ sf.org.num.metric: description: | Number of unique metrics across all MTS. 
- * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.metric @@ -928,8 +993,8 @@ sf.org.num.metrictimeseries: Number of metric time series (MTS) available to be visualized in charts and detectors. - * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.metrictimeseries @@ -940,8 +1005,8 @@ sf.org.num.migrationmarker: title: sf.org.num.migrationmarker sf.org.num.mutingactive: - brief: Number of active muting rules - description: Number of active muting rules + brief: Number of active muting rules NBED + description: Number of active muting rules NBED. metric_type: gauge title: sf.org.num.mutingactive @@ -950,8 +1015,8 @@ sf.org.num.namedtoken: description: | Number of organization access tokens, including disabled tokens. - * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.namedtoken @@ -960,8 +1025,8 @@ sf.org.num.navigator: description: | Number of options available in the sidebar in the Infrastructure Navigator. - * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.navigator @@ -983,8 +1048,8 @@ sf.org.num.orguser: Total number of members and admins associated with an organization; includes invited users who have not yet logged in. - * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.orguser @@ -994,8 +1059,8 @@ sf.org.num.page: Total number of dashboard groups; includes user, custom, and built-in dashboard groups. 
- * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.page @@ -1005,8 +1070,8 @@ sf.org.num.property: Number of properties; includes only properties you have created, not dimensions. For the latter, use ``sf.org.num.dimension``. - * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.property @@ -1014,8 +1079,8 @@ sf.org.num.role: brief: Number of roles in your system description: | Number of roles in your system. - * Dimension(s): `orgId` - * Data resolution: 63 minutes + * Dimension(s): `orgId` + * Data resolution: 63 minutes metric_type: gauge title: sf.org.num.role @@ -1042,8 +1107,8 @@ sf.org.num.tag: description: | Number of tags available for use. - * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.tag @@ -1052,8 +1117,8 @@ sf.org.num.team: description: | Number of teams in the organization. - * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.team @@ -1064,8 +1129,8 @@ sf.org.num.teammember: total number of users. For example, if team A has 30 members and team B has 20 members, the value is 50 even if there are only 30 members in the org. - * Dimension(s): `orgId` - * Data resolution: 15 minutes + * Dimension(s): `orgId` + * Data resolution: 15 minutes metric_type: gauge title: sf.org.num.teammember @@ -1088,8 +1153,8 @@ sf.org.numAddDatapointCalls: description: | Number of calls to send data points to Infrastructure Monitoring. 
- * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.numAddDatapointCalls @@ -1100,8 +1165,8 @@ sf.org.numAddDatapointCallsByToken: The sum of all the values might be less than the value of `sf.org.numAddDatapointCalls`. To learn more, see [Metrics for values by token](#metrics-for-values-by-token). - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.numAddDatapointCallsByToken @@ -1110,8 +1175,8 @@ sf.org.numAddEventsCalls: description: | Number of calls to send custom events to Infrastructure Monitoring. - * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.numAddEventsCalls @@ -1122,8 +1187,8 @@ sf.org.numAddEventsCallsByToken: The sum of all the values might be less than the value of `sf.org.numAddEventsCalls`. To learn more, see [Metrics for values by token](#metrics-for-values-by-token). - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.numAddEventsCallsByToken @@ -1132,28 +1197,28 @@ sf.org.numApmApiCalls: description: | Number of calls made to APM’s public APIs. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.numApmApiCalls sf.org.numApmBundledMetrics: - brief: Number of APM Bundled Metrics for your org + brief: APM Bundled Metrics limit for your org description: | - Number of APM Bundled Metrics for your org. + APM Bundled Metrics limit for your org. 
- * Dimension(s): `orgId` - * Data resolution: 10 minutes + * Dimension(s): `orgId` + * Data resolution: 10 minutes metric_type: gauge title: sf.org.numApmBundledMetrics sf.org.numApmBundledMetricsByToken: - brief: Number of APM Bundled Metrics for your org, for a token + brief: APM Bundled Metrics limit for your org, for a token description: | - Number of APM Bundled Metrics for your org for a specific token. + APM Bundled Metrics limit for your org for a specific token. - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 10 minutes + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 10 minutes metric_type: gauge title: sf.org.numApmBundledMetricsByToken @@ -1167,8 +1232,8 @@ sf.org.numBackfillCalls: "cumulative_counter", or "gauge". To learn more, see [Metrics with values for each metric type](#metrics-with-values-for-each-metric-type). - * Dimension(s): `category, orgId` - * Data resolution: 1 second + * Dimension(s): `category, orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.numBackfillCalls @@ -1184,8 +1249,8 @@ sf.org.numBackfillCallsByToken: The sum of all the values might be less than the value of `sf.org.numBackfillCalls`. To learn more, see [Metrics for values by token](#metrics-for-values-by-token). - * Dimension(s): `category`, `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `category`, `orgId`, `tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.numBackfillCallsByToken @@ -1204,8 +1269,8 @@ sf.org.numBadDimensionMetricTimeSeriesCreateCalls: description: | Number of calls to create MTS that have failed due to an error with dimensions. 
- * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.numBadDimensionMetricTimeSeriesCreateCalls @@ -1214,8 +1279,8 @@ sf.org.numBadDimensionMetricTimeSeriesCreateCallsByToken: description: | Number of calls to create MTS that have failed due to an error with dimensions, per token. - * Dimension(s): `orgId` - * Data resolution: 35 seconds + * Dimension(s): `orgId` + * Data resolution: 35 seconds metric_type: counter title: sf.org.numBadDimensionMetricTimeSeriesCreateCallsByToken @@ -1224,8 +1289,8 @@ sf.org.numBadMetricMetricTimeSeriesCreateCalls: description: | Number of calls to create MTS that have failed due to an error with metrics. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.numBadMetricMetricTimeSeriesCreateCalls @@ -1234,8 +1299,8 @@ sf.org.numBadMetricMetricTimeSeriesCreateCallsByToken: description: | Number of calls to create MTS that have failed due to an error with metrics, per token. - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 35 seconds + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 35 seconds metric_type: counter title: sf.org.numBadMetricMetricTimeSeriesCreateCallsByToken @@ -1244,59 +1309,20 @@ sf.org.numCustomMetrics: description: | Number of custom metrics monitored by Infrastructure Monitoring. - * Dimension(s): `orgId` - * Data resolution: 10 minutes + * Dimension(s): `orgId` + * Data resolution: 10 minutes metric_type: counter title: sf.org.numCustomMetrics -sf.org.numComputationsStarted: - brief: Rate at which you're starting new SignalFlow jobs - description: | - Number of SignalFlow computations, which mostly consist of chart views and detectors, started. Use this metric to know the rate at which you're starting new SignalFlow jobs. 
- - * Dimension(s): `orgId` - * Data resolution: 10 seconds - metric_type: counter - title: sf.org.numComputationsStarted - -sf.org.numComputationsStartedByToken: - brief: Rate at which you're starting new SignalFlow jobs, per token - description: | - One value per token. Number of SignalFlow computations, which mostly consist of chart views and detectors, started. Use this metric to know the rate at which you're starting new SignalFlow jobs. - - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 10 seconds - metric_type: counter - title: sf.org.numComputationsStartedByToken - -sf.org.numComputationsThrottled: - brief: Rate at which SignalFlow jobs are being throttled - description: | - The number of SignalFlow computations, which mostly consist of chart views and detectors, throttled because you reach the maximum number of SignalFlow jobs you can run for your organization. To learn more about this limit, see [Maximum number of SignalFlow jobs per organization](https://docs.splunk.com/observability/admin/references/system-limits/sys-limits-infra.html#maximum-number-of-signalflow-jobs-per-organization). - - * Dimension(s): `orgId` - * Data resolution: 10 seconds - metric_type: counter - title: sf.org.numComputationsThrottled -sf.org.numComputationsThrottledByToken: - brief: Rate at which SignalFlow jobs are being throttled - description: | - One value per token. The number of computations, which mostly consist of chart views and detectors, throttled because you reach the maximum number of SignalFlow jobs you can run for your organization. To learn more about this limit, see [Maximum number of SignalFlow jobs per organization](https://docs.splunk.com/observability/admin/references/system-limits/sys-limits-infra.html#maximum-number-of-signalflow-jobs-per-organization). 
- - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 10 seconds - metric_type: counter - title: sf.org.numComputationsThrottledByToken - sf.org.numCustomMetricsByToken: brief: Per token number of custom metrics monitored description: | - One value per token; number of custom metrics monitored by Splunk Observability Cloud. + One value per token; number of custom metrics monitored by Infrastructure Monitoring. The sum of all the values might be less than the value of `sf.org.numCustomMetrics`. To learn more, see [Metrics for values by token](#metrics-for-values-by-token). - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 10 minutes + * Dimension(s): `orgId, tokenId` + * Data resolution: 10 minutes metric_type: counter title: sf.org.numCustomMetricsByToken @@ -1310,8 +1336,8 @@ sf.org.numDatapointsBackfilled: named `category` with a value of "counter", "cumulative_counter", or "gauge". To learn more, see [Metrics with values for each metric type](#metrics-with-values-for-each-metric-type). - * Dimension(s): `category`, `orgId` - * Data resolution: 1 second + * Dimension(s): `category, orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.numDatapointsBackfilled @@ -1326,8 +1352,8 @@ sf.org.numDatapointsBackfilledByToken: The sum of all the values might be less than the value of `sf.org.numDatapointsBackfilled`. To learn more, see [Metrics for values by token](#metrics-for-values-by-token). - * Dimension(s): `category`, `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `category, orgId, tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.numDatapointsBackfilledByToken @@ -1336,8 +1362,8 @@ sf.org.numDatapointsDroppedBatchSize: description: | Number of data points dropped because a single request contained more than 100,000 data points. In this scenario, Observability Cloud drops data points because it perceives sending more than 100,000 data points in a single request as excessive. 
- * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.numDatapointsDroppedBatchSize @@ -1346,9 +1372,8 @@ sf.org.numDatapointsDroppedBatchSizeByToken: description: | Number of data points dropped because a single request contained more than 100,000 data points, per token. In this scenario, Observability Cloud drops data points because it perceives sending more than 100,000 data points in a single request as excessive. - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 40 seconds - + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 40 seconds metric_type: counter title: sf.org.numDatapointsDroppedBatchSizeByToken @@ -1356,11 +1381,13 @@ sf.org.numDatapointsDroppedExceededQuota: brief: Number of new data points not processed by Infrastructure Monitoring; exceeded subscription limit description: | - Total number of new data points you sent to Infrastructure Monitoring but that Infrastructure Monitoring didn't accept, because your organization exceeded its subscription limit. - To learn more, see [Exceeding your system limits](https://docs.splunk.com/Observability/admin/subscription-usage/dpm-usage.html#exceeding-your-system-limits). + Total number of new data points you sent to Infrastructure Monitoring but that Infrastructure Monitoring + didn't accept, because your organization exceeded its subscription limit. + To learn more about the process Infrastructure Monitoring uses for incoming data when you exceed subscription + limits, see [this FAQ](https://docs.signalfx.com/en/latest/_sidebars-and-includes/dpm-faq.html). 
- * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.numDatapointsDroppedExceededQuota @@ -1370,12 +1397,14 @@ sf.org.numDatapointsDroppedExceededQuotaByToken: description: | One value per token; number of new data points you sent to Infrastructure Monitoring but that Infrastructure Monitoring didn't accept, because your organization exceeded its subscription - limit. To learn more, see [Exceeding your system limits](https://docs.splunk.com/Observability/admin/subscription-usage/dpm-usage.html#exceeding-your-system-limits). - - The sum of all the values might be less than the value of `sf.org.numDatapointsDroppedExceededQuota`. To learn more, see [Metrics for values by token](#metrics-for-values-by-token). + limit. + To learn more about the process Infrastructure Monitoring uses for incoming data when you exceed subscription + limits, see [DPM Limits - FAQ](https://docs.signalfx.com/en/latest/_sidebars-and-includes/dpm-faq.html). + The sum of all the values might be less than the value of `sf.org.numDatapointsDroppedExceededQuota`. + To learn more, see [Metrics for values by token](#metrics-for-values-by-token). - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId, tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.numDatapointsDroppedExceededQuotaByToken @@ -1384,8 +1413,8 @@ sf.org.numDatapointsDroppedInvalid: description: | Number of data points dropped because they didn't follow documented guidelines for data points. For example, the metric name was too long, the metric name included unsupported characters, or the data point contained no values. 
- * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.numDatapointsDroppedInvalid @@ -1393,8 +1422,8 @@ sf.org.numDatapointsDroppedInvalidByToken: brief: Number of data points dropped because they didn't follow documented guidelines for data points. description: | Number of data points for a specific access token that are dropped because they didn't follow documented guidelines for data points. For example, the metric name was too long, the metric name included unsupported characters, or the data point contained no values. - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 10 seconds + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.numDatapointsDroppedInvalidByToken @@ -1403,8 +1432,8 @@ sf.org.numDatapointsDroppedInTimeout: description: | Number of data points Observability Cloud didn't attempt to create because your account was throttled or limited in the previous few seconds and creation was very unlikely to succeed. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.numDatapointsDroppedInTimeout @@ -1424,8 +1453,8 @@ sf.org.numDatapointsDroppedThrottle: data points for both existing and new MTS. If Infrastructure Monitoring is throttling your organization, it isn't keeping any of your data. - * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.numDatapointsDroppedThrottle @@ -1441,8 +1470,8 @@ sf.org.numDatapointsDroppedThrottleByToken: The sum of all the values might be less than the value of `sf.org.numDatapointsDroppedThrottle`. To learn more, see [Metrics for values by token](#metrics-for-values-by-token). 
- * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId, tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.numDatapointsDroppedThrottleByToken @@ -1466,7 +1495,7 @@ sf.org.numDatapointsReceivedByToken: You can have up to three MTS for this metric. To learn more, see [Metrics with values for each metric type](#metrics-with-values-for-each-metric-type). The sum of all the values might be less than the value of `sf.org.numDatapointsReceived`. To learn more, see [Metrics for values by token](#metrics-for-values-by-token). - * Data resolution: 1 second + * Data resolution: 1 second metric_type: counter title: sf.org.numDatapointsReceivedByToken @@ -1475,8 +1504,8 @@ sf.org.numDimensionObjectsCreated: description: | Total number of dimensions created. - * Dimension(s): `orgId` - * Data resolution: 10 seconds + * Dimension(s): `orgId` + * Data resolution: 10 seconds metric_type: gauge title: sf.org.numDimensionObjectsCreated @@ -1515,8 +1544,8 @@ sf.org.numEventTimeSeriesCreated: description: | Total number of event time series (ETS) created. For MTS values, see `sf.org.numMetricTimeSeriesCreated`. - * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.numEventTimeSeriesCreated @@ -1528,8 +1557,8 @@ sf.org.numEventTimeSeriesCreatedByToken: The sum of all the values might be less than the value of `sf.org.numEventTimeSeriesCreated`. To learn more, see [Metrics for values by token](#metrics-for-values-by-token). 
- * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId, tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.numEventTimeSeriesCreatedByToken @@ -1539,8 +1568,8 @@ sf.org.numEventsDroppedThrottle: Number of custom events you sent to Infrastructure Monitoring but that Infrastructure Monitoring didn't accept, because your organization exceeded its per-minute limit. - * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.numEventsDroppedThrottle @@ -1553,8 +1582,8 @@ sf.org.numEventsDroppedThrottleByToken: The sum of all the values might be less than the value of `sf.org.numEventsDroppedThrottle`. To learn more, see [Metrics for values by token](#metrics-for-values-by-token). - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId, tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.numEventsDroppedThrottleByToken @@ -1576,8 +1605,8 @@ sf.org.numHighResolutionMetrics: description: | Number of high resolution metrics monitored by Infrastructure Monitoring - * Dimension(s): `orgId` - * Data resolution: 10 minutes + * Dimension(s): `orgId` + * Data resolution: 10 minutes metric_type: counter title: sf.org.numHighResolutionMetrics @@ -1589,8 +1618,8 @@ sf.org.numHighResolutionMetricsByToken: The sum of all the values might be less than the value of `sf.org.numHighResolutionMetrics`. To learn more, see [Metrics for values by token](#metrics-for-values-by-token). - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 10 minutes + * Dimension(s): `orgId, tokenId` + * Data resolution: 10 minutes metric_type: counter title: sf.org.numHighResolutionMetricsByToken @@ -1645,8 +1674,8 @@ sf.org.numLimitedEventTimeSeriesCreateCalls: Number of event time series (ETS) that Infrastructure Monitoring was unable to create because you exceeded the maximum number of ETS allowed. 
- * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.numLimitedEventTimeSeriesCreateCalls @@ -1659,8 +1688,8 @@ sf.org.numLimitedEventTimeSeriesCreateCallsByToken: The sum of all the values might be less than the value of `sf.org.numLimitedEventTimeSeriesCreateCalls`. To learn more, [Metrics for values by token](#metrics-for-values-by-token). - * Dimension(s): `category`, `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `category`, `orgId`, `tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.numLimitedEventTimeSeriesCreateCallsByToken @@ -1669,8 +1698,8 @@ sf.org.numLimitedMetricTimeSeriesCreateCalls: description: | Number of metric time series (MTS) not created because your account reached a category limit (or subscription limit). - * Dimension(s): `category`, `orgId` - * Data resolution: 10 seconds + * Dimension(s): `category`, `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.numLimitedMetricTimeSeriesCreateCalls @@ -1685,8 +1714,8 @@ sf.org.numLimitedMetricTimeSeriesCreateCallsByCategoryType: named `category` with a value of "counter", "cumulative_counter", or "gauge". To learn more, see [Metrics with values for each metric type](#metrics-with-values-for-each-metric-type). - * Dimension(s): `categoryType`, `orgId` - * Data resolution: 10 seconds + * Dimension(s): `categoryType`, `orgId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.numLimitedMetricTimeSeriesCreateCallsByCategoryType @@ -1701,8 +1730,8 @@ sf.org.numLimitedMetricTimeSeriesCreateCallsByCategoryTypeByToken: named `category` with a value of "counter", "cumulative_counter", or "gauge". To learn more, see [Metrics for values by token](#metrics-for-values-by-token). 
- * Dimension(s): `categoryType`, `orgId`, `tokenId` - * Data resolution: 10 seconds + * Dimension(s): `categoryType`, `orgId`, `tokenId` + * Data resolution: 10 seconds metric_type: counter title: sf.org.numLimitedMetricTimeSeriesCreateCallsByCategoryTypeByToken @@ -1722,7 +1751,7 @@ sf.org.numLimitedMetricTimeSeriesCreateCallsByToken: The sum of all the values might be less than the value of `sf.org.numLimitedMetricTimeSeriesCreateCalls`. To learn more, see [Metrics for values by token](#metrics-for-values-by-token). - * Dimension(s): `category, `orgId`, `tokenId` + * Dimension(s): `category, orgId, tokenId` * Data resolution: 1 second metric_type: counter title: sf.org.numLimitedMetricTimeSeriesCreateCallsByToken @@ -1781,8 +1810,8 @@ sf.org.numMetricTimeSeriesCreated: is sent with a dimension named `category` with a value of "counter", "cumulative_counter", or "gauge". To learn more, see [Metrics with values for each metric type](#metrics-with-values-for-each-metric-type). - * Dimension(s): `category`, `orgId` - * Data resolution: 5 seconds + * Dimension(s): `category`, `orgId` + * Data resolution: 5 seconds metric_type: counter title: sf.org.numMetricTimeSeriesCreated @@ -1797,8 +1826,8 @@ sf.org.numMetricTimeSeriesCreatedByToken: The sum of all the values might be less than the value of `sf.org.numMetricTimeSeriesCreated`. To learn more, see [Metrics for values by token](#metrics-for-values-by-token). - * Dimension(s): `category`, `orgId`, `tokenId` - * Data resolution: 5 seconds + * Dimension(s): `category`, `orgId`, `tokenId` + * Data resolution: 5 seconds metric_type: counter title: sf.org.numMetricTimeSeriesCreatedByToken @@ -1816,8 +1845,8 @@ sf.org.numNewDimensions: brief: Number of new dimensions that were created description: | The number of new dimensions (key:value pairs) created. + * Data resolution: 5 minutes - metric_type: counter title: sf.org.numNewDimensions @@ -1828,8 +1857,8 @@ sf.org.numNewDimensionsByName: (key from key:value pair). 
Only the top 100 dimension names (by number of dimensions created) are included, in addition to dimension name `sf_metric`, which is always included. - * Data resolution: 5 minutes - + + * Data resolution: 5 minutes metric_type: counter title: sf.org.numNewDimensionsByName @@ -1849,8 +1878,8 @@ sf.org.numPropertyLimitedMetricTimeSeriesCreateCalls: Number of metric time series (MTS) Infrastructure Monitoring was unable to create because you reached your maximum number of unique dimension names. - * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.numPropertyLimitedMetricTimeSeriesCreateCalls @@ -1863,8 +1892,8 @@ sf.org.numPropertyLimitedMetricTimeSeriesCreateCallsByToken: The sum of all the values might be less than the value of `sf.org.numPropertyLimitedMetricTimeSeriesCreateCalls`. To learn more, see [Metrics for values by token](#metrics-for-values-by-token). - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId`, `tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.numPropertyLimitedMetricTimeSeriesCreateCallsByToken @@ -1876,8 +1905,8 @@ sf.org.numResourceMetrics: The `resourceType` dimension indicates whether the value represents hosts or containers. - * Dimension(s): `orgId`, `resourceType` - * Data resolution: 10 minutes + * Dimension(s): `orgId`, `resourceType` + * Data resolution: 10 minutes metric_type: gauge title: sf.org.numResourceMetrics @@ -1890,7 +1919,6 @@ sf.org.numResourceMetricsbyToken: * Dimension(s): `orgId`, `resourceType`, `tokenId` * Data resolution: 10 minutes - metric_type: gauge title: sf.org.numResourceMetricsbyToken @@ -1900,8 +1928,8 @@ sf.org.numResourcesMonitored: Number of hosts or containers that Infrastructure Monitoring is currently monitoring. The `resourceType` dimension indicates whether the value represents hosts or containers. 
- * Dimension(s): `orgId`, `resourceType` - * Data resolution: 10 minutes + * Dimension(s): `orgId`, `resourceType` + * Data resolution: 10 minutes metric_type: counter title: sf.org.numResourcesMonitored @@ -1914,8 +1942,8 @@ sf.org.numResourcesMonitoredByToken: The sum of all the values might be less than the value of `sf.org.numResourcesMonitored`. To learn more, see [Metrics for values by token](#metrics-for-values-by-token). - * Dimension(s): `orgId`, `resourceType`, `tokenId` - * Data resolution: 10 minutes + * Dimension(s): `orgId`, `resourceType`, tokenId` + * Data resolution: 10 minutes metric_type: counter title: sf.org.numResourcesMonitoredByToken @@ -1924,8 +1952,8 @@ sf.org.numRestCalls: description: | Number of REST calls made to the Infrastructure Monitoring API. - * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second" metric_type: counter title: sf.org.numRestCalls @@ -1935,8 +1963,8 @@ sf.org.numRestCallsThrottled: Number of REST calls you made to the Infrastructure Monitoring API that were not accepted by Infrastructure Monitoring, because your organization significantly exceeded its per-minute limit. - * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.numRestCallsThrottled @@ -1950,8 +1978,8 @@ sf.org.numRestCallsThrottledByToken: The sum of all the values might be less than the value of `sf.org.numRestCallsThrottled`. To learn more, see [Metrics for values by token](#metrics-for-values-by-token). 
- * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId, tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.numRestCallsThrottledByToken @@ -1962,8 +1990,8 @@ sf.org.numThrottledEventTimeSeriesCreateCalls: Total number of event time series (ETS) that Infrastructure Monitoring was unable to create, because you significantly exceeded your per-minute event creation limit. - * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.numThrottledEventTimeSeriesCreateCalls @@ -1977,8 +2005,8 @@ sf.org.numThrottledEventTimeSeriesCreateCallsByToken: The sum of all the values might be less than the value of `sf.org.numThrottledEventTimeSeriesCreateCalls`. To learn more, see [Metrics for values by token](#metrics-for-values-by-token). - * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId, tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.numThrottledEventTimeSeriesCreateCallsByToken @@ -1989,8 +2017,8 @@ sf.org.numThrottledMetricTimeSeriesCreateCalls: Number of metric time series (MTS) that Infrastructure Monitoring was unable to create because you significantly exceeded your per-minute or per-hour MTS creation limit. - * Dimension(s): `orgId` - * Data resolution: 1 second + * Dimension(s): `orgId` + * Data resolution: 1 second metric_type: counter title: sf.org.numThrottledMetricTimeSeriesCreateCalls @@ -2004,8 +2032,8 @@ sf.org.numThrottledMetricTimeSeriesCreateCallsByToken: The sum of all the values might be less than the value of `sf.org.numThrottledMetricTimeSeriesCreateCalls`. To learn more, see [Metrics for values by token](#metrics for values by token). 
- * Dimension(s): `orgId`, `tokenId` - * Data resolution: 1 second + * Dimension(s): `orgId, tokenId` + * Data resolution: 1 second metric_type: counter title: sf.org.numThrottledMetricTimeSeriesCreateCallsByToken @@ -2013,7 +2041,7 @@ sf.org.numUniqueNamesInNewDimensions: brief: Number of unique dimension names that were created description: | The number of unique dimension names (keys) created in all new dimensions - * Data resolution: 5 minutes + * Data resolution: 5 minutes metric_type: counter title: sf.org.numUniqueNamesInNewDimensions @@ -2024,7 +2052,7 @@ sf.org.subscription.activeTimeSeries: The number of active MTS is the total number of MTS that have received at least one data point within a moving window of the last 25 hours. - * Data resolution: 15 minutes + * Data resolution: 15 minutes metric_type: gauge title: sf.org.subscription.activeTimeSeries @@ -2034,7 +2062,7 @@ sf.org.subscription.datapointsPerMinute: Maximum number of data points per minute (DPM) that Infrastructure Monitoring will process and store. - * Data resolution: 15 minutes + * Data resolution: 15 minutes metric_type: gauge title: sf.org.subscription.datapointsPerMinute @@ -2044,7 +2072,7 @@ sf.org.subscription.containers: Number of containers included in the subscription. - * Data resolution: 10 seconds + * Data resolution: 10 seconds metric_type: gauge title: sf.org.subscription.containers @@ -2053,7 +2081,7 @@ sf.org.subscription.customMetrics: description: | Number of custom metric time series (MTS) included in the subscription. - * Data resolution: 10 seconds + * Data resolution: 10 seconds metric_type: gauge title: sf.org.subscription.customMetrics @@ -2062,7 +2090,7 @@ sf.org.subscription.highResolutionMetrics: description: | Number of high resolution metric time series (MTS) included in the subscription. 
- * Data resolution: 10 seconds + * Data resolution: 10 seconds metric_type: gauge title: sf.org.subscription.highResolutionMetrics @@ -2071,7 +2099,7 @@ sf.org.subscription.hosts: description: | Number of hosts included in the subscription. - * Data resolution: 10 seconds + * Data resolution: 10 seconds metric_type: gauge title: sf.org.subscription.hosts @@ -2080,7 +2108,7 @@ sf.org.subscription.function: description: | Number of serverless functions included in the subscription. - * Data resolution: 10 seconds + * Data resolution: 10 seconds metric_type: gauge title: sf.org.subscription.function @@ -2096,9 +2124,9 @@ sf.org.rum.numSpansReceived: description: | Number of spans received. - * Dimension: `orgId` - * Data resolution: 10 seconds - metric_type: counter + * Dimension: orgId + * Data resolution: 10 seconds + metric_type: Counter title: sf.org.rum.numSpansReceived sf.org.rum.numSpansReceivedByToken: @@ -2108,19 +2136,20 @@ sf.org.rum.numSpansReceivedByToken: The number of spans Splunk RUM received for a specific access token after filtering and throttling. - * Dimension: `orgId`, `tokenId` + * Dimension: orgId, tokenId * Data resolution: 10 seconds - metric_type: counter + metric_type: Counter title: sf.org.rum.numSpansReceivedByToken + sf.org.rum.numAddSpansCalls: brief: The number of calls to the `/v1/rum?auth={AUTH_TOKEN}` endpoint. description: | The number of calls to the `/v1/rum?auth={AUTH_TOKEN}` endpoint. - * Dimension: `orgId` - * Data resolution: 10 seconds - metric_type: counter + * Dimension: orgId + * Data resolution: 10 seconds + metric_type: Counter title: sf.org.rum.numAddSpansCalls sf.org.rum.numAddSpansCallsByToken: @@ -2128,9 +2157,9 @@ sf.org.rum.numAddSpansCallsByToken: description: | The number of calls to the `/v1/rum?auth={AUTH_TOKEN}` endpoint for a specific access token. 
- * Dimension: `orgId`, `tokenId` - * Data resolution: 10 seconds - metric_type: counter + * Dimension: orgId, tokenId + * Data resolution: 10 seconds + metric_type: Counter title: sf.org.rum.numAddSpansCalls sf.org.rum.numSpansDroppedInvalid: @@ -2138,9 +2167,9 @@ sf.org.rum.numSpansDroppedInvalid: description: | Number of spans dropped because they were invalid. Some of the reasons that spans are invalid are: spans are too large, missing required tags and invalid Trace IDs. Look through the reason column in the data table to see details for specific spans. - * Dimension: `orgId`, `reason` - * Data resolution: 10 seconds - metric_type: counter + * Dimension: orgId, reason, + * Data resolution: 10 seconds + metric_type: Counter title: sf.org.rum.numSpansDroppedInvalid sf.org.rum.numSpansDroppedInvalidByToken: @@ -2150,10 +2179,10 @@ sf.org.rum.numSpansDroppedInvalidByToken: The number of invalid spans Splunk RUM receives for a specific access token. Some of the reasons that spans are invalid are: spans are too large, missing required tags and invalid Trace IDs. Look through the reason column in the data table to see details for specific spans. - * Dimensions: `orgId`, `tokenId`, `reason` - * Data resolution: 10 seconds + * Dimensions: orgId, tokenId, reason + * Data resolution: 10 seconds - metric_type: counter + metric_type: Counter title: sf.org.rum.numSpansDroppedInvalidByToken sf.org.rum.numSpansDroppedThrottle: @@ -2161,9 +2190,9 @@ sf.org.rum.numSpansDroppedThrottle: description: | The number of spans Splunk RUM dropped after you exceeded the allowed ingest volume. 
- * Dimension: `orgId` - * Data resolution: 10 seconds - metric_type: counter + * Dimension: orgId + * Data resolution: 10 seconds + metric_type: Counter title: sf.org.rum.numSpansDroppedThrottle sf.org.rum.numSpansDroppedThrottleByToken: @@ -2171,9 +2200,9 @@ sf.org.rum.numSpansDroppedThrottleByToken: description: | The number of spans Splunk RUM dropped after you exceeded the allowed ingest volume for a specific access token. - * Dimension: `orgId`, `tokenId` - * Data resolution: 10 seconds - metric_type: counter + * Dimension: orgId, tokenId + * Data resolution: 10 seconds + metric_type: Counter title: sf.org.rum.numSpansDroppedThrottleByToken @@ -2182,9 +2211,9 @@ sf.org.rum.numSpanBytesReceived: description: | The bytes of all the spans accepted and counted by the metric numSpansReceived. - * Dimensions: `orgId` - * Data resolution: 10 seconds - metric_type: counter + * Dimensions: orgId + * Data resolution: 10 seconds + metric_type: Counter title: sf.org.rum.numSpanBytesReceived sf.org.rum.numSpanBytesReceivedByToken: @@ -2192,29 +2221,30 @@ sf.org.rum.numSpanBytesReceivedByToken: description: | The bytes of all the spans accepted and counted by the metric numSpansReceived by token. - * Dimensions: `orgId`, `tokenId` - * Data resolution: 10 seconds - metric_type: counter + * Dimensions: orgId, tokenId + * Data resolution: 10 seconds + metric_type: Counter title: sf.org.rum.numSpanBytesReceivedByToken + sf.org.rum.grossSpanBytesReceivedByToken: - brief: The bytes (uncompressed) of all the spans received and counted by the metric grossSpansReceived before throttling or filtering by token. - description: | - The bytes (uncompressed) of all the spans received and counted by the metric grossSpansReceived before throttling or filtering by token. + brief: The bytes (uncompressed) of all the spans received and counted by the metric grossSpansReceived before throttling or filtering by token. 
+ description: | + The bytes (uncompressed) of all the spans received and counted by the metric grossSpansReceived before throttling or filtering by token. - * Dimensions: `orgId`, `tokenId` - * Data resolution: 10 seconds - metric_type: counter - title: sf.org.rum.grossSpanBytesReceivedByToken + * Dimensions: orgId, tokenId + * Data resolution: 10 seconds + metric_type: Counter + title: sf.org.rum.grossSpanBytesReceivedByToken sf.org.rum.numContentBytesReceived: brief: The volume of bytes Splunk RUM receives after filtering and throttling. description: | The volume of bytes Splunk RUM receives after filtering and throttling. - * Dimensions: `orgId` - * Data resolution: 10 seconds - metric_type: counter + * Dimensions: orgId + * Data resolution: 10 seconds + metric_type: Counter title: sf.org.rum.numContentBytesReceived sf.org.rum.grossContentBytesReceivedByToken: @@ -2222,29 +2252,19 @@ sf.org.rum.grossContentBytesReceivedByToken: description: | The possibly compressed wire size of the payloads before the payloads are decompressed and decoded by token. - * Dimensions: `orgId`, `tokenId` - * Data resolution: 10 seconds - metric_type: counter + * Dimensions: orgId, tokenId + * Data resolution: 10 seconds + metric_type: Counter title: sf.org.rum.grossContentBytesReceivedByToken + sf.org.rum.grossContentBytesReceived: brief: The possibly compressed wire size of the payloads before the payloads are decompressed and decoded. description: | The possibly compressed wire size of the payloads before the payloads are decompressed and decoded. - * Dimensions: `orgId` - * Data resolution: 10 seconds + * Dimensions: orgId + * Data resolution: 10 seconds - metric_type: counter + metric_type: Counter title: sf.org.rum.grossContentBytesReceived - - sf.org.rum.grossReplayContentBytesReceived: - brief: The possibly compressed wire size of the payloads before the payloads are decompressed and decoded, per RUM session replay. 
- description: | - The possibly compressed wire size of the payloads before the payloads are decompressed and decoded, per RUM session replay. - - * Dimensions: `orgId` - * Data resolution: 10 seconds - - metric_type: counter - title: sf.org.rum.grossReplayContentBytesReceived diff --git a/solr/SMART_AGENT_MONITOR.md b/solr/SMART_AGENT_MONITOR.md index b8f8db91b..13dee5625 100644 --- a/solr/SMART_AGENT_MONITOR.md +++ b/solr/SMART_AGENT_MONITOR.md @@ -12,8 +12,8 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitor `collectd/solr`. -Below is an overview of that monitor.** +This integration primarily consists of the Smart Agent monitor `collectd/solr`. +Below is an overview of that monitor. ### Smart Agent Monitor @@ -54,7 +54,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | @@ -140,15 +140,15 @@ monitors` after configuring this monitor in a running agent instance. ### Legacy non-default metrics (version < 4.7.0) -**The following information only applies to agent version older than 4.7.0. If +**The following information only applies to agent versions prior to 4.7.0. If you have a newer agent and have set `enableBuiltInFiltering: true` at the top level of your agent config, see the section above. 
See upgrade instructions in -[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).** +[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).** If you have a reference to the `whitelist.json` in your agent's top-level `metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just +that allow list, then you need to add an item to the top-level +`metricsToInclude` config option to override that allow list (see [Inclusion +filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just copy the whitelist.json, modify it, and reference that in `metricsToExclude`. diff --git a/spark/SMART_AGENT_MONITOR.md b/spark/SMART_AGENT_MONITOR.md index 78d7ef3d3..425df4988 100644 --- a/spark/SMART_AGENT_MONITOR.md +++ b/spark/SMART_AGENT_MONITOR.md @@ -12,34 +12,36 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitor `collectd/spark`. -Below is an overview of that monitor.** +This integration primarily consists of the Smart Agent monitor `collectd/spark`. +Below is an overview of that monitor. ### Smart Agent Monitor -Collects metrics about a Spark cluster using the [collectd Spark Python +This integration collects metrics about a Spark cluster using the [collectd Spark Python plugin](https://github.com/signalfx/collectd-spark). 
That plugin collects metrics from Spark cluster and instances by hitting endpoints specified in Spark's [Monitoring and Instrumentation documentation](https://spark.apache.org/docs/latest/monitoring.html) under `REST API` and `Metrics`. -We currently only support cluster modes Standalone, Mesos, and Hadoop Yarn -via HTTP endpoints. +The following cluster modes are supported only through HTTP endpoints: +- Standalone +- Mesos +- Hadoop YARN -You have to specify distinct monitor configurations and discovery rules for +You must specify distinct monitor configurations and discovery rules for master and worker processes. For the master configuration, set `isMaster` to true. -When running Spark on Apache Hadoop / Yarn, this integration is only capable -of reporting application metrics from the master node. Please use the +When running Spark on Apache Hadoop / YARN, this integration is only capable +of reporting application metrics from the master node. Use the collectd/hadoop monitor to report on the health of the cluster. ### Example config: -An example configuration for monitoring applications on Yarn +An example configuration for monitoring applications on YARN ```yaml monitors: - type: collectd/spark @@ -63,7 +65,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | @@ -191,16 +193,16 @@ monitors` after configuring this monitor in a running agent instance. ### Legacy non-default metrics (version < 4.7.0) -**The following information only applies to agent version older than 4.7.0. If +**The following information only applies to agent versions prior to 4.7.0. 
If you have a newer agent and have set `enableBuiltInFiltering: true` at the top level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).** +[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).** If you have a reference to the `whitelist.json` in your agent's top-level `metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just +that allow list, then you need to add an item to the top-level +`metricsToInclude` config option to override that allow list (see [Inclusion +filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just copy the whitelist.json, modify it, and reference that in `metricsToExclude`. ## Dimensions diff --git a/statsd/SMART_AGENT_MONITOR.md b/statsd/SMART_AGENT_MONITOR.md index 90d746ac7..224094bd5 100644 --- a/statsd/SMART_AGENT_MONITOR.md +++ b/statsd/SMART_AGENT_MONITOR.md @@ -12,26 +12,26 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitor `statsd`. -Below is an overview of that monitor.** +This integration primarily consists of the Smart Agent monitor `statsd`. +Below is an overview of that monitor. ### Smart Agent Monitor This monitor will receive and aggergate Statsd metrics and convert them to -datapoints. It listens on a configured address and port in order to +data points. 
It listens on a configured address and port in order to receive the statsd metrics. Note that this monitor does not support statsd extensions such as tags. -The monitor supports the `Counter`, `Timer`, `Gauge` and `Set` types which -are dispatched as the SignalFx types `counter`, `gauge`, `gauge` and +The monitor supports the `Counter`, `Timer`, `Gauge`, and `Set` types, which +are dispatched as the SignalFx types `counter`, `gauge`, `gauge`, and `gauge` respectively. -**Note that datapoints will get a `host` dimension of the current host that +**Note:** Data points get a `host` dimension of the current host that the agent is running on, not the host from which the statsd metric was sent. For this reason, it is recommended to send statsd metrics to a local agent instance. If you don't want the `host` dimension, you can set -`disableHostDimensions: true` on the monitor configuration** +`disableHostDimensions: true` on the monitor configuration. #### Verifying installation @@ -43,11 +43,14 @@ in SignalFx that the metric arrived (assuming the default config). $ echo "statsd.test:1|g" | nc -w 1 -u 127.0.0.1 8125 ``` +For Kubernetes environments, use the `status.hostIP` environment variable to verify the installation. This environment variable +is the IP address of the node where the pod is running. See [Expose Pod Information to Containers Through Files](https://kubernetes.io/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information/). + #### Adding dimensions to StatsD metrics The StatsD monitor can parse keywords from a statsd metric name by a set of -converters that was configured by user. +converters previously configured by the user. ``` converters: @@ -55,11 +58,11 @@ converters: ... ``` -This converter will parse `traffic`, `mesh`, `service` and `action` as dimensions +This parses `traffic`, `mesh`, `service`, and `action` as dimensions from a metric name `cluster.cds_egress_ecommerce-demo-mesh_gateway-vn_tcp_8080.update_success`. 
-If a section has only a pair of brackets without a name, it will not capture a dimension. +If a section has only a pair of brackets without a name, it does not capture a dimension. -When multiple converters were provided, a metric will be converted by the first converter with a +If multiple converters are provided, a metric is converted by the first converter with a matching pattern to the metric name. @@ -73,11 +76,11 @@ converters: metricName: "{traffic}.{action}" ``` -The metrics which match to the given pattern will be reported to SignalFx as `{traffic}.{action}`. -For instance, metric `cluster.cds_egress_ecommerce-demo-mesh_gateway-vn_tcp_8080.update_success` -will be reported as `egress.update_success`. +The metrics that match to the given pattern are reported to SignalFx as `{traffic}.{action}`. +For instance, metric name `cluster.cds_egress_ecommerce-demo-mesh_gateway-vn_tcp_8080.update_success` +is reported as `egress.update_success`. -`metricName` is required for a converter configuration. A converter will be +`metricName` is required for a converter configuration. A converter is disabled if `metricName` is not provided. @@ -93,7 +96,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | diff --git a/zookeeper/SMART_AGENT_MONITOR.md b/zookeeper/SMART_AGENT_MONITOR.md index 12d9a237e..71d50a175 100644 --- a/zookeeper/SMART_AGENT_MONITOR.md +++ b/zookeeper/SMART_AGENT_MONITOR.md @@ -12,8 +12,8 @@ configuration instructions below. ## Description -**This integration primarily consists of the Smart Agent monitor `collectd/zookeeper`. 
-Below is an overview of that monitor.** +This integration primarily consists of the Smart Agent monitor `collectd/zookeeper`. +Below is an overview of that monitor. ### Smart Agent Monitor @@ -36,7 +36,7 @@ monitors: # All monitor config goes under this key ``` **For a list of monitor options that are common to all monitors, see [Common -Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).** +Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).** | Config option | Required | Type | Description | @@ -72,6 +72,14 @@ These are the metrics available for this integration. - ***`gauge.zk_watch_count`*** (*gauge*)
Number of watches placed on Z-Nodes on a ZooKeeper server - ***`gauge.zk_znode_count`*** (*gauge*)
Number of z-nodes that a ZooKeeper server has in its data tree +#### Group leader +All of the following metrics are part of the `leader` metric group. All of +the non-default metrics below can be turned on by adding `leader` to the +monitor config option `extraGroups`: + - `gauge.zk_followers` (*gauge*)
Number of followers within the ensemble. Only exposed by the leader. + - `gauge.zk_pending_syncs` (*gauge*)
Number of pending syncs from the followers. Only exposed by the leader. + - `gauge.zk_synced_followers` (*gauge*)
Number of synced followers. Only exposed by the leader. + ### Non-default metrics (version 4.7.0+) **The following information applies to the agent version 4.7.0+ that has @@ -87,15 +95,15 @@ monitors` after configuring this monitor in a running agent instance. ### Legacy non-default metrics (version < 4.7.0) -**The following information only applies to agent version older than 4.7.0. If +**The following information only applies to agent versions prior to 4.7.0. If you have a newer agent and have set `enableBuiltInFiltering: true` at the top level of your agent config, see the section above. See upgrade instructions in -[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).** +[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).** If you have a reference to the `whitelist.json` in your agent's top-level `metricsToExclude` config option, and you want to emit metrics that are not in -that whitelist, then you need to add an item to the top-level -`metricsToInclude` config option to override that whitelist (see [Inclusion -filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just +that allow list, then you need to add an item to the top-level +`metricsToInclude` config option to override that allow list (see [Inclusion +filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just copy the whitelist.json, modify it, and reference that in `metricsToExclude`. 
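
For reference, the `leader` metric group added to `zookeeper/SMART_AGENT_MONITOR.md` above could be enabled with a monitor entry along the following lines. This is a minimal sketch, not taken from the diff itself: the `host` and `port` values are placeholder assumptions for a local ZooKeeper instance, and only `extraGroups` and the group name `leader` come from the documented text.

```yaml
monitors:
  - type: collectd/zookeeper
    host: 127.0.0.1   # assumed address of the ZooKeeper server
    port: 2181        # assumed client port; adjust to your deployment
    # Turns on the non-default metrics in the `leader` group:
    # gauge.zk_followers, gauge.zk_pending_syncs, gauge.zk_synced_followers
    extraGroups:
      - leader
```

Because these three metrics are only exposed by the leader, a monitor pointed at a follower node would not be expected to report them even with the group enabled.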
diff --git a/zookeeper/metrics.yaml b/zookeeper/metrics.yaml index 9d7c7e041..100ca7a21 100644 --- a/zookeeper/metrics.yaml +++ b/zookeeper/metrics.yaml @@ -48,6 +48,14 @@ gauge.zk_ephemerals_count: monitor: collectd/zookeeper title: gauge.zk_ephemerals_count +gauge.zk_followers: + brief: Number of followers within the ensemble + custom: true + description: Number of followers within the ensemble. Only exposed by the leader. + metric_type: gauge + monitor: collectd/zookeeper + title: gauge.zk_followers + gauge.zk_is_leader: brief: 1 if the node is a leader, 0 if the node is a follower custom: true @@ -104,6 +112,14 @@ gauge.zk_outstanding_requests: monitor: collectd/zookeeper title: gauge.zk_outstanding_requests +gauge.zk_pending_syncs: + brief: Number of pending syncs from the followers + custom: true + description: Number of pending syncs from the followers. Only exposed by the leader. + metric_type: gauge + monitor: collectd/zookeeper + title: gauge.zk_pending_syncs + gauge.zk_service_health: brief: 1 if server is running, otherwise 0 custom: true @@ -112,6 +128,14 @@ gauge.zk_service_health: monitor: collectd/zookeeper title: gauge.zk_service_health +gauge.zk_synced_followers: + brief: Number of synced followers + custom: true + description: Number of synced followers. Only exposed by the leader. + metric_type: gauge + monitor: collectd/zookeeper + title: gauge.zk_synced_followers + gauge.zk_watch_count: brief: Number of watches placed on Z-Nodes on a ZooKeeper server custom: false