Revert "Merge to master (#1044)"
This reverts commit 287a4f2.
aurbiztondo-splunk authored Jan 9, 2024
1 parent 287a4f2 commit ee869f7
Showing 169 changed files with 4,908 additions and 3,125 deletions.
16 changes: 8 additions & 8 deletions activemq/SMART_AGENT_MONITOR.md
@@ -12,8 +12,8 @@ configuration instructions below.

## Description

**This integration primarily consists of the Smart Agent monitor `collectd/activemq`.
Below is an overview of that monitor.**
This integration primarily consists of the Smart Agent monitor `collectd/activemq`.
Below is an overview of that monitor.

### Smart Agent Monitor

@@ -55,7 +55,7 @@ monitors: # All monitor config goes under this key
```
**For a list of monitor options that are common to all monitors, see [Common
Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).**
Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).**
| Config option | Required | Type | Description |
@@ -166,15 +166,15 @@ monitors` after configuring this monitor in a running agent instance.

### Legacy non-default metrics (version < 4.7.0)

**The following information only applies to agent version older than 4.7.0. If
**The following information only applies to agent versions prior to 4.7.0. If
you have a newer agent and have set `enableBuiltInFiltering: true` at the top
level of your agent config, see the section above. See upgrade instructions in
[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).**
[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).**

If you have a reference to the `whitelist.json` in your agent's top-level
`metricsToExclude` config option, and you want to emit metrics that are not in
that whitelist, then you need to add an item to the top-level
`metricsToInclude` config option to override that whitelist (see [Inclusion
filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
that allow list, then you need to add an item to the top-level
`metricsToInclude` config option to override that allow list (see [Inclusion
filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering)). Or you can just
copy the whitelist.json, modify it, and reference that in `metricsToExclude`.

91 changes: 63 additions & 28 deletions cassandra/SMART_AGENT_MONITOR.md
@@ -12,15 +12,15 @@ configuration instructions below.

## Description

**This integration primarily consists of the Smart Agent monitor `collectd/cassandra`.
Below is an overview of that monitor.**
This integration primarily consists of the Smart Agent monitor `collectd/cassandra`.
Below is an overview of that monitor.

### Smart Agent Monitor


Monitors Cassandra using the Collectd GenericJMX plugin. This is
essentially a wrapper around the
[collectd-genericjmx](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/./collectd-genericjmx.md)[](sfx_link:java) monitor that comes with a
[collectd-genericjmx](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/./collectd-genericjmx.md)[](sfx_link:java) monitor that comes with a
set of predefined MBean definitions that a standard Cassandra deployment
will expose.

@@ -47,7 +47,7 @@ monitors: # All monitor config goes under this key
```

**For a list of monitor options that are common to all monitors, see [Common
Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).**
Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).**


| Config option | Required | Type | Description |
@@ -97,6 +97,18 @@ Metrics that are categorized as

These are the metrics available for this integration.

- `counter.cassandra.ClientRequest.CASRead.Latency.Count` (*cumulative*)<br> Count of transactional read operations since server start.
- `counter.cassandra.ClientRequest.CASRead.TotalLatency.Count` (*cumulative*)<br> The total number of microseconds elapsed in servicing client transactional read requests.

It can be divided by `counter.cassandra.ClientRequest.CASRead.Latency.Count`
to find the real time transactional read latency.

- `counter.cassandra.ClientRequest.CASWrite.Latency.Count` (*cumulative*)<br> Count of transactional write operations since server start.
- `counter.cassandra.ClientRequest.CASWrite.TotalLatency.Count` (*cumulative*)<br> The total number of microseconds elapsed in servicing client transactional write requests.

It can be divided by `counter.cassandra.ClientRequest.CASWrite.Latency.Count`
to find the real time transactional write latency.

- ***`counter.cassandra.ClientRequest.RangeSlice.Latency.Count`*** (*cumulative*)<br> Count of range slice operations since server start. This typically indicates a server overload condition.

If this value is increasing across the cluster then the cluster is too small for the application range slice load.
@@ -114,14 +126,15 @@ These are the metrics available for this integration.
- one or more clients are directing more load to this server than the others
- the server is experiencing hardware or software issues and may require maintenance.

- `counter.cassandra.ClientRequest.RangeSlice.TotalLatency.Count` (*cumulative*)<br> The total number of microseconds elapsed in servicing range slice requests.
- ***`counter.cassandra.ClientRequest.RangeSlice.Unavailables.Count`*** (*cumulative*)<br> Count of range slice unavailables since server start. A non-zero value
means that insufficient replicas were available to fulfil a range slice
request at the requested consistency level.

This typically means that one or more nodes are down. To fix this condition,
any down nodes must be restarted, or removed from the cluster.

- ***`counter.cassandra.ClientRequest.Read.Latency.Count`*** (*cumulative*)<br> Count of read operations since server start
- ***`counter.cassandra.ClientRequest.Read.Latency.Count`*** (*cumulative*)<br> Count of read operations since server start.
- ***`counter.cassandra.ClientRequest.Read.Timeouts.Count`*** (*cumulative*)<br> Count of read timeouts since server start. This typically indicates a server overload condition.

If this value is increasing across the cluster then the cluster is too small for the application read load.
@@ -130,6 +143,11 @@ These are the metrics available for this integration.
- one or more clients are directing more load to this server than the others
- the server is experiencing hardware or software issues and may require maintenance.

- `counter.cassandra.ClientRequest.Read.TotalLatency.Count` (*cumulative*)<br> The total number of microseconds elapsed in servicing client read requests.

It can be divided by `counter.cassandra.ClientRequest.Read.Latency.Count`
to find the real time read latency.

- ***`counter.cassandra.ClientRequest.Read.Unavailables.Count`*** (*cumulative*)<br> Count of read unavailables since server start. A non-zero value means
that insufficient replicas were available to fulfil a read request at
the requested consistency level. This typically means that one or more
@@ -145,6 +163,11 @@ These are the metrics available for this integration.
- one or more clients are directing more load to this server than the others
- the server is experiencing hardware or software issues and may require maintenance.

- `counter.cassandra.ClientRequest.Write.TotalLatency.Count` (*cumulative*)<br> The total number of microseconds elapsed in servicing client write requests.

It can be divided by `counter.cassandra.ClientRequest.Write.Latency.Count`
to find the real time write latency.

- ***`counter.cassandra.ClientRequest.Write.Unavailables.Count`*** (*cumulative*)<br> Count of write unavailables since server start. A non-zero value means
that insufficient replicas were available to fulfil a write request at
the requested consistency level.
@@ -157,6 +180,34 @@ These are the metrics available for this integration.
not increase steadily over time then the node may be experiencing
problems completing compaction operations.

- `counter.cassandra.Storage.Exceptions.Count` (*cumulative*)<br> Number of internal exceptions caught. Under normal conditions this should be zero.

- ***`counter.cassandra.Storage.Load.Count`*** (*cumulative*)<br> Storage used for Cassandra data in bytes. Use this metric to see how much storage is being used for data by a Cassandra node.

The value of this metric is influenced by:
- Total data stored into the database
- compaction behavior

- `counter.cassandra.Storage.TotalHints.Count` (*cumulative*)<br> Total hints since node start. Indicates that write operations cannot be
delivered to a node, usually because a node is down. If this value is
increasing and all nodes are up then there may be some connectivity
issue between nodes in the cluster.

- ***`counter.cassandra.Storage.TotalHintsInProgress.Count`*** (*cumulative*)<br> Total pending hints. Indicates that write operations cannot be
delivered to a node, usually because a node is down. If this value is
increasing and all nodes are up then there may be some connectivity
issue between nodes in the cluster.

- `gauge.cassandra.ClientRequest.CASRead.Latency.50thPercentile` (*gauge*)<br> 50th percentile (median) of Cassandra transactional read latency.

- `gauge.cassandra.ClientRequest.CASRead.Latency.99thPercentile` (*gauge*)<br> 99th percentile of Cassandra transactional read latency.

- `gauge.cassandra.ClientRequest.CASRead.Latency.Max` (*gauge*)<br> Maximum Cassandra transactional read latency.
- `gauge.cassandra.ClientRequest.CASWrite.Latency.50thPercentile` (*gauge*)<br> 50th percentile (median) of Cassandra transactional write latency.

- `gauge.cassandra.ClientRequest.CASWrite.Latency.99thPercentile` (*gauge*)<br> 99th percentile of Cassandra transactional write latency.

- `gauge.cassandra.ClientRequest.CASWrite.Latency.Max` (*gauge*)<br> Maximum Cassandra transactional write latency.
- `gauge.cassandra.ClientRequest.RangeSlice.Latency.50thPercentile` (*gauge*)<br> 50th percentile (median) of Cassandra range slice latency. This value
should be similar across all nodes in the cluster. If some nodes have higher
values than the rest of the cluster then they may have more connected clients
@@ -167,7 +218,7 @@ These are the metrics available for this integration.
the rest of the cluster then they may have more connected clients or may be
experiencing heavier than usual compaction load.

- `gauge.cassandra.ClientRequest.RangeSlice.Latency.Max` (*gauge*)<br> Maximum Cassandra range slice latency
- `gauge.cassandra.ClientRequest.RangeSlice.Latency.Max` (*gauge*)<br> Maximum Cassandra range slice latency.
- ***`gauge.cassandra.ClientRequest.Read.Latency.50thPercentile`*** (*gauge*)<br> 50th percentile (median) of Cassandra read latency. This value should
be similar across all nodes in the cluster. If some nodes have higher
values than the rest of the cluster then they may have more connected
@@ -178,7 +229,7 @@ These are the metrics available for this integration.
the rest of the cluster then they may have more connected clients or
may be experiencing heavier than usual compaction load.

- ***`gauge.cassandra.ClientRequest.Read.Latency.Max`*** (*gauge*)<br> Maximum Cassandra read latency
- ***`gauge.cassandra.ClientRequest.Read.Latency.Max`*** (*gauge*)<br> Maximum Cassandra read latency.
- ***`gauge.cassandra.ClientRequest.Write.Latency.50thPercentile`*** (*gauge*)<br> 50th percentile (median) of Cassandra write latency. This value should
be similar across all nodes in the cluster. If some nodes have higher
values than the rest of the cluster then they may have more connected
@@ -194,22 +245,6 @@ These are the metrics available for this integration.
continually increasing then the node may be experiencing problems
completing compaction operations.

- ***`gauge.cassandra.Storage.Load.Count`*** (*gauge*)<br> Storage used for Cassandra data in bytes. Use this metric to see how much storage is being used for data by a Cassandra node.

The value of this metric is influenced by:
- Total data stored into the database
- compaction behavior

- `gauge.cassandra.Storage.TotalHints.Count` (*gauge*)<br> Total hints since node start. Indicates that write operations cannot be
delivered to a node, usually because a node is down. If this value is
increasing and all nodes are up then there may be some connectivity
issue between nodes in the cluster.

- ***`gauge.cassandra.Storage.TotalHintsInProgress.Count`*** (*gauge*)<br> Total pending hints. Indicates that write operations cannot be
delivered to a node, usually because a node is down. If this value is
increasing and all nodes are up then there may be some connectivity
issue between nodes in the cluster.
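
The `TotalLatency.Count` and `Latency.Count` metrics in the list above are cumulative counters, so dividing the raw values gives the mean latency since server start. A per-interval mean can be estimated from the change in both counters between two samples. The following is a minimal Python sketch of that arithmetic. The function name `average_latency_us` and its parameters are hypothetical and are not part of the Smart Agent or of Cassandra.

```python
from typing import Optional


def average_latency_us(
    prev_total_us: float,
    prev_count: float,
    curr_total_us: float,
    curr_count: float,
) -> Optional[float]:
    """Estimate mean request latency (microseconds) between two samples.

    Each sample is a pair of cumulative counter values:
    a TotalLatency.Count value (total microseconds spent) and the
    matching Latency.Count value (number of operations).
    """
    delta_total = curr_total_us - prev_total_us
    delta_count = curr_count - prev_count
    # No operations in the interval, or a counter went backwards
    # (for example after a restart): the mean is undefined.
    if delta_count <= 0 or delta_total < 0:
        return None
    return delta_total / delta_count
```

For example, if the read total latency counter rises from 1000 to 4000 microseconds while the read count rises from 10 to 20, the estimated mean read latency over that interval is 300 microseconds.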


#### Group jvm
All of the following metrics are part of the `jvm` metric group. All of
@@ -239,15 +274,15 @@ monitors` after configuring this monitor in a running agent instance.

### Legacy non-default metrics (version < 4.7.0)

**The following information only applies to agent version older than 4.7.0. If
**The following information only applies to agent versions prior to 4.7.0. If
you have a newer agent and have set `enableBuiltInFiltering: true` at the top
level of your agent config, see the section above. See upgrade instructions in
[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).**
[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).**

If you have a reference to the `whitelist.json` in your agent's top-level
`metricsToExclude` config option, and you want to emit metrics that are not in
that whitelist, then you need to add an item to the top-level
`metricsToInclude` config option to override that whitelist (see [Inclusion
filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
that allow list, then you need to add an item to the top-level
`metricsToInclude` config option to override that allow list (see [Inclusion
filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering)). Or you can just
copy the whitelist.json, modify it, and reference that in `metricsToExclude`.
