Revert "Merge to master" #1045

Merged 1 commit on Jan 9, 2024.
16 changes: 8 additions & 8 deletions activemq/SMART_AGENT_MONITOR.md
@@ -12,8 +12,8 @@ configuration instructions below.

## Description

**This integration primarily consists of the Smart Agent monitor `collectd/activemq`.
Below is an overview of that monitor.**
This integration primarily consists of the Smart Agent monitor `collectd/activemq`.
Below is an overview of that monitor.

### Smart Agent Monitor

@@ -55,7 +55,7 @@ monitors: # All monitor config goes under this key
```

**For a list of monitor options that are common to all monitors, see [Common
Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).**
Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).**


| Config option | Required | Type | Description |
@@ -166,15 +166,15 @@ monitors` after configuring this monitor in a running agent instance.

### Legacy non-default metrics (version < 4.7.0)

**The following information only applies to agent version older than 4.7.0. If
**The following information only applies to agent versions prior to 4.7.0. If
you have a newer agent and have set `enableBuiltInFiltering: true` at the top
level of your agent config, see the section above. See upgrade instructions in
[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).**
[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).**

If you have a reference to the `whitelist.json` in your agent's top-level
`metricsToExclude` config option, and you want to emit metrics that are not in
that whitelist, then you need to add an item to the top-level
`metricsToInclude` config option to override that whitelist (see [Inclusion
filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
that allow list, then you need to add an item to the top-level
`metricsToInclude` config option to override that allow list (see [Inclusion
filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
copy the whitelist.json, modify it, and reference that in `metricsToExclude`.

91 changes: 63 additions & 28 deletions cassandra/SMART_AGENT_MONITOR.md
@@ -12,15 +12,15 @@ configuration instructions below.

## Description

**This integration primarily consists of the Smart Agent monitor `collectd/cassandra`.
Below is an overview of that monitor.**
This integration primarily consists of the Smart Agent monitor `collectd/cassandra`.
Below is an overview of that monitor.

### Smart Agent Monitor


Monitors Cassandra using the Collectd GenericJMX plugin. This is
essentially a wrapper around the
[collectd-genericjmx](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/./collectd-genericjmx.md)[](sfx_link:java) monitor that comes with a
[collectd-genericjmx](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/./collectd-genericjmx.md)[](sfx_link:java) monitor that comes with a
set of predefined MBean definitions that a standard Cassandra deployment
will expose.

@@ -47,7 +47,7 @@ monitors: # All monitor config goes under this key
```

**For a list of monitor options that are common to all monitors, see [Common
Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).**
Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).**


| Config option | Required | Type | Description |
@@ -97,6 +97,18 @@ Metrics that are categorized as

These are the metrics available for this integration.

- `counter.cassandra.ClientRequest.CASRead.Latency.Count` (*cumulative*)<br> Count of transactional read operations since server start.
- `counter.cassandra.ClientRequest.CASRead.TotalLatency.Count` (*cumulative*)<br> The total number of microseconds elapsed in servicing client transactional read requests.

It can be divided by `counter.cassandra.ClientRequest.CASRead.Latency.Count`
to find the real time transactional read latency.

- `counter.cassandra.ClientRequest.CASWrite.Latency.Count` (*cumulative*)<br> Count of transactional write operations since server start.
- `counter.cassandra.ClientRequest.CASWrite.TotalLatency.Count` (*cumulative*)<br> The total number of microseconds elapsed in servicing client transactional write requests.

It can be divided by `counter.cassandra.ClientRequest.CASWrite.Latency.Count`
to find the real time transactional write latency.

- ***`counter.cassandra.ClientRequest.RangeSlice.Latency.Count`*** (*cumulative*)<br> Count of range slice operations since server start. This typically indicates a server overload condition.

If this value is increasing across the cluster then the cluster is too small for the application range slice load.
@@ -114,14 +126,15 @@ These are the metrics available for this integration.
- one or more clients are directing more load to this server than the others
- the server is experiencing hardware or software issues and may require maintenance.

- `counter.cassandra.ClientRequest.RangeSlice.TotalLatency.Count` (*cumulative*)<br> The total number of microseconds elapsed in servicing range slice requests.
- ***`counter.cassandra.ClientRequest.RangeSlice.Unavailables.Count`*** (*cumulative*)<br> Count of range slice unavailables since server start. A non-zero value
means that insufficient replicas were available to fulfil a range slice
request at the requested consistency level.

This typically means that one or more nodes are down. To fix this condition,
any down nodes must be restarted, or removed from the cluster.

- ***`counter.cassandra.ClientRequest.Read.Latency.Count`*** (*cumulative*)<br> Count of read operations since server start
- ***`counter.cassandra.ClientRequest.Read.Latency.Count`*** (*cumulative*)<br> Count of read operations since server start.
- ***`counter.cassandra.ClientRequest.Read.Timeouts.Count`*** (*cumulative*)<br> Count of read timeouts since server start. This typically indicates a server overload condition.

If this value is increasing across the cluster then the cluster is too small for the application read load.
@@ -130,6 +143,11 @@ These are the metrics available for this integration.
- one or more clients are directing more load to this server than the others
- the server is experiencing hardware or software issues and may require maintenance.

- `counter.cassandra.ClientRequest.Read.TotalLatency.Count` (*cumulative*)<br> The total number of microseconds elapsed in servicing client read requests.

It can be divided by `counter.cassandra.ClientRequest.Read.Latency.Count`
to find the real time read latency.

- ***`counter.cassandra.ClientRequest.Read.Unavailables.Count`*** (*cumulative*)<br> Count of read unavailables since server start. A non-zero value means
that insufficient replicas were available to fulfil a read request at
the requested consistency level. This typically means that one or more
@@ -145,6 +163,11 @@ These are the metrics available for this integration.
- one or more clients are directing more load to this server than the others
- the server is experiencing hardware or software issues and may require maintenance.

- `counter.cassandra.ClientRequest.Write.TotalLatency.Count` (*cumulative*)<br> The total number of microseconds elapsed in servicing client write requests.

It can be divided by `counter.cassandra.ClientRequest.Write.Latency.Count`
to find the real time write latency.

- ***`counter.cassandra.ClientRequest.Write.Unavailables.Count`*** (*cumulative*)<br> Count of write unavailables since server start. A non-zero value means
that insufficient replicas were available to fulfil a write request at
the requested consistency level.
@@ -157,6 +180,34 @@ These are the metrics available for this integration.
not increase steadily over time then the node may be experiencing
problems completing compaction operations.

- `counter.cassandra.Storage.Exceptions.Count` (*cumulative*)<br> Number of internal exceptions caught. Under normal conditions this should be zero.

- ***`counter.cassandra.Storage.Load.Count`*** (*cumulative*)<br> Storage used for Cassandra data in bytes. Use this metric to see how much storage is being used for data by a Cassandra node.

The value of this metric is influenced by:
- Total data stored into the database
- compaction behavior

- `counter.cassandra.Storage.TotalHints.Count` (*cumulative*)<br> Total hints since node start. Indicates that write operations cannot be
delivered to a node, usually because a node is down. If this value is
increasing and all nodes are up then there may be some connectivity
issue between nodes in the cluster.

- ***`counter.cassandra.Storage.TotalHintsInProgress.Count`*** (*cumulative*)<br> Total pending hints. Indicates that write operations cannot be
delivered to a node, usually because a node is down. If this value is
increasing and all nodes are up then there may be some connectivity
issue between nodes in the cluster.

- `gauge.cassandra.ClientRequest.CASRead.Latency.50thPercentile` (*gauge*)<br> 50th percentile (median) of Cassandra transactional read latency.

- `gauge.cassandra.ClientRequest.CASRead.Latency.99thPercentile` (*gauge*)<br> 99th percentile of Cassandra transactional read latency.

- `gauge.cassandra.ClientRequest.CASRead.Latency.Max` (*gauge*)<br> Maximum Cassandra transactional read latency.
- `gauge.cassandra.ClientRequest.CASWrite.Latency.50thPercentile` (*gauge*)<br> 50th percentile (median) of Cassandra transactional write latency.

- `gauge.cassandra.ClientRequest.CASWrite.Latency.99thPercentile` (*gauge*)<br> 99th percentile of Cassandra transactional write latency.

- `gauge.cassandra.ClientRequest.CASWrite.Latency.Max` (*gauge*)<br> Maximum Cassandra transactional write latency.
- `gauge.cassandra.ClientRequest.RangeSlice.Latency.50thPercentile` (*gauge*)<br> 50th percentile (median) of Cassandra range slice latency. This value
should be similar across all nodes in the cluster. If some nodes have higher
values than the rest of the cluster then they may have more connected clients
@@ -167,7 +218,7 @@ These are the metrics available for this integration.
the rest of the cluster then they may have more connected clients or may be
experiencing heavier than usual compaction load.

- `gauge.cassandra.ClientRequest.RangeSlice.Latency.Max` (*gauge*)<br> Maximum Cassandra range slice latency
- `gauge.cassandra.ClientRequest.RangeSlice.Latency.Max` (*gauge*)<br> Maximum Cassandra range slice latency.
- ***`gauge.cassandra.ClientRequest.Read.Latency.50thPercentile`*** (*gauge*)<br> 50th percentile (median) of Cassandra read latency. This value should
be similar across all nodes in the cluster. If some nodes have higher
values than the rest of the cluster then they may have more connected
@@ -178,7 +229,7 @@ These are the metrics available for this integration.
the rest of the cluster then they may have more connected clients or
may be experiencing heavier than usual compaction load.

- ***`gauge.cassandra.ClientRequest.Read.Latency.Max`*** (*gauge*)<br> Maximum Cassandra read latency
- ***`gauge.cassandra.ClientRequest.Read.Latency.Max`*** (*gauge*)<br> Maximum Cassandra read latency.
- ***`gauge.cassandra.ClientRequest.Write.Latency.50thPercentile`*** (*gauge*)<br> 50th percentile (median) of Cassandra write latency. This value should
be similar across all nodes in the cluster. If some nodes have higher
values than the rest of the cluster then they may have more connected
@@ -194,22 +245,6 @@ These are the metrics available for this integration.
continually increasing then the node may be experiencing problems
completing compaction operations.

- ***`gauge.cassandra.Storage.Load.Count`*** (*gauge*)<br> Storage used for Cassandra data in bytes. Use this metric to see how much storage is being used for data by a Cassandra node.

The value of this metric is influenced by:
- Total data stored into the database
- compaction behavior

- `gauge.cassandra.Storage.TotalHints.Count` (*gauge*)<br> Total hints since node start. Indicates that write operations cannot be
delivered to a node, usually because a node is down. If this value is
increasing and all nodes are up then there may be some connectivity
issue between nodes in the cluster.

- ***`gauge.cassandra.Storage.TotalHintsInProgress.Count`*** (*gauge*)<br> Total pending hints. Indicates that write operations cannot be
delivered to a node, usually because a node is down. If this value is
increasing and all nodes are up then there may be some connectivity
issue between nodes in the cluster.


#### Group jvm
All of the following metrics are part of the `jvm` metric group. All of
@@ -239,15 +274,15 @@ monitors` after configuring this monitor in a running agent instance.

### Legacy non-default metrics (version < 4.7.0)

**The following information only applies to agent version older than 4.7.0. If
**The following information only applies to agent versions prior to 4.7.0. If
you have a newer agent and have set `enableBuiltInFiltering: true` at the top
level of your agent config, see the section above. See upgrade instructions in
[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).**
[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).**

If you have a reference to the `whitelist.json` in your agent's top-level
`metricsToExclude` config option, and you want to emit metrics that are not in
that whitelist, then you need to add an item to the top-level
`metricsToInclude` config option to override that whitelist (see [Inclusion
filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
that allow list, then you need to add an item to the top-level
`metricsToInclude` config option to override that allow list (see [Inclusion
filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
copy the whitelist.json, modify it, and reference that in `metricsToExclude`.
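
The Cassandra metric descriptions added in this diff say that each `TotalLatency.Count` counter can be divided by the matching `Latency.Count` counter to get a latency figure. Because both counters are cumulative, one way to get a per-interval average is to divide the change in each between two samples. The following is a minimal sketch of that arithmetic, not part of the agent. The function name, the dict-shaped samples, and the choice to use deltas rather than raw totals are all assumptions made for illustration.

```python
from typing import Mapping, Optional

# Metric names as they appear in the cassandra monitor docs above.
TOTAL = "counter.cassandra.ClientRequest.Read.TotalLatency.Count"
COUNT = "counter.cassandra.ClientRequest.Read.Latency.Count"


def mean_latency_us(prev: Mapping[str, float],
                    curr: Mapping[str, float]) -> Optional[float]:
    """Mean read latency (microseconds per request) between two samples.

    `prev` and `curr` are hypothetical snapshots mapping metric name to
    the cumulative counter value at two points in time.
    """
    ops = curr[COUNT] - prev[COUNT]
    if ops <= 0:
        # No requests in the interval, or the counters were reset
        # (for example by a server restart); no meaningful average.
        return None
    return (curr[TOTAL] - prev[TOTAL]) / ops


if __name__ == "__main__":
    earlier = {TOTAL: 1000, COUNT: 10}
    later = {TOTAL: 4000, COUNT: 20}
    print(mean_latency_us(earlier, later))
```

The same calculation would apply to the `CASRead`, `CASWrite`, and `Write` counter pairs by substituting the metric names.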
