Merge to master #1044

Merged
merged 31 commits into from
Jan 8, 2024
141b42d
Merge main to release. Attempt 4 (#988)
mfoulds-splunk Oct 19, 2022
f9c23cf
jenkins: Release Smart Agent docs for release v4.21.1
Feb 7, 2023
26dcdea
OD4680 More Org metrics (#1022)
aurbiztondo-splunk May 29, 2023
d6730e9
Org metrics update - batch D (#1024)
aurbiztondo-splunk Jun 1, 2023
04e928b
Removing Reserved note in Org metrics (#1026)
aurbiztondo-splunk Jun 7, 2023
7f6ab20
Update metrics.yaml (#1027)
aurbiztondo-splunk Jun 7, 2023
b71440d
OD3972-dpm-faqs (#1028)
aurbiztondo-splunk Jun 27, 2023
4f05f45
update java lib (#1018)
breedx-splk Aug 8, 2023
ffd646f
Add missing org metrics per feedback (#1029)
trangl-splunk Aug 22, 2023
1bd5ef0
Resolve merge conflicts
trangl-splunk Aug 22, 2023
5b15a44
Merge changes in main to release (#1002)
bhillmer Aug 22, 2023
69c119c
Merge changes in main to release (#1002) (#1031)
trangl-splunk Aug 22, 2023
4bb258c
Fix link formatting
trangl-splunk Aug 22, 2023
a9de753
Resolve merge conflicts
trangl-splunk Aug 22, 2023
1064060
Fix link formatting (#1032)
trangl-splunk Aug 22, 2023
ecfc118
Fix link formatting (#1032) (#1033)
trangl-splunk Aug 22, 2023
4845ee1
[OD5656]: Remove Azure/GCP deprecated metrics (#1034)
aurbiztondo-splunk Sep 5, 2023
871124a
Remove duplicate metric (#1035)
mbechtold-splunk Sep 8, 2023
b6fed5a
Fix indentations for formatting
trangl-splunk Sep 18, 2023
f0073ba
Fix typos
trangl-splunk Sep 18, 2023
483036e
Update release (#1036)
trangl-splunk Sep 18, 2023
7448df4
Release (#1037)
trangl-splunk Sep 18, 2023
a43c6b3
Fix indentations for formatting
trangl-splunk Sep 18, 2023
a67429c
Fix indentations for formatting
trangl-splunk Sep 18, 2023
dde23fe
Fix typos
trangl-splunk Sep 18, 2023
0d0c1d8
Resolve merge conflicts
trangl-splunk Sep 18, 2023
6466edf
OD5829: Minor fix to APM bundled metrics (#1040)
aurbiztondo-splunk Nov 10, 2023
3ed0311
Revert "OD5829: Minor fix to APM bundled metrics (#1040)" (#1041)
aurbiztondo-splunk Nov 10, 2023
47de669
OD5829-apm-fix (#1042)
aurbiztondo-splunk Nov 13, 2023
30f8149
Added two org metrics (#1043)
aurbiztondo-splunk Jan 8, 2024
011dc0f
Merge branch 'master' into main
aurbiztondo-splunk Jan 8, 2024
16 changes: 8 additions & 8 deletions activemq/SMART_AGENT_MONITOR.md
@@ -12,8 +12,8 @@ configuration instructions below.

## Description

This integration primarily consists of the Smart Agent monitor `collectd/activemq`.
Below is an overview of that monitor.
**This integration primarily consists of the Smart Agent monitor `collectd/activemq`.
Below is an overview of that monitor.**

### Smart Agent Monitor

@@ -55,7 +55,7 @@ monitors: # All monitor config goes under this key
```

**For a list of monitor options that are common to all monitors, see [Common
Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).**
Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).**


| Config option | Required | Type | Description |
@@ -166,15 +166,15 @@ monitors` after configuring this monitor in a running agent instance.

### Legacy non-default metrics (version < 4.7.0)

**The following information only applies to agent versions prior to 4.7.0. If
**The following information only applies to agent version older than 4.7.0. If
you have a newer agent and have set `enableBuiltInFiltering: true` at the top
level of your agent config, see the section above. See upgrade instructions in
[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).**
[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).**

If you have a reference to the `whitelist.json` in your agent's top-level
`metricsToExclude` config option, and you want to emit metrics that are not in
that allow list, then you need to add an item to the top-level
`metricsToInclude` config option to override that allow list (see [Inclusion
filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
that whitelist, then you need to add an item to the top-level
`metricsToInclude` config option to override that whitelist (see [Inclusion
filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
copy the whitelist.json, modify it, and reference that in `metricsToExclude`.

91 changes: 28 additions & 63 deletions cassandra/SMART_AGENT_MONITOR.md
@@ -12,15 +12,15 @@ configuration instructions below.

## Description

This integration primarily consists of the Smart Agent monitor `collectd/cassandra`.
Below is an overview of that monitor.
**This integration primarily consists of the Smart Agent monitor `collectd/cassandra`.
Below is an overview of that monitor.**

### Smart Agent Monitor


Monitors Cassandra using the Collectd GenericJMX plugin. This is
essentially a wrapper around the
[collectd-genericjmx](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/./collectd-genericjmx.md)[](sfx_link:java) monitor that comes with a
[collectd-genericjmx](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/./collectd-genericjmx.md)[](sfx_link:java) monitor that comes with a
set of predefined MBean definitions that a standard Cassandra deployment
will expose.

@@ -47,7 +47,7 @@ monitors: # All monitor config goes under this key
```

**For a list of monitor options that are common to all monitors, see [Common
Configuration](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../monitor-config.md#common-configuration).**
Configuration](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../monitor-config.md#common-configuration).**


| Config option | Required | Type | Description |
@@ -97,18 +97,6 @@ Metrics that are categorized as

These are the metrics available for this integration.

- `counter.cassandra.ClientRequest.CASRead.Latency.Count` (*cumulative*)<br> Count of transactional read operations since server start.
- `counter.cassandra.ClientRequest.CASRead.TotalLatency.Count` (*cumulative*)<br> The total number of microseconds elapsed in servicing client transactional read requests.

It can be devided by `counter.cassandra.ClientRequest.CASRead.Latency.Count`
to find the real time transactional read latency.

- `counter.cassandra.ClientRequest.CASWrite.Latency.Count` (*cumulative*)<br> Count of transactional write operations since server start.
- `counter.cassandra.ClientRequest.CASWrite.TotalLatency.Count` (*cumulative*)<br> The total number of microseconds elapsed in servicing client transactional write requests.

It can be devided by `counter.cassandra.ClientRequest.CASWrite.Latency.Count`
to find the real time transactional write latency.

- ***`counter.cassandra.ClientRequest.RangeSlice.Latency.Count`*** (*cumulative*)<br> Count of range slice operations since server start. This typically indicates a server overload condition.

If this value is increasing across the cluster then the cluster is too small for the application range slice load.
@@ -126,15 +114,14 @@ These are the metrics available for this integration.
- one or more clients are directing more load to this server than the others
- the server is experiencing hardware or software issues and may require maintenance.

- `counter.cassandra.ClientRequest.RangeSlice.TotalLatency.Count` (*cumulative*)<br> The total number of microseconds elapsed in servicing range slice requests.
- ***`counter.cassandra.ClientRequest.RangeSlice.Unavailables.Count`*** (*cumulative*)<br> Count of range slice unavailables since server start. A non-zero value
means that insufficient replicas were available to fulfil a range slice
request at the requested consistency level.

This typically means that one or more nodes are down. To fix this condition,
any down nodes must be restarted, or removed from the cluster.

- ***`counter.cassandra.ClientRequest.Read.Latency.Count`*** (*cumulative*)<br> Count of read operations since server start.
- ***`counter.cassandra.ClientRequest.Read.Latency.Count`*** (*cumulative*)<br> Count of read operations since server start
- ***`counter.cassandra.ClientRequest.Read.Timeouts.Count`*** (*cumulative*)<br> Count of read timeouts since server start. This typically indicates a server overload condition.

If this value is increasing across the cluster then the cluster is too small for the application read load.
@@ -143,11 +130,6 @@ These are the metrics available for this integration.
- one or more clients are directing more load to this server than the others
- the server is experiencing hardware or software issues and may require maintenance.

- `counter.cassandra.ClientRequest.Read.TotalLatency.Count` (*cumulative*)<br> The total number of microseconds elapsed in servicing client read requests.

It can be devided by `counter.cassandra.ClientRequest.Read.Latency.Count`
to find the real time read latency.

- ***`counter.cassandra.ClientRequest.Read.Unavailables.Count`*** (*cumulative*)<br> Count of read unavailables since server start. A non-zero value means
that insufficient replicas were available to fulfil a read request at
the requested consistency level. This typically means that one or more
@@ -163,11 +145,6 @@ These are the metrics available for this integration.
- one or more clients are directing more load to this server than the others
- the server is experiencing hardware or software issues and may require maintenance.

- `counter.cassandra.ClientRequest.Write.TotalLatency.Count` (*cumulative*)<br> The total number of microseconds elapsed in servicing client write requests.

It can be devided by `counter.cassandra.ClientRequest.Write.Latency.Count`
to find the real time write latency.

- ***`counter.cassandra.ClientRequest.Write.Unavailables.Count`*** (*cumulative*)<br> Count of write unavailables since server start. A non-zero value means
that insufficient replicas were available to fulfil a write request at
the requested consistency level.
@@ -180,34 +157,6 @@ These are the metrics available for this integration.
not increase steadily over time then the node may be experiencing
problems completing compaction operations.

- `counter.cassandra.Storage.Exceptions.Count` (*cumulative*)<br> Number of internal exceptions caught. Under normal exceptions this should be zero.

- ***`counter.cassandra.Storage.Load.Count`*** (*cumulative*)<br> Storage used for Cassandra data in bytes. Use this metric to see how much storage is being used for data by a Cassandra node.

The value of this metric is influenced by:
- Total data stored into the database
- compaction behavior

- `counter.cassandra.Storage.TotalHints.Count` (*cumulative*)<br> Total hints since node start. Indicates that write operations cannot be
delivered to a node, usually because a node is down. If this value is
increasing and all nodes are up then there may be some connectivity
issue between nodes in the cluster.

- ***`counter.cassandra.Storage.TotalHintsInProgress.Count`*** (*cumulative*)<br> Total pending hints. Indicates that write operations cannot be
delivered to a node, usually because a node is down. If this value is
increasing and all nodes are up then there may be some connectivity
issue between nodes in the cluster.

- `gauge.cassandra.ClientRequest.CASRead.Latency.50thPercentile` (*gauge*)<br> 50th percentile (median) of Cassandra transactional read latency.

- `gauge.cassandra.ClientRequest.CASRead.Latency.99thPercentile` (*gauge*)<br> 99th percentile of Cassandra transactional read latency.

- `gauge.cassandra.ClientRequest.CASRead.Latency.Max` (*gauge*)<br> Maximum Cassandra transactional read latency.
- `gauge.cassandra.ClientRequest.CASWrite.Latency.50thPercentile` (*gauge*)<br> 50th percentile (median) of Cassandra transactional write latency.

- `gauge.cassandra.ClientRequest.CASWrite.Latency.99thPercentile` (*gauge*)<br> 99th percentile of Cassandra transactional write latency.

- `gauge.cassandra.ClientRequest.CASWrite.Latency.Max` (*gauge*)<br> Maximum Cassandra transactional write latency.
- `gauge.cassandra.ClientRequest.RangeSlice.Latency.50thPercentile` (*gauge*)<br> 50th percentile (median) of Cassandra range slice latency. This value
should be similar across all nodes in the cluster. If some nodes have higher
values than the rest of the cluster then they may have more connected clients
@@ -218,7 +167,7 @@ These are the metrics available for this integration.
the rest of the cluster then they may have more connected clients or may be
experiencing heavier than usual compaction load.

- `gauge.cassandra.ClientRequest.RangeSlice.Latency.Max` (*gauge*)<br> Maximum Cassandra range slice latency.
- `gauge.cassandra.ClientRequest.RangeSlice.Latency.Max` (*gauge*)<br> Maximum Cassandra range slice latency
- ***`gauge.cassandra.ClientRequest.Read.Latency.50thPercentile`*** (*gauge*)<br> 50th percentile (median) of Cassandra read latency. This value should
be similar across all nodes in the cluster. If some nodes have higher
values than the rest of the cluster then they may have more connected
@@ -229,7 +178,7 @@ These are the metrics available for this integration.
the rest of the cluster then they may have more connected clients or
may be experiencing heavier than usual compaction load.

- ***`gauge.cassandra.ClientRequest.Read.Latency.Max`*** (*gauge*)<br> Maximum Cassandra read latency.
- ***`gauge.cassandra.ClientRequest.Read.Latency.Max`*** (*gauge*)<br> Maximum Cassandra read latency
- ***`gauge.cassandra.ClientRequest.Write.Latency.50thPercentile`*** (*gauge*)<br> 50th percentile (median) of Cassandra write latency. This value should
be similar across all nodes in the cluster. If some nodes have higher
values than the rest of the cluster then they may have more connected
@@ -245,6 +194,22 @@ These are the metrics available for this integration.
continually increasing then the node may be experiencing problems
completing compaction operations.

- ***`gauge.cassandra.Storage.Load.Count`*** (*gauge*)<br> Storage used for Cassandra data in bytes. Use this metric to see how much storage is being used for data by a Cassandra node.

The value of this metric is influenced by:
- Total data stored into the database
- compaction behavior

- `gauge.cassandra.Storage.TotalHints.Count` (*gauge*)<br> Total hints since node start. Indicates that write operations cannot be
delivered to a node, usually because a node is down. If this value is
increasing and all nodes are up then there may be some connectivity
issue between nodes in the cluster.

- ***`gauge.cassandra.Storage.TotalHintsInProgress.Count`*** (*gauge*)<br> Total pending hints. Indicates that write operations cannot be
delivered to a node, usually because a node is down. If this value is
increasing and all nodes are up then there may be some connectivity
issue between nodes in the cluster.


#### Group jvm
All of the following metrics are part of the `jvm` metric group. All of
@@ -274,15 +239,15 @@ monitors` after configuring this monitor in a running agent instance.

### Legacy non-default metrics (version < 4.7.0)

**The following information only applies to agent versions prior to 4.7.0. If
**The following information only applies to agent version older than 4.7.0. If
you have a newer agent and have set `enableBuiltInFiltering: true` at the top
level of your agent config, see the section above. See upgrade instructions in
[Old-style inclusion list filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#old-style-inclusion-list-filtering).**
[Old-style whitelist filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#old-style-whitelist-filtering).**

If you have a reference to the `whitelist.json` in your agent's top-level
`metricsToExclude` config option, and you want to emit metrics that are not in
that allow list, then you need to add an item to the top-level
`metricsToInclude` config option to override that allow list (see [Inclusion
filtering](https://github.com/signalfx/signalfx-agent/tree/main/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
that whitelist, then you need to add an item to the top-level
`metricsToInclude` config option to override that whitelist (see [Inclusion
filtering](https://github.com/signalfx/signalfx-agent/tree/master/docs/monitors/../legacy-filtering.md#inclusion-filtering). Or you can just
copy the whitelist.json, modify it, and reference that in `metricsToExclude`.
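
The `TotalLatency.Count` descriptions removed in the cassandra diff above say the total can be divided by the matching `Latency.Count` to obtain a latency figure. Because both counters are cumulative, one reading of that (an assumption, not stated in the source) is to divide the change in each counter between two samples, which would give the mean latency over that interval in microseconds. A minimal Python sketch of that arithmetic; the function name and the sample values are hypothetical:

```python
from typing import Optional


def mean_latency_us(
    total_latency_prev: float,
    total_latency_curr: float,
    count_prev: float,
    count_curr: float,
) -> Optional[float]:
    """Mean latency (microseconds) between two samples of cumulative counters.

    total_latency_* are readings of a ...TotalLatency.Count metric and
    count_* are readings of the matching ...Latency.Count metric.
    Returns None when no operations completed in the interval or when a
    counter went backwards (for example after a server restart).
    """
    ops = count_curr - count_prev
    elapsed_us = total_latency_curr - total_latency_prev
    if ops <= 0 or elapsed_us < 0:
        return None
    return elapsed_us / ops


# Hypothetical readings: 10 more reads completed, taking 3000 microseconds in total.
print(mean_latency_us(1000, 4000, 10, 20))
```

Dividing the raw totals instead of the deltas would give the average since server start, which changes slowly and hides recent spikes.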
