Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Remote Store] Add remote segment transfer stats on NodesStats API #9168

Merged
merged 18 commits into from
Aug 14, 2023

Conversation

shourya035
Copy link
Member

@shourya035 shourya035 commented Aug 8, 2023

Description

Adding remote segment transfer stats on the NodesStats API output. As of now, we are including the following metrics:

  • total_uploads.started_bytes: Amount of segment payload attempted to be uploaded to the remote store
  • total_uploads.succeeded_bytes: Amount of segment payload successfully uploaded to the remote store
  • total_uploads.failed_bytes: Amount of segment payload failed to be uploaded to the remote store
  • max_refresh_time_lag_in_millis: Max refresh lag in milliseconds across all the shards on the node
  • max_refresh_size_lag_in_bytes: Max refresh lag in size across all the shards on the node
  • total_downloads.started_bytes: Amount of segment payload attempted to be downloaded from the remote store
  • total_downloads.succeeded_bytes: Amount of segment payload successfully downloaded from the remote store
  • total_downloads.failed_bytes: Amount of segment payload failed to be downloaded from the remote store

These are added to the indices.segment section of the NodesStats API output

~ ❯ curl --silent 'http://localhost:9200/_nodes/opensearch-node*/stats/indices/segments' | jq
{
  "_nodes": {
    "total": 2,
    "successful": 2,
    "failed": 0
  },
  "cluster_name": "opensearch-cluster",
  "nodes": {
    "yKBw3ByYSd6KZnx_qOb9NA": {
      "timestamp": 1691500774603,
      "name": "opensearch-node2",
      "transport_address": "172.18.0.3:9300",
      "host": "172.18.0.3",
      "ip": "172.18.0.3:9300",
      "roles": [
        "data",
        "ingest",
        "remote_cluster_client"
      ],
      "attributes": {
        "shard_indexing_pressure_enabled": "true"
      },
      "indices": {
        "segments": {
          "count": 3,
          "memory_in_bytes": 0,
          "terms_memory_in_bytes": 0,
          "stored_fields_memory_in_bytes": 0,
          "term_vectors_memory_in_bytes": 0,
          "norms_memory_in_bytes": 0,
          "points_memory_in_bytes": 0,
          "doc_values_memory_in_bytes": 0,
          "index_writer_memory_in_bytes": 0,
          "version_map_memory_in_bytes": 0,
          "fixed_bit_set_memory_in_bytes": 0,
          "max_unsafe_auto_id_timestamp": -9223372036854776000,
          "remote_store": {
            "upload": {
              "total_uploads": {
                "started_bytes": 0,
                "succeeded_bytes": 0,
                "failed_bytes": 0
              },
              "max_refresh_time_lag_in_millis": 0,
              "max_refresh_size_lag_in_bytes": 0
            },
            "download": {
              "total_downloads": {
                "started_bytes": 56529,
                "succeeded_bytes": 56529,
                "failed_bytes": 0
              }
            }
          },
          "file_sizes": {}
        }
      }
    },
    "EDUazBdFTYK9-Wij2HtTiQ": {
      "timestamp": 1691500774602,
      "name": "opensearch-node1",
      "transport_address": "172.18.0.2:9300",
      "host": "172.18.0.2",
      "ip": "172.18.0.2:9300",
      "roles": [
        "data",
        "ingest",
        "remote_cluster_client"
      ],
      "attributes": {
        "shard_indexing_pressure_enabled": "true"
      },
      "indices": {
        "segments": {
          "count": 3,
          "memory_in_bytes": 0,
          "terms_memory_in_bytes": 0,
          "stored_fields_memory_in_bytes": 0,
          "term_vectors_memory_in_bytes": 0,
          "norms_memory_in_bytes": 0,
          "points_memory_in_bytes": 0,
          "doc_values_memory_in_bytes": 0,
          "index_writer_memory_in_bytes": 205452,
          "version_map_memory_in_bytes": 426,
          "fixed_bit_set_memory_in_bytes": 0,
          "max_unsafe_auto_id_timestamp": -1,
          "remote_store": {
            "upload": {
              "total_uploads": {
                "started_bytes": 56759,
                "succeeded_bytes": 56759,
                "failed_bytes": 0
              },
              "max_refresh_time_lag_in_millis": 0,
              "max_refresh_size_lag_in_bytes": 0
            },
            "download": {
              "total_downloads": {
                "started_bytes": 0,
                "succeeded_bytes": 0,
                "failed_bytes": 0
              }
            }
          },
          "file_sizes": {}
        }
      }
    }
  }
}

Added integ test cases to ensure that:

  • These stats are showing up as 0 for nodes without any remote-store backed indices
  • These stats also shows up as 0 for cluster manager nodes

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed per the DCO using --signoff
  • Commit changes are listed out in CHANGELOG.md file (See: Changelog)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Contributor

github-actions bot commented Aug 8, 2023

Gradle Check (Jenkins) Run Completed with:

@opensearch-trigger-bot
Copy link
Contributor

Compatibility status:



> Task :checkCompatibility
Incompatible components: [https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/security-analytics.git]
Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

BUILD SUCCESSFUL in 29m 14s

Signed-off-by: Shourya Dutta Biswas <[email protected]>
@opensearch-trigger-bot
Copy link
Contributor

Compatibility status:



> Task :checkCompatibility
Incompatible components: [https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/security-analytics.git]
Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

BUILD SUCCESSFUL in 29m 32s

@github-actions
Copy link
Contributor

github-actions bot commented Aug 8, 2023

Gradle Check (Jenkins) Run Completed with:

@opensearch-trigger-bot
Copy link
Contributor

Compatibility status:



> Task :checkCompatibility
Incompatible components: [https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer.git, https://github.com/opensearch-project/security-analytics.git]
Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

BUILD SUCCESSFUL in 29m 42s

@github-actions
Copy link
Contributor

github-actions bot commented Aug 8, 2023

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

github-actions bot commented Aug 9, 2023

Gradle Check (Jenkins) Run Completed with:

@opensearch-trigger-bot
Copy link
Contributor

Compatibility status:



> Task :checkCompatibility
Incompatible components: [https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer.git]
Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

BUILD SUCCESSFUL in 33m

@opensearch-trigger-bot
Copy link
Contributor

Compatibility status:



> Task :checkCompatibility
Incompatible components: [https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer.git]
Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

BUILD SUCCESSFUL in 27m 27s

@github-actions
Copy link
Contributor

github-actions bot commented Aug 9, 2023

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Shourya Dutta Biswas <[email protected]>
@github-actions
Copy link
Contributor

github-actions bot commented Aug 9, 2023

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Shourya Dutta Biswas <[email protected]>
@opensearch-trigger-bot
Copy link
Contributor

Compatibility status:



> Task :checkCompatibility
Incompatible components: [https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer.git]
Compatible components: [https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/reporting.git]

BUILD SUCCESSFUL in 27m 47s

@opensearch-trigger-bot
Copy link
Contributor

Compatibility status:



> Task :checkCompatibility
Incompatible components: [https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer.git]
Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

BUILD SUCCESSFUL in 31m 54s

@github-actions
Copy link
Contributor

github-actions bot commented Aug 9, 2023

Gradle Check (Jenkins) Run Completed with:

@opensearch-trigger-bot
Copy link
Contributor

Compatibility status:



> Task :checkCompatibility
Incompatible components: [https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/performance-analyzer.git]
Compatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

BUILD SUCCESSFUL in 28m 18s

@github-actions
Copy link
Contributor

github-actions bot commented Aug 9, 2023

Gradle Check (Jenkins) Run Completed with:

@codecov
Copy link

codecov bot commented Aug 9, 2023

Codecov Report

Merging #9168 (ef0f6e2) into main (1162865) will increase coverage by 0.04%.
Report is 16 commits behind head on main.
The diff coverage is 93.85%.

@@             Coverage Diff              @@
##               main    #9168      +/-   ##
============================================
+ Coverage     71.03%   71.07%   +0.04%     
- Complexity    57318    57404      +86     
============================================
  Files          4770     4771       +1     
  Lines        270540   270623      +83     
  Branches      39561    39561              
============================================
+ Hits         192186   192356     +170     
+ Misses        62147    62081      -66     
+ Partials      16207    16186      -21     
Files Changed Coverage Δ
...rc/main/java/org/opensearch/index/store/Store.java 80.25% <ø> (+0.96%) ⬆️
...ch/indices/recovery/PeerRecoveryTargetService.java 52.30% <0.00%> (-5.27%) ⬇️
...ava/org/opensearch/index/engine/SegmentsStats.java 72.44% <72.72%> (-1.70%) ⬇️
...in/java/org/opensearch/index/shard/IndexShard.java 69.79% <88.88%> (+0.71%) ⬆️
...rg/opensearch/index/engine/ReplicaFileTracker.java 91.66% <91.66%> (ø)
...rg/opensearch/index/remote/RemoteSegmentStats.java 97.61% <97.61%> (ø)
.../opensearch/index/engine/NRTReplicationEngine.java 74.37% <100.00%> (+0.48%) ⬆️
...earch/index/store/RemoteSegmentStoreDirectory.java 88.85% <100.00%> (+0.98%) ⬆️
...ices/replication/RemoteStoreReplicationSource.java 90.90% <100.00%> (+0.16%) ⬆️
.../indices/replication/SegmentReplicationTarget.java 89.00% <100.00%> (-0.72%) ⬇️

... and 469 files with indirect coverage changes

Signed-off-by: Shourya Dutta Biswas <[email protected]>
@shourya035 shourya035 changed the title [Remote Store] Remote segment transfer stats on NodesStats API [Remote Store] Add remote segment transfer stats on NodesStats API Aug 9, 2023
Signed-off-by: Shourya Dutta Biswas <[email protected]>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

Signed-off-by: Shourya Dutta Biswas <[email protected]>
Signed-off-by: Shourya Dutta Biswas <[email protected]>
@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@opensearch-trigger-bot
Copy link
Contributor

Compatibility status:



> Task :checkCompatibility
Incompatible components: [https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/performance-analyzer.git]
Compatible components: [https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git]

BUILD SUCCESSFUL in 31m 18s

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@sachinpkale
Copy link
Member

Overall changes LGTM.

@sachinpkale
Copy link
Member

Build is failing due to flaky test: #9116
Re-triggered.

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

  • RESULT: UNSTABLE ❕
  • TEST FAILURES:
      1 org.opensearch.repositories.azure.AzureBlobContainerRetriesTests.testWriteBlobWithRetries

@sachinpkale sachinpkale merged commit c9dbd90 into opensearch-project:main Aug 14, 2023
@shourya035 shourya035 deleted the stats-rollup branch August 14, 2023 10:55
shourya035 added a commit to shourya035/OpenSearch that referenced this pull request Aug 14, 2023
@@ -1377,6 +1384,12 @@ public MergeStats mergeStats() {
public SegmentsStats segmentStats(boolean includeSegmentFileSizes, boolean includeUnloadedSegments) {
SegmentsStats segmentsStats = getEngine().segmentsStats(includeSegmentFileSizes, includeUnloadedSegments);
segmentsStats.addBitsetMemoryInBytes(shardBitsetFilterCache.getMemorySizeInBytes());
// Populate remote_store stats only if the index is remote store backed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are doing an add here and one explicit add in NoOpEngine.java class -


This seem to be adding same stats twice.

downloadBytesFailed = in.readLong();
downloadBytesSucceeded = in.readLong();
maxRefreshTimeLag = in.readLong();
maxRefreshBytesLag = in.readLong();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can this be cumulative refresh bytes lag itself?

Copy link
Member

@ashking94 ashking94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left some comments. do take a look.

shourya035 added a commit to shourya035/OpenSearch that referenced this pull request Aug 16, 2023
kaushalmahi12 pushed a commit to kaushalmahi12/OpenSearch that referenced this pull request Sep 12, 2023
brusic pushed a commit to brusic/OpenSearch that referenced this pull request Sep 25, 2023
shiv0408 pushed a commit to Gaurav614/OpenSearch that referenced this pull request Apr 25, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants