Add ShardBatchCache to support caching for TransportNodesListGatewayStartedShardsBatch #12504
…ch of shards Signed-off-by: Aman Khare <[email protected]>
Compatibility status: Checks if related components are compatible with change 08763ed
Incompatible components
Skipped components
Compatible components
Compatible components: [https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/flow-framework.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/performance-analyzer.git]
❌ Gradle check result for bcbc00a: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❌ Gradle check result for 644d908: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❌ Gradle check result for ba6cbb4: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❌ Gradle check result for d61477f: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Flaky test: #12197
❕ Gradle check result for 65229fc: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
Thanks @amkhar. Too many explicit type conversions and reflection aren't usually the best way to get around structuring classes.
❌ Gradle check result for 08763ed: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
❕ Gradle check result for 08763ed: UNSTABLE
Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.
Flaky test: #5426
7103e56 into opensearch-project:main
…tartedShardsBatch (#12504) Signed-off-by: Aman Khare <[email protected]> (cherry picked from commit 7103e56) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…tartedShardsBatch (#12504) (#13156) (cherry picked from commit 7103e56) Signed-off-by: Aman Khare <[email protected]> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…tartedShardsBatch (opensearch-project#12504) Signed-off-by: Aman Khare <[email protected]> Signed-off-by: Shivansh Arora <[email protected]>
Description
Add a new ShardBatchCache that stores the responses of batch transport actions such as TransportNodesListGatewayStartedShardsBatch, or the equivalent transport action for replica shards being added via #8957.
A new cache class is needed because the storage strategy differs from that of ShardCache (written in #12441).
ShardCache stores the responses of transport actions as-is in the NodeEntry class:
https://github.com/opensearch-project/OpenSearch/pull/12441/files#diff-e2790d7ec0cf48617430d2352e7142c297eaea172fd6e2d34969e60ddf7f9a68R73
https://github.com/amkhar/OpenSearch/blob/3125b948029609f354d3153f8ca6391638daefc7/server/src/main/java/org/opensearch/gateway/AsyncShardFetch.java#L468-L475
But the response of a transport action like TransportNodesListGatewayStartedShardsBatch is a map keyed by ShardId.
Storing this response as-is would repeat the ShardId objects, because the response needs to be stored for every node:
Map<NodeId, Map<ShardId, NodeGatewayStartedShard>>
To solve this, the NodeGatewayStartedShard objects are stored in an array, and a map from ShardId to array index is kept alongside it.
With these two data structures the repetition is avoided, and the array index is used for putting data into and getting data from the cache.
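The layout described above could be sketched roughly as follows. This is an illustration only, not the code in this PR: the class name ShardBatchCacheSketch and its put/get methods are hypothetical, String stands in for both the node id and ShardId, and the type parameter T stands in for NodeGatewayStartedShard.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: names are hypothetical, String stands in for
// ShardId and node id, and T stands in for NodeGatewayStartedShard.
final class ShardBatchCacheSketch<T> {
    // Shared by all nodes: each shard id in the batch gets one fixed array slot.
    private final Map<String, Integer> shardIdToIndex = new HashMap<>();
    // Per node: responses stored by slot, so shard ids are not repeated per node.
    private final Map<String, Object[]> responsesByNode = new HashMap<>();

    ShardBatchCacheSketch(List<String> shardIdsInBatch) {
        for (String shardId : shardIdsInBatch) {
            // The next free slot is the current map size; duplicates keep their first slot.
            shardIdToIndex.putIfAbsent(shardId, shardIdToIndex.size());
        }
    }

    // Store one node's batch response; shards that are not part of the batch are ignored.
    void put(String nodeId, Map<String, T> responseByShard) {
        Object[] slots = responsesByNode.computeIfAbsent(nodeId, id -> new Object[shardIdToIndex.size()]);
        for (Map.Entry<String, T> entry : responseByShard.entrySet()) {
            Integer index = shardIdToIndex.get(entry.getKey());
            if (index != null) {
                slots[index] = entry.getValue();
            }
        }
    }

    // Look up one shard's response for a node, or null if nothing is cached.
    @SuppressWarnings("unchecked")
    T get(String nodeId, String shardId) {
        Object[] slots = responsesByNode.get(nodeId);
        Integer index = shardIdToIndex.get(shardId);
        if (slots == null || index == null) {
            return null;
        }
        return (T) slots[index];
    }
}
```

In this sketch the shard-to-index map exists once per batch, while each node only holds a plain array, so adding a node costs one array rather than another full map keyed by shard.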
Failure handling also differs from the normal ShardCache, because this is a batch of ShardIds. When a ShardId hits an unknown failure, only that shard needs to be retried in the next run of reroute; all other shards of the batch are allocated.
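One way to picture the per-shard failure handling is to split a batch response into shards that can proceed and shards that should be retried. The sketch below is an assumption-laden illustration, not the logic in this PR: BatchFailureSketch, ShardResult, Partition, and partition are hypothetical names, and String stands in for ShardId.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: all names here are hypothetical.
final class BatchFailureSketch {
    // Hypothetical per-shard result: either an allocation id or a failure.
    static final class ShardResult {
        final String allocationId; // null when the fetch failed
        final Exception failure;   // null when the fetch succeeded

        private ShardResult(String allocationId, Exception failure) {
            this.allocationId = allocationId;
            this.failure = failure;
        }

        static ShardResult success(String allocationId) {
            return new ShardResult(allocationId, null);
        }

        static ShardResult failed(Exception failure) {
            return new ShardResult(null, failure);
        }
    }

    // Shards that can be allocated now, and shard ids to fetch again on the next reroute.
    static final class Partition {
        final Map<String, ShardResult> ready = new LinkedHashMap<>();
        final List<String> retry = new ArrayList<>();
    }

    // Split one batch response so that a failed shard does not block the rest of the batch.
    static Partition partition(Map<String, ShardResult> batchResponse) {
        Partition result = new Partition();
        for (Map.Entry<String, ShardResult> entry : batchResponse.entrySet()) {
            if (entry.getValue().failure != null) {
                result.retry.add(entry.getKey());
            } else {
                result.ready.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }
}
```

The point of the split is that one failing shard only delays itself: everything in `ready` can be handed to allocation immediately, and only the ids in `retry` are fetched again.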
Related Issues
Resolves #12248
This PR is dependent on
#12441
#8356
Check List
Commit changes are listed out in CHANGELOG.md file (See: Changelog)
Public documentation issue/PR created
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.