-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allocation service changes for batch assignment #8888
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Gaurav Chandani <[email protected]>
Gradle Check (Jenkins) Run Completed with:
|
This PR is stalled because it has been open for 30 days with no activity. Remove stalled label or comment or this will be closed in 7 days. |
This PR was closed because it has been stalled for 7 days with no activity. |
Signed-off-by: Gaurav Chandani <[email protected]>
Need to revive this PR. |
Compatibility status:Checks if related components are compatible with change d3b1140 Incompatible componentsSkipped componentsCompatible components |
❌ Gradle check result for eecf79f: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for eecf79f: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Gaurav Chandani <[email protected]>
❌ Gradle check result for c2be078: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
@@ -114,6 +118,22 @@ public AllocationService( | |||
this.shardsAllocator = shardsAllocator; | |||
this.clusterInfoService = clusterInfoService; | |||
this.snapshotsInfoService = snapshotsInfoService; | |||
this.settings = Settings.EMPTY; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can use constructor chaining for Line 117 - 121:
this(allocationDeciders,
shardsAllocator,
clusterInfoService,
snapshotsInfoService,
Settings.EMPTY);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
@@ -60,6 +60,13 @@ public interface ExistingShardsAllocator { | |||
Setting.Property.PrivateIndex | |||
); | |||
|
|||
public static final Setting<Boolean> EXISTING_SHARDS_ALLOCATOR_BATCH_MODE_ENABLED = Setting.boolSetting( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Rename to EXISTING_SHARDS_ALLOCATOR_BATCH_MODE
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -60,6 +60,13 @@ public interface ExistingShardsAllocator { | |||
Setting.Property.PrivateIndex | |||
); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Java doc here, we are adding a new public setting. Highly recommend adding multi-line comment explaining the purpose of the setting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
server/src/main/java/org/opensearch/cluster/routing/allocation/ExistingShardsAllocator.java
Show resolved
Hide resolved
1. Moved setting of batch enable disable in this PR 2. Added java docs Signed-off-by: Gaurav Chandani <[email protected]>
❌ Gradle check result for 3bb7dd8: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
RoutingNodes.UnassignedShards unassignedShards = allocation.routingNodes().unassigned(); | ||
RoutingNodes.UnassignedShards.UnassignedIterator iterator = unassignedShards.iterator(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be clubbed into single line ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not possible, we need both variables later
ShardRouting shard = iterator.next(); | ||
currentAllocatorForShard = getAllocatorForShard(shard, allocation); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Single line ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@@ -548,6 +564,22 @@ private void allocateExistingUnassignedShards(RoutingAllocation allocation) { | |||
existingShardsAllocator.beforeAllocation(allocation); | |||
} | |||
|
|||
Boolean batchModeEnabled = EXISTING_SHARDS_ALLOCATOR_BATCH_MODE.get(settings); | |||
|
|||
if (batchModeEnabled && allocation.nodes().getMinNodeVersion().onOrAfter(Version.CURRENT)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to set this to the version in which we plan to release this feature. Please make a note of it. I think its fine to keep it current for safety purpose to avoid merging old version in a higher version release. But make sure it gets changed before merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You may want to control it better with mixed cluster rest/integ test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ya, right now it's CURRENT for main branch.
When it goes to 2.x - it'll go with 2.12 version
Then again in main it'll be changed to 2.12
Basically 3 phased commits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Gaurav614
There must be unit or integ test verifying the if-else logic. Based on commits, UT/integ checks will also be changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the main branch, it has to be 3.0 and not Version.CURRENT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests are missing to check if batch/ non batch mode is picked correctly.
while (iterator.hasNext()) { | ||
ExistingShardsAllocator allocatorForShard = getAllocatorForShard(iterator.next(), allocation); | ||
if (currentAllocatorForShard.getClass().getName().equals(allocatorForShard.getClass().getName()) == false) { | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you plan to do with null
if allocator was found different? Shouldn't be possible right given there's no code bug? Do you think IllegalStateException
would be a better fit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wont be IllegalStateException with respect to AllocationService since it will fallback to non batched version.
Null is send because we didnt have same Allocators set
probably I will rename it to getAndVerifySameAllocatorForAllUnassignedShards
} | ||
allocator.allocateUnassignedBatch(allocation, false); | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If allocator was found null, then execution would come out of this if block and continue with single shard allocation. Is this intended? Shouldn't we be throwing an exception from verifySameAllocatorForAllUnassignedShards
and remove this null check altogether?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not it wont assign a single shard, but try to assign All unassigned Shards
@amkhar I think you made some suggestion to rename allocateUnassignedBatch
to allocateAllUnassignedShards
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this method ultimately allocates all unassigned shards. And UnassignedBatch
feels like a single batch is getting assigned.
So, we can change it to allocateAllUnassignedShards
. Feel free to suggest other better/relevant names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this complication. If the cluster is using single ExistingShardAllocator, then use batch mode otherwise let it fall back to single shard allocation. This should change later to batch mode code which is dealing with single ShardRouting once there is more confidence in the new code with batch mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a warning log that it can't use batch due to multiple ExistingShardAllocator, it is recommending to use one so that batch logic can run effectively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. So we are currently not allowing any CustomAllocator that implements the new method that we introduced to allocateUnassignedBatch
[renamed to allocateAllUnassignedShards] to run its implementation and we are using are own implementation for now.
Signed-off-by: Gaurav Chandani <[email protected]>
❌ Gradle check result for 6e645a0: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Gaurav Chandani <[email protected]>
❌ Gradle check result for b1994fa: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
@@ -548,6 +564,22 @@ private void allocateExistingUnassignedShards(RoutingAllocation allocation) { | |||
existingShardsAllocator.beforeAllocation(allocation); | |||
} | |||
|
|||
Boolean batchModeEnabled = EXISTING_SHARDS_ALLOCATOR_BATCH_MODE.get(settings); | |||
|
|||
if (batchModeEnabled && allocation.nodes().getMinNodeVersion().onOrAfter(Version.CURRENT)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for the main branch, it has to be 3.0 and not Version.CURRENT
* @param allocation {@link RoutingAllocation} | ||
* @return {@link ExistingShardsAllocator} or null | ||
*/ | ||
private ExistingShardsAllocator getAndVerifySameAllocatorForAllUnassignedShards(RoutingAllocation allocation) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this elaborate check? Simply check if there are more than one ExistingShardAllocator, bail out and use single shard allocation, which will be eventually moved to batch code path with one shardRouting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using GA for existing one shardRouting.
Have different ExistingShardsAllocator implementation after this suggestion. Still open for more discussion https://github.com/opensearch-project/OpenSearch/pull/8746/files#r1432764512
} | ||
allocator.allocateUnassignedBatch(allocation, false); | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this complication. If the cluster is using single ExistingShardAllocator, then use batch mode otherwise let it fall back to single shard allocation. This should change later to batch mode code which is dealing with single ShardRouting once there is more confidence in the new code with batch mode.
} | ||
allocator.allocateUnassignedBatch(allocation, false); | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a warning log that it can't use batch due to multiple ExistingShardAllocator, it is recommending to use one so that batch logic can run effectively.
server/src/main/java/org/opensearch/cluster/routing/allocation/ExistingShardsAllocator.java
Show resolved
Hide resolved
@@ -548,6 +564,22 @@ private void allocateExistingUnassignedShards(RoutingAllocation allocation) { | |||
existingShardsAllocator.beforeAllocation(allocation); | |||
} | |||
|
|||
Boolean batchModeEnabled = EXISTING_SHARDS_ALLOCATOR_BATCH_MODE.get(settings); | |||
|
|||
if (batchModeEnabled && allocation.nodes().getMinNodeVersion().onOrAfter(Version.CURRENT)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests are missing to check if batch/ non batch mode is picked correctly.
Hi @Gaurav614, the PR is stalled for some time. Do we have any updates? |
1. Made changes so that Allocation Service run only default implementation of batch mode 2. Renamed methods 3. Added and modified documenatation Signed-off-by: Gaurav Chandani <[email protected]>
❌ Gradle check result for 257f36d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
ImmutableOpenMap.Builder<String, ClusterState.Custom> customsBuilder = ImmutableOpenMap.builder(allocation.getCustoms()); | ||
final Map<String, ClusterState.Custom> customsBuilder = new HashMap<>(allocation.getCustoms()); | ||
customsBuilder.put(RestoreInProgress.TYPE, updatedRestoreInProgress); | ||
newStateBuilder.customs(customsBuilder.build()); | ||
newStateBuilder.customs(customsBuilder); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how this diff got its way, Will remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Gaurav614 - is it resolved ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comments @Gaurav614
@@ -183,9 +200,9 @@ private ClusterState buildResult(ClusterState oldState, RoutingAllocation alloca | |||
if (restoreInProgress != null) { | |||
RestoreInProgress updatedRestoreInProgress = allocation.updateRestoreInfoWithRoutingChanges(restoreInProgress); | |||
if (updatedRestoreInProgress != restoreInProgress) { | |||
ImmutableOpenMap.Builder<String, ClusterState.Custom> customsBuilder = ImmutableOpenMap.builder(allocation.getCustoms()); | |||
final Map<String, ClusterState.Custom> customsBuilder = new HashMap<>(allocation.getCustoms()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason we made it mutable from immutable map
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing out this diff. I am not sure how this diff got way. Will restore to whats there in main.
Use batch mode if enabled and there is no custom allocator set for Allocation service | ||
*/ | ||
Boolean batchModeEnabled = EXISTING_SHARDS_ALLOCATOR_BATCH_MODE.get(settings); | ||
if (batchModeEnabled && allocation.nodes().getMinNodeVersion().onOrAfter(Version.CURRENT) && existingShardsAllocators.size() == 2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the check for ==2 here is for GatewayAllocator & BatchGatewayAllocator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this version check should be against 3.0.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
server/src/main/java/org/opensearch/cluster/routing/allocation/AllocationService.java
Show resolved
Hide resolved
@@ -670,7 +704,7 @@ private ExistingShardsAllocator getAllocatorForShard(ShardRouting shardRouting, | |||
final String allocatorName = ExistingShardsAllocator.EXISTING_SHARDS_ALLOCATOR_SETTING.get( | |||
routingAllocation.metadata().getIndexSafe(shardRouting.index()).getSettings() | |||
); | |||
final ExistingShardsAllocator existingShardsAllocator = existingShardsAllocators.get(allocatorName); | |||
ExistingShardsAllocator existingShardsAllocator = existingShardsAllocators.get(allocatorName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why remove final?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will Add back
server/src/main/java/org/opensearch/cluster/routing/allocation/AllocationService.java
Show resolved
Hide resolved
* If enable to true then it expects all indices of the shard to use same {@link ExistingShardsAllocator}, otherwise | ||
* Allocation Service will fallback to default implementation i.e. {@link ExistingShardsAllocator#allocateUnassigned(ShardRouting, RoutingAllocation, UnassignedAllocationHandler)} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you dont need this.
* Enable this setting if your ExistingShardAllocator is implementing the | ||
* {@link ExistingShardsAllocator#allocateAllUnassignedShards(RoutingAllocation, boolean)} method. | ||
* The default implementation of this method is not optimized and assigns shards one by one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By enabling this setting, GatewayAllocator will use batch implementation by default. In case of custom ExistingShardAllocator , the performance benefits will be visible if {@link ExistingShardsAllocator#allocateAllUnassignedShards(RoutingAllocation, boolean)} method is implemented efficiently for batch mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, if custom Allocator is set we wont be even running allocateAllUnassignedShards in AllocationService.
This was discussed here
* {@link ShardsBatchGatewayAllocator}. Right now even if plugin implements it, AllocationService will run the | ||
* default implementation to enable Batch mode of assignment | ||
* | ||
* TODO: Currently its implementation is WIP for GatewayAllocator so setting enabling wont have any effect | ||
* https://github.com/opensearch-project/OpenSearch/issues/5098 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now and the below part is un-necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can say this setting is experimental at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay , will update
Signed-off-by: Gaurav Chandani <[email protected]>
❌ Gradle check result for 1574fc2: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
No comment, @amkhar need to resolve the conflict and this PR we will merge in the end, once all the other feature PRs are merged. |
Signed-off-by: Aman Khare <[email protected]>
❌ Gradle check result for d3b1140: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Merging PR: opensearch-project#8888 into this PR since AllocationService changes were required to run Integ tests for testing the flow end to end. Signed-off-by: Gaurav Chandani <[email protected]>
This PR is stalled because it has been open for 30 days with no activity. |
Description
This pull request is part of the improvement #5098
It is mainly focused around changes for Allocation Service with respect to batch assignment. The PR is not completed as PR for adding a setting for batch mode is pending. For now we are taking a boolean value, and will later change it to fetch from the setting once those changes are ready.
The PR is dependent on following PRs:
#8742
#8218
#8356
#8746
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.