-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix SimpleNestedExplainIT.testExplainMultipleDocs flakiness #12776
Conversation
f9cb4f5
to
fce9d57
Compare
❌ Gradle check result for f9cb4f5: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Compatibility status:Checks if related components are compatible with change 8e3688e Incompatible componentsSkipped componentsCompatible componentsCompatible components: [https://github.com/opensearch-project/custom-codecs.git, https://github.com/opensearch-project/observability.git, https://github.com/opensearch-project/job-scheduler.git, https://github.com/opensearch-project/neural-search.git, https://github.com/opensearch-project/cross-cluster-replication.git, https://github.com/opensearch-project/k-nn.git, https://github.com/opensearch-project/security-analytics.git, https://github.com/opensearch-project/geospatial.git, https://github.com/opensearch-project/flow-framework.git, https://github.com/opensearch-project/notifications.git, https://github.com/opensearch-project/asynchronous-search.git, https://github.com/opensearch-project/sql.git, https://github.com/opensearch-project/opensearch-oci-object-storage.git, https://github.com/opensearch-project/reporting.git, https://github.com/opensearch-project/anomaly-detection.git, https://github.com/opensearch-project/index-management.git, https://github.com/opensearch-project/common-utils.git, https://github.com/opensearch-project/ml-commons.git, https://github.com/opensearch-project/performance-analyzer-rca.git, https://github.com/opensearch-project/security.git, https://github.com/opensearch-project/alerting.git, https://github.com/opensearch-project/performance-analyzer.git] |
FYI I think the "resolves" issue is wrong: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps I do not understand the problem correctly, but If the issue here is background merges are happening between the concurrent and non-concurrent search queries, would a simpler solution be to add a small sleep after indexing is completed so we wait for background merges to complete before we perform the search request?
Or we could even force merge to 2 segments ourselves?
It seems like making deletion of bogus docs optional is a pretty heavy duty solution to what seems like a pretty simple problem.
server/src/internalClusterTest/java/org/opensearch/search/nested/SimpleNestedExplainIT.java
Outdated
Show resolved
Hide resolved
test/framework/src/main/java/org/opensearch/test/OpenSearchIntegTestCase.java
Outdated
Show resolved
Hide resolved
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #12776 +/- ##
============================================
- Coverage 71.42% 71.38% -0.04%
- Complexity 59978 60282 +304
============================================
Files 4985 5011 +26
Lines 282275 283659 +1384
Branches 40946 41117 +171
============================================
+ Hits 201603 202501 +898
- Misses 63999 64425 +426
- Partials 16673 16733 +60 ☔ View full report in Codecov by Sentry. |
@jed326 I thought the same, but looks like even a sleep time of 5 seconds or force merge (which defeats the purpose of indexing multiple docs)doesn't guarantee for successful runs of the tests. Hence, I came up with this approach of not deleting the bogus docs before the search. |
test/framework/src/main/java/org/opensearch/test/OpenSearchIntegTestCase.java
Outdated
Show resolved
Hide resolved
Instead of adding bogus docs could we just add more "real" docs instead? |
I think it would be same as calling indexRandomForSlices function with deletion of bogus docs set to false, which is currently happening. This would cause code duplication. |
This would ingest bogus docs while what I was suggesting is modify the test to ingest more (looks like only 1 more would be needed?) non-bogus docs. Overall it feels like not deleting bogus docs is a sledgehammer solution approach here that I don't think is generically useful for other tests so in my opinion it looks like a better solution to modify the singular test that is flaky. |
fce9d57
to
8e3688e
Compare
Signed-off-by: Neetika Singhal <[email protected]>
8e3688e
to
d346c9f
Compare
ok, so I modified the change to ingest one more doc and just have one shard for this test class such that multiple slices are created and there is no need to call the indexRandomForMultipleSlices for this test. As you said, we can also avoid the modification of the function as well. @jed326 thanks for your feedback. |
❌ Gradle check result for 8e3688e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, I think this is a much more straightforward solution. Thanks!
@reta could u also take a look and merge if it looks good to you? |
server/src/internalClusterTest/java/org/opensearch/search/nested/SimpleNestedExplainIT.java
Show resolved
Hide resolved
server/src/internalClusterTest/java/org/opensearch/search/nested/SimpleNestedExplainIT.java
Show resolved
Hide resolved
Signed-off-by: Neetika Singhal <[email protected]> (cherry picked from commit 4010ff1) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…12859) (cherry picked from commit 4010ff1) Signed-off-by: Neetika Singhal <[email protected]> Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
…ch-project#12776) Signed-off-by: Neetika Singhal <[email protected]> Signed-off-by: Shivansh Arora <[email protected]>
Description
The explain itself is coming from Lucene: https://github.com/apache/lucene/blob/a6f70ad2bb0b682eb65feb522358ee6d16cad766/lucene/join/src/java/org/apache/lucene/search/join/ToParentBlockJoinQuery.java#L432-L440
Here's some info on the docIds in lucene:
A DocId in lucene is not actually unique to the Index but is unique to a Segment. Lucene does this mainly to optimize writing and compression. Since it is only unique to a Segment, how can a Doc be uniquely identified at the Index level? The solution is simple. The segments are ordered. To take a simple example, an Index has two segments and each segment has 100 docs respectively. The DocId's in the Segment are 0-100 but when they are converted to the Index level, the range of the DocId's in the second Segment is converted to 100-200.
DocId's are unique within a Segment, numbered progressively from zero. But this does not mean that the DocId's are continuous. When a Doc is deleted, there is a gap.
The DocId corresponding to a document can change, usually when Segments are merged.
The test fails on the validation of range of the docIds in the assertion, the range changes as with indexRandomForMultipleSlices function there are several bogus documents ingested and deleted which could trigger background merges and cause the range of the docIds matched with the search query to change.
One of the solutions is to not the delete the bogus documents as part of the indexRandomForMultipleSlices, perform search for concurrent search and non-concurrent search and delete the bogus docs at the end. This is done as part of this PR.
Validated the success of the tests for 500 times.
Relates: #11681
Related Issues
Resolves #12318
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.