-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Flaky Test] Fix Flaky Test SearchTimeoutIT.testSimpleTimeout #16828
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #16828 +/- ##
============================================
- Coverage 72.11% 72.01% -0.11%
+ Complexity 65237 65185 -52
============================================
Files 5318 5318
Lines 304003 304011 +8
Branches 43992 43993 +1
============================================
- Hits 219228 218925 -303
- Misses 66874 67161 +287
- Partials 17901 17925 +24 ☔ View full report in Codecov by Sentry. |
Signed-off-by: kkewwei <[email protected]>
❌ Gradle check result for d1fc8be: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
SpecificClusterManagerNodesIT.testElectOnlyBetweenClusterManagerNodes #15944 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kkewwei this was a great find!
Thanks @kkewwei , I am actually a bit surprised by your findings
The query phase has to be terminated early by timeout, right? So it should be not much longer then timeout itself? |
❕ Gradle check result for d1fc8be: UNSTABLE Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure. |
Description
When numDocs=1000:
testSimpleTimeout
will cost several minutes, when scoring each doc, it will cost 500ms, it's a long time to iterating all the doc inqueryphase
.OpenSearch/server/src/internalClusterTest/java/org/opensearch/search/SearchTimeoutIT.java
Line 138 in 5aa6509
ReaderContext
is created before executingqueryphase
and released after thefetchphase
.ReaderContext
is 1min(determined bysearch.keep_alive_interval
)queryPhase
costs too much time, theReaderContext
may be released beforefetchphase
, so thefetch/Id
will be failed, which hit the case.Related Issues
Resolves #16056 #9401
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.