[BUG] Scroll API returns a recently deleted doc #7266

Open
Yury-Fridlyand opened this issue Apr 21, 2023 · 7 comments
Labels
bug (Something isn't working), distributed framework, Search (Search query, autocomplete ...etc)

Comments

@Yury-Fridlyand

Yury-Fridlyand commented Apr 21, 2023

Describe the bug

The Scroll API returns a doc that was deleted while scrolling if that doc is next in the scroll order.

To Reproduce

Steps to reproduce the behavior:

  1. Create an index
PUT http://localhost:9200/pmtest
  2. Add a few docs
PUT http://localhost:9200/pmtest/_doc/1?refresh=true
{
    "age": 30
}
PUT http://localhost:9200/pmtest/_doc/2?refresh=true
{
    "age": 30
}
  3. Start scrolling
POST http://localhost:9200/pmtest/_search?scroll=10m
{
    "size": 1
}
  4. Delete the last doc
DELETE http://localhost:9200/pmtest/_doc/2?refresh=true

Note: refresh=true
  5. Keep scrolling
POST http://localhost:9200/_search/scroll
{
    "scroll": "10m",
    "scroll_id": "<scroll ID>"
}
  6. The last (deleted) doc is returned
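
For convenience, the steps above could be scripted roughly as follows. This is only a sketch using the Python standard library: the helper names `build_steps`, `send`, and `reproduce` are hypothetical, it assumes an unsecured cluster at the URL from the report, and it assumes the first scroll page returns doc 1 (which is what the report implies by deleting doc 2).

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # host and index name taken from the report
INDEX = "pmtest"


def build_steps(scroll_id="<scroll ID>", base_url=BASE_URL, index=INDEX):
    """Return the requests from the report as (method, url, body) tuples."""
    doc = {"age": 30}
    return [
        ("PUT", f"{base_url}/{index}", None),                                # 1. create index
        ("PUT", f"{base_url}/{index}/_doc/1?refresh=true", doc),             # 2. add docs
        ("PUT", f"{base_url}/{index}/_doc/2?refresh=true", doc),
        ("POST", f"{base_url}/{index}/_search?scroll=10m", {"size": 1}),     # 3. start scrolling
        ("DELETE", f"{base_url}/{index}/_doc/2?refresh=true", None),         # 4. delete last doc
        ("POST", f"{base_url}/_search/scroll",
         {"scroll": "10m", "scroll_id": scroll_id}),                         # 5. keep scrolling
    ]


def send(method, url, body=None):
    """Send one JSON request and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    # Raises urllib.error.HTTPError on 4xx/5xx, e.g. if the index already exists.
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def reproduce():
    """Run the steps against a live cluster; return the doc IDs the scroll yields."""
    seen, scroll_id = [], None
    for method, url, body in build_steps():
        if body is not None and "scroll_id" in body:
            # Substitute the real scroll ID obtained in step 3.
            body = {**body, "scroll_id": scroll_id}
        response = send(method, url, body)
        if "hits" in response:
            scroll_id = response["_scroll_id"]
            seen += [hit["_id"] for hit in response["hits"]["hits"]]
    return seen
```

Calling `reproduce()` against a fresh cluster should, according to the report, yield the ID of the deleted doc on the second page.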

Expected behavior

Search should skip the deleted doc.

Host/Environment (please complete the following information):

3.0.0-SNAPSHOT @ ee305d0

Additional context

Sometimes this doesn't happen if I delete doc 3 while the cursor is between docs 1 and 2. Could it be a timing issue?

@anasalkouz
Member

Hi @Yury-Fridlyand, thanks for reporting this. As I understand it, this is expected behavior: the scroll API works as a point-in-time search. Please let me know if you still think it is a bug.

@Yury-Fridlyand
Author

Hi @anasalkouz, thank you for the reply!
Does the scroll API create a temporary PIT under the hood?
The documentation doesn't cover this. Could you please add more information to the documentation?

@nandi-github

@Yury-Fridlyand Can you help us understand the use case for this API: where and how would it be used?

@anasalkouz
Member

anasalkouz commented Apr 27, 2023

@Yury-Fridlyand That's a fair point; this should be clearly documented.
As I understand it, the scroll API freezes the data returned, so any add/update/delete indexing operations performed after the scroll is initiated won't be part of the search response. @macohen @msfroh to confirm.

Once it's confirmed, I think we can transfer this to documentation.

@nandi-github

@anasalkouz Let's prioritize this Scroll API ask. Is it possible to estimate the work on it?

@Yury-Fridlyand
Author

Issue update: added docs are also ignored by scroll.

> Can you help understand the usecase for this API, where and how it would be used?

Actually, it is not a real use case. I'm working on the cursor feature in the SQL plugin, which uses scroll to paginate results. I tested index modification during paging and found this issue.

Maybe I misunderstand the scroll feature. Please clarify how scroll works and whether it takes an index snapshot. We can then fix this issue and/or update the documentation.

Thanks.

@bharath-techie
Contributor

Under the hood, scroll creates search contexts, which are references to the associated index segments at the time of scroll creation. The data returned will therefore always come from the same segments, providing a 'point in time' snapshot view for the query.

Updates to OpenSearch, such as creating or deleting a doc, will not be reflected in the scroll. The documentation already has a relevant statement (it applies to docs deleted after the timestamp as well):

The scroll operation corresponds to a specific timestamp. It doesn’t consider documents added after that timestamp as potential results.
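
To illustrate the point-in-time behavior described above, here is a deliberately simplified toy model. It is not how OpenSearch or Lucene is implemented; `ToyIndex` and `ToyScroll` are hypothetical names, and the "snapshot" is just an immutable mapping that the scroll keeps a reference to while later writes replace it on the index.

```python
class ToyIndex:
    """Toy index: every write replaces the current snapshot instead of mutating it."""

    def __init__(self):
        self._snapshot = {}  # doc id -> source; treated as immutable

    def put(self, doc_id, source):
        self._snapshot = {**self._snapshot, doc_id: source}

    def delete(self, doc_id):
        self._snapshot = {k: v for k, v in self._snapshot.items() if k != doc_id}

    def search(self):
        """A fresh search sees the latest snapshot."""
        return sorted(self._snapshot)

    def open_scroll(self, size):
        """A scroll pins whatever snapshot is current at creation time."""
        return ToyScroll(self._snapshot, size)


class ToyScroll:
    """Toy scroll: pages through the snapshot it was created with."""

    def __init__(self, snapshot, size):
        self._snapshot = snapshot
        self._ids = sorted(snapshot)
        self._size = size
        self._pos = 0

    def next_page(self):
        page = self._ids[self._pos:self._pos + self._size]
        self._pos += len(page)
        return page


index = ToyIndex()
index.put("1", {"age": 30})
index.put("2", {"age": 30})

scroll = index.open_scroll(size=1)
first_page = scroll.next_page()   # taken before any modification

index.delete("2")                 # deleted after the scroll was opened
index.put("3", {"age": 30})       # added after the scroll was opened

second_page = scroll.next_page()  # still served from the pinned snapshot
third_page = scroll.next_page()   # doc 3 never appears in this scroll
fresh_search = index.search()     # a new search sees the current state
```

In this model the scroll still returns doc 2 and never returns doc 3, while a new search reflects both changes, which matches the behavior reported in this issue.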

@shwetathareja shwetathareja added the Search Search query, autocomplete ...etc label Mar 26, 2024
@getsaurabh02 getsaurabh02 moved this from 🆕 New to Later (6 months plus) in Search Project Board Aug 15, 2024