[BUG] Long wildcard query on wildcard field causes too_many_clauses exception #15851

Open
HUSTERGS opened this issue Sep 9, 2024 · 2 comments
Labels: bug (Something isn't working), Search (Search query, autocomplete ...etc)

Comments

@HUSTERGS
Contributor

HUSTERGS commented Sep 9, 2024

Describe the bug

A long wildcard query on a wildcard field causes a too_many_clauses exception. The matchAllTermsQuery function in WildcardFieldMapper.java directly concatenates all the terms generated by NGrams, so a long enough search string produces more clauses than maxClauseCount allows. A better approach may be to check the total clause count and cap it at a reasonable number, so that even long search strings proceed as expected instead of throwing an exception. The only downside is that the verification phase may take longer, because more documents may qualify as candidates.
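To make the clause count concrete, here is a minimal Java sketch of the arithmetic. It assumes the search string is split into overlapping 3-grams and that each distinct n-gram becomes one clause; the actual tokenization in WildcardFieldMapper.java may differ. The class and method names (`NGramClauseCount`, `distinctNGrams`, `randomAlphanumeric`) are hypothetical, and the limit of 1024 is taken from the error message in this report.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical helper, not part of OpenSearch: estimates how many clauses
// a long search string would produce under the assumptions stated above.
final class NGramClauseCount {
    // Limit reported in the error message of this issue.
    static final int MAX_CLAUSE_COUNT = 1024;

    // Returns the distinct character n-grams of length n found in s.
    static Set<String> distinctNGrams(String s, int n) {
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + n <= s.length(); i++) {
            grams.add(s.substring(i, i + n));
        }
        return grams;
    }

    // Builds a pseudo-random alphanumeric string, similar to the openssl step in the reproduction.
    static String randomAlphanumeric(int length, long seed) {
        String alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
        Random random = new Random(seed);
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(alphabet.charAt(random.nextInt(alphabet.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String query = randomAlphanumeric(2000, 42L);
        int clauses = distinctNGrams(query, 3).size();
        System.out.println("distinct 3-grams: " + clauses + ", limit: " + MAX_CLAUSE_COUNT);
    }
}
```

A 2000-character string contains 1998 overlapping 3-grams, and with 62 possible characters nearly all of them are distinct, so under these assumptions the count lands well above 1024.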

Related component

Search

To Reproduce

  1. Create a simple index that contains a wildcard field
curl -X PUT "http://localhost/my-index" -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "properties": {
      "long_text": {
        "type": "wildcard"
      }
    }
  }
}
'
  2. Generate and verify a random long string for search
MY_LONG_STRING=$(openssl rand -base64 2000 | tr -dc 'A-Za-z0-9')
MY_LONG_STRING=${MY_LONG_STRING:0:2000}
echo ${#MY_LONG_STRING}
  3. Search on the wildcard field
curl -X GET "http://localhost/my-index/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
      "wildcard": {
          "long_text": "'"${MY_LONG_STRING}"'"
      }
  }
}
'
  4. Get the result
{"error":{"root_cause":[{"type":"query_shard_exception","reason":"failed to create query: maxClauseCount is set to 1024","index":"my-index","index_uuid":"HLjKYimwRx2cVkiqzvvj5Q"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"my-index","node":"jRDKdrICRqmOdhG5fW33Pg","reason":{"type":"query_shard_exception","reason":"failed to create query: maxClauseCount is set to 1024","index":"my-index","index_uuid":"HLjKYimwRx2cVkiqzvvj5Q","caused_by":{"type":"too_many_clauses","reason":"maxClauseCount is set to 1024"}}}]},"status":400}

Expected behavior

The query should return the expected search results even for long search strings.

Additional Details

No response

@HUSTERGS added the bug and untriaged labels Sep 9, 2024
@github-actions bot added the Search label Sep 9, 2024
@msfroh
Collaborator

msfroh commented Sep 11, 2024

[Search Triage] This is a good catch! Thanks @HUSTERGS. Limiting the clause count is a good idea. We may want to keep clauses in a priority queue of limited size, where the comparator keeps the lowest-frequency terms.

@HUSTERGS -- are you able to take that on? Need any help with it?
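
The bounded priority queue suggested above could look roughly like the following Java sketch. It is only an illustration under stated assumptions: the class and method names (`RarestTermSelector`, `selectRarest`) are hypothetical, and the document frequencies are passed in as a plain map. In Lucene they would presumably come from something like `IndexReader.docFreq(Term)`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper, not the actual fix: keeps at most maxClauses terms,
// preferring the terms with the lowest document frequency.
final class RarestTermSelector {

    static List<String> selectRarest(Map<String, Integer> docFreqByTerm, int maxClauses) {
        List<String> kept = new ArrayList<>();
        if (maxClauses <= 0) {
            return kept;
        }
        // Max-heap on frequency: the head is the most frequent term kept so far,
        // so it is the first one evicted when the queue grows past the limit.
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(
            Comparator.comparingInt((Map.Entry<String, Integer> e) -> e.getValue()).reversed());
        for (Map.Entry<String, Integer> entry : docFreqByTerm.entrySet()) {
            heap.offer(entry);
            if (heap.size() > maxClauses) {
                heap.poll(); // drop the most frequent (least selective) term
            }
        }
        for (Map.Entry<String, Integer> entry : heap) {
            kept.add(entry.getKey());
        }
        return kept; // order is unspecified; only membership matters here
    }

    public static void main(String[] args) {
        Map<String, Integer> freqs = Map.of("abc", 5, "bcd", 1, "cde", 9, "def", 2);
        System.out.println(selectRarest(freqs, 2));
    }
}
```

Dropping the most frequent terms keeps the candidate filter as selective as the limit allows. Any extra candidates this lets through would still have to be rejected in the verification phase, which is the trade-off mentioned in the issue description.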

@msfroh removed the untriaged label Sep 11, 2024
@HUSTERGS
Contributor Author

Yes, I can try to fix that. When the bugfix is ready, I'll comment on this issue. @msfroh
