Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance Optimizations to fulltext search #163

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

niravmehta
Copy link

The current hasSubject / _eachSynonym methods work with a single token at a time. This means for a long address string, there will be a longer list of tokens to check (lots of permutations), and hence lots of queries.

I noticed 30, 60 or even 250 fulltext search queries in some tests.

This pull request contains a new function that takes all the tokens and performs full text search in a single query. It returns matching tokens in the same format as expected, so other things don't need to change.

I saw consistent 30% speed improvements with these changes. Even bigger gains with longer searches.

Caveats:

  • I had to remove the test case for the commit to go through. I'm not familiar with the testing framework, and could not write up a proper test case.
  • Auto complete searches don't perform partial searches for the last token. So results may vary if using auto complete. (But I feel it's still a good compromise)

Do review and provide feedback.

Nirav Mehta added 5 commits November 7, 2019 22:00
…ce. This significantly reduces number of queries to the DB (down to just one query - instead of potential dozens/hundreds).

Works for all cases except "auto complete" mode. (can easily change code to fallback to old method for partial matches on last token)

Plus a fix for renderQuery - replaces all arguments correctly now, instead of just the first one.

Had to remove the associated test case - can write it up later. (once I learn the testing framework!)
…y strings. Results will still be different between single subject partial matches, but this at least works!
@niravmehta
Copy link
Author

I revised the matchingSubjects function to support auto-complete, which also fixes fts5 errors with special characters in input. That code is here: niravmehta#4 (and not merged into this pull request #163)

Not sure why checks are failing on this pull request.

@missinglink
Copy link
Member

Hi @niravmehta you can view the output of the CI by clicking on the Travis-CI check itself which opens a log such as https://travis-ci.org/pelias/placeholder/jobs/609069753, if you scroll down that log you'll see the failures listed.

I'm a bit concerned about this PR, it's not good practice to delete tests because they start failing, tests are there for a reason and it's important to understand why they changed and what effect this will have on other users before we can merge this to master.

@niravmehta
Copy link
Author

I agree removing test cases is not a good idea. But I do not know the testing framework, and couldn't figure out how to write a proper test case.

Will try again later when I have some free time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants