Verify and modify free-text search functionality #157

Open
stanislaw-zakrzewski opened this issue Oct 28, 2024 · 0 comments
stanislaw-zakrzewski commented Oct 28, 2024

Background

While adding the curator's comment to the free-text search, we encountered multiple issues with free-text usability (link to discussion regarding those issues: link). This issue is meant first to make sure that we understand how the search works, and then to adjust it to meet curators' expectations.

Current state

Links:

MongoDB offers full-text search over fields that are covered by a text index. Currently we are able to search through:

  • "demographics.occupation"
  • "location.country"
  • "location.admin1"
  • "location.admin2"
  • "location.admin3"
  • "caseReference.sourceUrl"
  • "caseStatus"
  • "comment"
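
As a rough illustration of how the fields above might be covered, here is a minimal sketch that builds the key specification for a single text index (MongoDB allows only one text index per collection, so all fields would share it). The field names come from the list above; the index name `cases_text_index` and the helper `text_index_keys` are hypothetical, and the actual index definition in this project may differ.

```python
# Field names are copied from the list above; everything else is illustrative.
TEXT_INDEXED_FIELDS = [
    "demographics.occupation",
    "location.country",
    "location.admin1",
    "location.admin2",
    "location.admin3",
    "caseReference.sourceUrl",
    "caseStatus",
    "comment",
]


def text_index_keys(fields):
    """Return the key specification for one text index spanning all fields."""
    return [(field, "text") for field in fields]


index_keys = text_index_keys(TEXT_INDEXED_FIELDS)

# "cases_text_index" is a hypothetical name. "default_language" is what
# enables English stop-word removal and stemming for the index.
index_options = {"name": "cases_text_index", "default_language": "english"}

# Assuming pymongo and a reachable MongoDB instance, this could be applied as:
#   collection.create_index(index_keys, **index_options)
```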

The exact rules for searching are in the link above; here are some of the more important features:

  • By setting the language to English we are able to use the following features:
    • Stop-words - words without semantic meaning are removed from the search (full list of stop-words)
    • Stemming - terms are matched by their stemmed version (example: searching for dog will match dogs)
  • We can search for multiple terms at the same time; those terms are combined with the OR operator (example: searching for one two will match any case that contains one OR two)
  • Phrases - we can wrap terms in quotes; the quoted terms are treated as one uninterrupted phrase (example: searching for "one two" will match cases with one two and one two three, but will not match cases with only one, only two, or one three two). Official documentation states that only one phrase per search is allowed, but we are able to search for multiple phrases, which are combined with the AND operator (example: searching for both "one" and "two" will match one two, one two three and one three two, but will not match a case that contains only one or only two)
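
To make the rules above concrete, here is a minimal sketch of how a `$text` filter could be assembled from plain terms and quoted phrases. The helper `build_text_query` is hypothetical and not taken from this project's code; it assumes phrases contain no double quotes themselves, and it only builds the filter document without contacting a database.

```python
def build_text_query(terms=(), phrases=(), language="english"):
    """Build a MongoDB $text filter from plain terms and quoted phrases.

    Plain terms are joined by spaces (MongoDB combines them with OR).
    Each phrase is wrapped in double quotes so it is matched as one
    uninterrupted sequence. Assumes phrases contain no double quotes.
    """
    parts = list(terms)
    parts += ['"{}"'.format(phrase) for phrase in phrases]
    return {"$text": {"$search": " ".join(parts), "$language": language}}


# Two plain terms: matches cases containing one OR two.
or_query = build_text_query(terms=["one", "two"])

# One phrase: matches only cases containing "one two" uninterrupted.
phrase_query = build_text_query(phrases=["one two"])

# Two phrases: per the behaviour described above, both must be present.
and_query = build_text_query(phrases=["one", "two"])

# Assuming pymongo, any of these could be passed to collection.find(...).
```

Keeping the query construction in one small function would also make it easier to test how curators' input is translated before it reaches MongoDB.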

What we are using the current search for (example: searching for country name)
