Search suffix tree implementation #51954

Open: hannojg wants to merge 11 commits into main

@hannojg hannojg commented Nov 4, 2024

Explanation of Change

Reapplying the changes from this reverted PR:

This PR makes sure to fix the following regressions as well:

⚠️ A note on the search results order when comparing this PR with production:

Earlier in this PR I forgot to sort the search results. I added that back; however, it seems that, even on production, the sort function for options isn't stable: depending on the order of options in the input array, the output array comes out in a different order. This is something we probably want to fix in a separate ticket.
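
One possible way to make the output independent of the input order is to give the comparator a tie-breaker on a unique key, so that two different options never compare as equal. `Array.prototype.sort` keeps equal items in their input order, which is one way the input order can leak into the output. A minimal sketch, assuming a hypothetical `Option` shape with a unique `keyForList`; it is not the app's actual sort function:

```typescript
// Hypothetical option shape, for illustration only.
type Option = {
    keyForList: string; // assumed to be unique per option
    text: string;
};

// Compare by display text first, then fall back to the unique key so that
// two options only compare as equal when they are the same option.
function compareOptions(a: Option, b: Option): number {
    if (a.text !== b.text) {
        return a.text < b.text ? -1 : 1;
    }
    if (a.keyForList !== b.keyForList) {
        return a.keyForList < b.keyForList ? -1 : 1;
    }
    return 0;
}

function sortOptions(options: Option[]): Option[] {
    // Copy first so the caller's array is not mutated.
    return [...options].sort(compareOptions);
}
```

With a total order like this, any permutation of the same input yields the same output.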

Fixed Issues

$ #46591
PROPOSAL:

Tests

Test 1:

  • Open the chat finder page (search icon on the start page when authenticated)
  • Perform searches, confirm that everything is found, make sure to also search for numbers
  • Search for an email address (e.g. [email protected]) and make sure it's found. Then search for hannomargeloio (all special characters removed) and make sure it's still found
  • You might want to open the production version and compare your search results with the dev version to make sure you get similar results (see the note at the top)

Test 2:

  • Search for the user's own email address and make sure that self-DM results are at the top (if it's a fresh user, create a few threads first)

Test 3:

  • Create a new group with three participants
  • Search for the group name, including the commas, confirm that it shows in the search results

Offline tests

Same as testing steps

QA Steps

Same as testing steps

PR Author Checklist

  • I linked the correct issue in the ### Fixed Issues section above
  • I wrote clear testing steps that cover the changes made in this PR
    • I added steps for local testing in the Tests section
    • I added steps for the expected offline behavior in the Offline steps section
    • I added steps for Staging and/or Production testing in the QA steps section
    • I added steps to cover failure scenarios (i.e. verify an input displays the correct error message if the entered data is not correct)
    • I turned off my network connection and tested it while offline to ensure it matches the expected behavior (i.e. verify the default avatar icon is displayed if app is offline)
    • I tested this PR with a High Traffic account against the staging or production API to ensure there are no regressions (e.g. long loading states that impact usability).
  • I included screenshots or videos for tests on all platforms
  • I ran the tests on all platforms & verified they passed on:
    • Android: Native
    • Android: mWeb Chrome
    • iOS: Native
    • iOS: mWeb Safari
    • MacOS: Chrome / Safari
    • MacOS: Desktop
  • I verified there are no console errors (if there's a console error not related to the PR, report it or open an issue for it to be fixed)
  • I followed proper code patterns (see Reviewing the code)
    • I verified that any callback methods that were added or modified are named for what the method does and never what callback they handle (i.e. toggleReport and not onIconClick)
    • I verified that comments were added to code that is not self explanatory
    • I verified that any new or modified comments were clear, correct English, and explained "why" the code was doing something instead of only explaining "what" the code was doing.
    • I verified any copy / text shown in the product is localized by adding it to src/languages/* files and using the translation method
      • If any non-english text was added/modified, I verified the translation was requested/reviewed in #expensify-open-source and it was approved by an internal Expensify engineer. Link to Slack message:
    • I verified all numbers, amounts, dates and phone numbers shown in the product are using the localization methods
    • I verified any copy / text that was added to the app is grammatically correct in English. It adheres to proper capitalization guidelines (note: only the first word of header/labels should be capitalized), and is either coming verbatim from figma or has been approved by marketing (in order to get marketing approval, ask the Bug Zero team member to add the Waiting for copy label to the issue)
    • I verified proper file naming conventions were followed for any new files or renamed files. All non-platform specific files are named after what they export and are not named "index.js". All platform-specific files are named for the platform the code supports as outlined in the README.
    • I verified the JSDocs style guidelines (in STYLE.md) were followed
  • If a new code pattern is added I verified it was agreed to be used by multiple Expensify engineers
  • I followed the guidelines as stated in the Review Guidelines
  • I tested other components that can be impacted by my changes (i.e. if the PR modifies a shared library or component like Avatar, I verified the components using Avatar are working as expected)
  • I verified all code is DRY (the PR doesn't include any logic written more than once, with the exception of tests)
  • I verified any variables that can be defined as constants (ie. in CONST.js or at the top of the file that uses the constant) are defined as such
  • I verified that if a function's arguments changed that all usages have also been updated correctly
  • If any new file was added I verified that:
    • The file has a description of what it does and/or why is needed at the top of the file if the code is not self explanatory
  • If a new CSS style is added I verified that:
    • A similar style doesn't already exist
    • The style can't be created with an existing StyleUtils function (i.e. StyleUtils.getBackgroundAndBorderStyle(theme.componentBG))
  • If the PR modifies code that runs when editing or sending messages, I tested and verified there is no unexpected behavior for all supported markdown - URLs, single line code, code blocks, quotes, headings, bold, strikethrough, and italic.
  • If the PR modifies a generic component, I tested and verified that those changes do not break usages of that component in the rest of the App (i.e. if a shared library or component like Avatar is modified, I verified that Avatar is working as expected in all cases)
  • If the PR modifies a component related to any of the existing Storybook stories, I tested and verified all stories for that component are still working as expected.
  • If the PR modifies a component or page that can be accessed by a direct deeplink, I verified that the code functions as expected when the deeplink is used - from a logged in and logged out account.
  • If the PR modifies the UI (e.g. new buttons, new UI components, changing the padding/spacing/sizing, moving components, etc) or modifies the form input styles:
    • I verified that all the inputs inside a form are aligned with each other.
    • I added Design label and/or tagged @Expensify/design so the design team can review the changes.
  • If a new page is added, I verified it's using the ScrollView component to make it scrollable when more elements are added to the page.
  • If the main branch was merged into this PR after a review, I tested again and verified the outcome was still expected according to the Test steps.

Screenshots/Videos

Android: Native
Screen_Recording_20240927_212736_New.Expensify.Dev.mp4
Android: mWeb Chrome
Screen_Recording_20240927_211510_Chrome.mp4
iOS: Native
Screen.Recording.2024-09-27.at.21.28.56.mov
iOS: mWeb Safari
ScreenRecording_09-27-2024.21-34-17_1.MP4
MacOS: Chrome / Safari
Screen.Recording.2024-09-27.at.21.08.10.mov
MacOS: Desktop
Screen.Recording.2024-09-27.at.21.28.10.mov

@hannojg hannojg mentioned this pull request Nov 4, 2024
48 tasks
@hannojg hannojg marked this pull request as ready for review November 5, 2024 11:01
@hannojg hannojg requested a review from a team as a code owner November 5, 2024 11:01
@melvin-bot melvin-bot bot requested a review from ahmedGaber93 November 5, 2024 11:01

melvin-bot bot commented Nov 5, 2024

@ahmedGaber93 Please copy/paste the Reviewer Checklist from here into a new comment on this PR and complete it. If you have the K2 extension, you can simply click: [this button]

@melvin-bot melvin-bot bot removed the request for review from a team November 5, 2024 11:01
@hannojg

hannojg commented Nov 5, 2024

cc @marcaaron since you also reviewed the first PR (and rory is on parental leave)

@ahmedGaber93

ahmedGaber93 commented Nov 7, 2024

BUG: Search results from the previous query are displayed above the results for the current query

  1. Log in with a high-traffic account
  2. Type marc in the search, then clear it (don't close the search modal, just delete the text) and type shawn

Expected: searching for shawn should display results for shawn only
Current: it displays results for both marc and shawn

screenshots

Screenshot 2024-11-06 at 4 43 01 PM

Screenshot 2024-11-06 at 4 42 12 PM


BUG: Result ordering prefers includes matches over startsWith matches

screenshots

Screenshot 2024-11-06 at 4 42 46 PM

Screenshot 2024-11-06 at 4 41 55 PM
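
To illustrate the ordering this bug report is about, here is a minimal sketch that ranks prefix matches ahead of matches elsewhere in the text. It assumes the expected behavior is "startsWith first, then includes"; `matchRank`, `compareText`, and `orderByMatch` are hypothetical helpers, not functions from the app:

```typescript
// Rank: 0 = text starts with the query, 1 = text contains it elsewhere, 2 = no match.
function matchRank(text: string, query: string): number {
    const haystack = text.toLowerCase();
    const needle = query.toLowerCase();
    if (haystack.startsWith(needle)) {
        return 0;
    }
    if (haystack.includes(needle)) {
        return 1;
    }
    return 2;
}

// Case-insensitive comparison used as a tie-breaker within the same rank.
function compareText(a: string, b: string): number {
    const left = a.toLowerCase();
    const right = b.toLowerCase();
    if (left === right) {
        return 0;
    }
    return left < right ? -1 : 1;
}

function orderByMatch(texts: string[], query: string): string[] {
    return texts
        .filter((text) => matchRank(text, query) < 2)
        .sort((a, b) => matchRank(a, query) - matchRank(b, query) || compareText(a, b));
}

// Prefix matches come first, then the remaining matches:
// orderByMatch(['Rosemarc', 'marcus', 'Shawn', 'Marc A'], 'marc')
// returns ['Marc A', 'marcus', 'Rosemarc']
```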


BUG: Duplicated result

screenshots

Screenshot 2024-11-06 at 5 34 40 PM

@hannojg

@hannojg

hannojg commented Nov 7, 2024

Thanks for testing, let me fix those!

@hannojg hannojg mentioned this pull request Nov 7, 2024
49 tasks
@hannojg

hannojg commented Nov 7, 2024

I split out a few changes here first:

@ahmedGaber93

Status of fixed issues:

  • BUG: Search results from the previous query are displayed above the results for the current query
  • BUG: Result ordering prefers includes matches over startsWith matches
  • BUG: Duplicated result

@hannojg

hannojg commented Nov 8, 2024

Thanks for testing! On the ordering bug, I looked into it and came to the conclusion that I first need to clean up the OptionListUtils a little; it's a lot of spaghetti 🍝

How it currently works is:

  1. (getOptions) first convert all reports and users into a different format (the ReportUtils.OptionsData format)
  2. (getOptions) then order all search options
  3. (filterOptions) then filter them
  4. (filterOptions) then apply a second ordering (I think we can maybe get rid of this double ordering in the cleanup)

All of this happens in about two functions and is very tightly coupled.

The reason why the ordering is broken with our SuffixTree is that right now it works like this:

  1. (getOptions) first convert all reports and users into the OptionsData format
  2. (getOptions) then order all search options
  3. BREAKING (FastSearch): filter options using our search tree. The search tree doesn't keep the order intact, so this is where it breaks
  4. (FastSearch) apply the second ordering
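
To see why a tree lookup can lose a pre-sorted order, here is a deliberately naive suffix trie. It is quadratic in memory and is not the FastSearch implementation from this PR; all names are hypothetical. Matches are gathered by walking the subtree below the query, so they come back in traversal order instead of input order:

```typescript
type TrieNode = {
    children: Map<string, TrieNode>;
    // Indices of items that have a suffix ending exactly at this node.
    endsHere: number[];
};

function createNode(): TrieNode {
    return {children: new Map<string, TrieNode>(), endsHere: []};
}

function buildSuffixTrie(items: string[]): TrieNode {
    const root = createNode();
    items.forEach((item, index) => {
        const text = item.toLowerCase();
        // Insert every suffix of the text and mark where it ends.
        for (let start = 0; start < text.length; start++) {
            let node = root;
            for (let i = start; i < text.length; i++) {
                const char = text.charAt(i);
                let child = node.children.get(char);
                if (!child) {
                    child = createNode();
                    node.children.set(char, child);
                }
                node = child;
            }
            node.endsHere.push(index);
        }
    });
    return root;
}

// Depth-first walk that collects every item index found at or below a node.
function collect(node: TrieNode, found: Set<number>): void {
    node.endsHere.forEach((index) => found.add(index));
    node.children.forEach((child) => collect(child, found));
}

function findSubstring(root: TrieNode, items: string[], query: string): string[] {
    let node = root;
    for (const char of query.toLowerCase().split('')) {
        const child = node.children.get(char);
        if (!child) {
            return [];
        }
        node = child;
    }
    const found = new Set<number>();
    collect(node, found);
    return Array.from(found).map((index) => items[index]);
}

const items = ['banana', 'an'];
const matches = findSubstring(buildSuffixTrie(items), items, 'an');
// matches is ['an', 'banana']: both items are found, but not in input order.
```

Any ordering applied before the lookup therefore has to be reapplied (or applied only) afterwards.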

My current approach is to clean up the current code so that the search works like this (before applying the changes from this PR):

  1. convert all to OptionsData
    • (unfortunately we have to do this first for multiple reasons... I will see if I can make a bigger refactor so that only the search results we currently need to display are converted, which would be better for performance)
  2. filter them
  3. then order them at once

This way the ordering should always give the same results.
[It should also be more performant to order only the limited set of filtered search results, instead of ordering everything at the start.]
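
The convert, filter, order pipeline described above could be sketched roughly as follows. The `RawItem` and `OptionData` shapes and all function bodies are simplified assumptions for illustration; the real code converts reports and users into the ReportUtils.OptionsData format:

```typescript
// Hypothetical, simplified shapes.
type RawItem = {id: string; name: string};
type OptionData = {keyForList: string; text: string};

function toOptionData(item: RawItem): OptionData {
    return {keyForList: item.id, text: item.name};
}

function filterOptions(options: OptionData[], query: string): OptionData[] {
    const needle = query.toLowerCase();
    return options.filter((option) => option.text.toLowerCase().includes(needle));
}

function compareStrings(a: string, b: string): number {
    if (a === b) {
        return 0;
    }
    return a < b ? -1 : 1;
}

function orderOptions(options: OptionData[]): OptionData[] {
    // Sort a copy; the key acts as a tie-breaker so the result is deterministic.
    return [...options].sort(
        (a, b) => compareStrings(a.text.toLowerCase(), b.text.toLowerCase()) || compareStrings(a.keyForList, b.keyForList),
    );
}

function search(items: RawItem[], query: string): OptionData[] {
    const converted = items.map(toOptionData); // 1. convert everything
    const filtered = filterOptions(converted, query); // 2. filter
    return orderOptions(filtered); // 3. order once, only what is left
}
```

Because ordering is the last step and runs only on the filtered subset, the filter implementation (for example a tree lookup) is free to return matches in any order.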

(Will keep opening a couple of cleanup PRs today)


Cleanup PRs:

@ahmedGaber93

Great! We can wait for the cleanup PRs and then retest.

};

// Certain characters appear very often in our search data (e.g. in email addresses), and we don't need to search for them.
const charSetToSkip = new Set(['@', '.', '#', '$', '%', '&', '*', '+', '-', '/', ':', ';', '<', '=', '>', '?', '_', '~', '!', ' ', ',']);

Should this be a CONST somewhere? Do we foresee this list growing?
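
For context, a skip set like this could be applied to both the indexed text and the query, which would explain why the Tests step expects an email address to still be found after its special characters are removed. A minimal sketch; `stripSkippedChars` is a hypothetical helper and not necessarily how this PR applies the set:

```typescript
// Same characters as in the snippet under review.
const charSetToSkip = new Set(['@', '.', '#', '$', '%', '&', '*', '+', '-', '/', ':', ';', '<', '=', '>', '?', '_', '~', '!', ' ', ',']);

// Hypothetical helper: drop every character that is in the skip set.
function stripSkippedChars(value: string): string {
    return value
        .split('')
        .filter((char) => !charSetToSkip.has(char))
        .join('');
}

// stripSkippedChars('jane.doe@example.com') returns 'janedoeexamplecom'
```

If the list is expected to grow, keeping it in one shared constant means both sides of the comparison stay in sync.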

@marcaaron

Is this PR on HOLD for #52250?

@ahmedGaber93

ahmedGaber93 commented Nov 19, 2024

I think #52250 is the only cleanup PR left; after that, I think @hannojg will add some changes here to fix the ordering bug.

@marcaaron

Seems we are almost ready to continue now that the linked PR has been merged?

@hannojg

hannojg commented Nov 25, 2024

Yes, will continue now. A few more cleanups / restructures are needed first, but we are getting closer!!

@hannojg

hannojg commented Nov 25, 2024

@hannojg

hannojg commented Dec 2, 2024

Here is the final cleanup PR that will provide a clear separation between filtering and ordering here:
