Search Filter not working as expected #612

Open
Leo310 opened this issue Jan 17, 2024 · 8 comments

Comments

@Leo310

Leo310 commented Jan 17, 2024

Describe the bug

When I execute this code:

console.log('Filtering by sources', sources);
const results = await search(db, {
    where: {
        source: sources,
    },
    limit: 10000,
});
console.log('Got results', results);

I get the following console output:
[image: console output]
As you can see, the first hit in the results has a different source attribute (value: "README.md") than the filter (value: "To Add.md").
The filter was completely ignored.

To Reproduce

  1. Initialize Orama as follows:
const recordManagerSchema = {
    id: 'string',
    source: 'string',
    updated_at: 'number',
} as const;

const db = await create({
    schema: recordManagerSchema,
    id: 'testdb',
    components: {
        tokenizer: {
            stemming: true,
            stemmerSkipProperties: ['id', 'source'],
        },
    },
});
  2. Insert some data with different sources into this db.
  3. Execute the code from the bug description above.

Expected behavior

As you stated in your docs:
[image: excerpt from the docs]
I expect to only get hits that include 'To Add.md' in their source attribute.

Environment Info

OS: macOS 14.2.1
Orama: 2.0.1
Running inside Obsidian (electron app)

Affected areas

Search

Additional context

No response

@micheleriva
Member

Hi, would you be able to provide a repro via replit/similar tool?

@Leo310
Author

Leo310 commented Jan 19, 2024

@micheleriva Here is the repro: https://replit.com/join/xnweszxeca-leodev310
I figured out that the filter doesn't match against the exact entries of sources; it matches on substrings of a single entry instead.

I think in my previous example, "README.md" was a hit because it contained the ".md" substring. That's why every record was matching. Not sure if this is the intended behavior. If it is, is there another way to filter on exact matches?
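
One way to picture this behavior is the sketch below. It assumes string fields are split into tokens and that a filter passes whenever any token overlaps. This is a simplified model, not Orama's actual tokenizer, and `tokenize` and `tokenFilterMatches` are hypothetical helpers made up for illustration.

```typescript
// Simplified model of token-based string filtering. This is NOT Orama's
// implementation; tokenize and tokenFilterMatches are hypothetical helpers.
function tokenize(value: string): string[] {
  // Lowercase and split on anything that is not a letter or digit.
  return value
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// A field value passes if it shares at least one token with the filter value.
function tokenFilterMatches(filterValue: string, fieldValue: string): boolean {
  const filterTokens = new Set(tokenize(filterValue));
  return tokenize(fieldValue).some((token) => filterTokens.has(token));
}

const sources = ["To Add.md", "README.md", "notes.txt"];
const matched = sources.filter((s) => tokenFilterMatches("To Add.md", s));
console.log(matched); // "README.md" passes because it shares the "md" token
```

Under this assumption every file ending in ".md" would pass a filter built from any other ".md" file name, which is consistent with what the repro shows.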

@Leo310
Author

Leo310 commented Jan 20, 2024

The numeric filter also doesn't seem to work as expected:
For this code:

const before = 1705758397008;
console.log('Searching for entries indexed before', before, 'in Oramadb', db.data.docs);
const results = await search(await this.db, { where: { indexed_at: { lt: before } }, limit: 100 });
console.log('Results', results);

I get this output:
[image: console output]
Here the indexed_at values of entries 21 and 22 are smaller than the value of before (1705757860535 < 1705758397008), yet I get no hits even though I am using the "lt" filter on indexed_at.
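
For comparison, the result I would expect from an "lt" filter can be written as a plain array filter. This is only a reference check in plain TypeScript, independent of Orama; the `IndexedRecord` shape, `filterLessThan`, and the second record are hypothetical, and only `indexed_at` and the two timestamps come from the output above.

```typescript
// Hypothetical record shape; only the indexed_at field comes from the issue.
interface IndexedRecord {
  id: string;
  indexed_at: number;
}

// Reference semantics for an "lt" filter: keep records strictly below the bound.
function filterLessThan(records: IndexedRecord[], bound: number): IndexedRecord[] {
  return records.filter((record) => record.indexed_at < bound);
}

const before = 1705758397008;
const records: IndexedRecord[] = [
  { id: "21", indexed_at: 1705757860535 }, // timestamp quoted in the comment above
  { id: "later", indexed_at: 1705758400000 }, // made-up value above the bound
];

const expected = filterLessThan(records, before);
console.log(expected.map((record) => record.id));
```

Comparing this list with the hits returned by search would show whether the filter or the indexed data is at fault.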

@AND-TomHarris

@Leo310 Did you find a solution to this? I am facing the same problem.

@Leo310
Author

Leo310 commented Jul 19, 2024

@AND-TomHarris Unfortunately not. I ended up using Dexie.js instead of Orama.

@fenicento

+1. Filtering on strings with spaces (e.g. "History and geography") returns hits that match only part of the query (e.g. every entry containing "and" in the filtered field).

Is there any workaround?

@ColeDCrawford

Uf. This is pretty bad.

@allevo
Collaborator

allevo commented Oct 15, 2024

Hi all!
Have you tried enum instead of string?
Internally, Orama runs string values through a tokenizer and indexes the resulting tokens. Enum values, by contrast, are indexed as they are, without any manipulation.

Let me know.
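
To illustrate what "treated as they are" means, here is a toy exact-match index. It is not Orama's implementation, and `ExactIndex` is a hypothetical class. In Orama itself this would presumably mean declaring `source: 'enum'` in the schema and filtering with something like `where: { source: { eq: 'To Add.md' } }`; treat that as an assumption to check against the docs for your version.

```typescript
// Toy model of exact-match (enum-like) indexing. Not Orama's implementation;
// ExactIndex is a hypothetical class used only for illustration.
class ExactIndex {
  private readonly idsByValue = new Map<string, Set<string>>();

  // Store the raw value as the key, with no tokenization or stemming.
  insert(docId: string, value: string): void {
    let ids = this.idsByValue.get(value);
    if (!ids) {
      ids = new Set<string>();
      this.idsByValue.set(value, ids);
    }
    ids.add(docId);
  }

  // Exact equality lookup, comparable to an "eq" style filter.
  eq(value: string): string[] {
    const ids = this.idsByValue.get(value);
    return ids ? Array.from(ids) : [];
  }
}

const index = new ExactIndex();
index.insert("1", "To Add.md");
index.insert("2", "README.md");
index.insert("3", "History and geography");

console.log(index.eq("To Add.md")); // only document 1 is stored under this key
console.log(index.eq("and")); // no document has the exact value "and"
```

Because the whole value is the key, neither the shared ".md" suffix nor the word "and" can produce a partial match here.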
