Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backslashes seemingly always get stripped from keyword input. #28

Open
creesch opened this issue Apr 19, 2019 · 5 comments
Open

Backslashes seemingly always get stripped from keyword input. #28

creesch opened this issue Apr 19, 2019 · 5 comments

Comments

@creesch
Copy link

creesch commented Apr 19, 2019

I was exploring using this on an input field where users can also input regular expressions which I wanted to do through use of a keyword re:. I then realized that it seemingly is impossible to use a backslash in keyword input. A single backslash does get removed, escaping it removes both and the same goes for triple escaping.

From a curious glance at the code it seems that the functionality that is supposed to preserve escapes actually removes them in this case.

@nepsilon
Copy link
Owner

Having a re: is a great idea. I'd accept a PR adding such a feature. Backslash removal can be changed as long as we keep passing the existing unit tests.

@creesch
Copy link
Author

creesch commented Apr 19, 2019

This is a bug report not a proposal for change or a feature request.

There is nothing in the documentation about completely stripping backslashes and I simply wanted to use the excisting configuration options to add re as a keyword like so

const options =  {
    tokenize: true, 
    offsets: false, 
    keywords: ['re'], 
}

searchQuery('re:te\st  re:te\\st  re:"te\\st"  re:te\\\st something  else also \\escaped  \\\more \\\evenMore', options)

According to your documentation and code comments here and here this should as far as I can tell result in the first backslash being removed but the second being maintained as it is escaped. Just to be complete I added a few other cases with extra backslashes to see what happens.

If I run this I get the following output.

{
    "text": [
        "something",
        "else",
        "also",
        "escaped",
        "more",
        "evenMore"
    ],
    "re": [
        "test",
        "test",
        "test",
        "test"
    ],
    "exclude": {}
}

Where there are no backslashes left.

as long as we keep passing the existing unit tests.

It seems to me that your unit tests as they are now are not complete in coverage.

@creesch
Copy link
Author

creesch commented Apr 19, 2019

https://github.com/nepsilon/search-query-parser/blob/master/test/test.js#L35

Looks like to me that this test doesn't actually tests respecting escaping escape characters which your actually does attempt to do.

@nepsilon
Copy link
Owner

Got it. Thanks for the detailed explanation. So to confirm, the output of:

const options =  {
    tokenize: true, 
    offsets: false, 
    keywords: ['re'], 
}

searchQuery('re:te\st  re:te\\st  re:"te\\st"  re:te\\\st something  else also \\escaped  \\\more \\\evenMore', options)

Should be:

{
    "text": [
        "something",
        "else",
        "also",
        "escaped",
        "more",
        "evenMore"
    ],
    "re": [
        "test",
        "te\st",
        "te\st",
        "te\\st"
    ],
    "exclude": {}
}

Is that correct? I'm trying to come up with a few test cases to cover this scenario.

@creesch
Copy link
Author

creesch commented Apr 23, 2019

I think so yeah, though looking at your current code I also think it is supposed to keep some backslashes in the text array.

See this line and see this line.

Again, I am just reporting this based on your code and that code seems to suggest that the expectation is that double backslashes should be respected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants