Add option to change error constructor #444

Open
mysteriouslyseeing opened this issue Dec 2, 2024 · 0 comments · May be fixed by #445
Labels
enhancement New feature or request

Comments

@mysteriouslyseeing
Following up on the discussion in #440:
Currently, a token's Error type is constructed with Default::default(). That works fine as a default, but it falls short if you want the error to reflect the Lexer's current span, for example to tell users of the lexer exactly where an error occurred. It's not impossible: one workaround today is to attach a catch-all regex with a callback to an arbitrary token variant, like the following:

use logos::Logos;

#[derive(Logos)]
#[logos(error = String)]
enum Token {
    #[token("a", priority = 1)]
    A,
    #[token("b", priority = 1)]
    #[regex(".", callback = |lex| {
        Err::<(), String>(format!("Syntax error at {:?}: unrecognised character '{}'", lex.span(), lex.slice()))
    })]
    B,
}

Note that you have to add priority = 1 to both A and B, because "." also matches "a" and "b". You also have to spell out the callback's return type (Err::<(), String>), because Rust cannot infer the Ok type of the Result on its own.

A cleaner solution would be to let users provide a custom constructor for Error, used in place of Default::default(), through an attribute like #[logos(error_callback = ...)] or something similar. With that, the previous example would look like this:

use logos::Logos;

#[derive(Logos)]
#[logos(error = String)]
#[logos(error_callback = |lex| {
    format!("Syntax error at {:?}: unrecognised character '{}'", lex.span(), lex.slice())
})]
enum Token {
    #[token("a")]
    A,
    #[token("b")]
    B,
}
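
To make the intended semantics concrete, here is a minimal std-only sketch of the idea, not Logos code. MiniLexer, next_with and lex_all are hypothetical names invented for illustration; only span() and slice() are modelled on the Lexer methods used above. The point it tries to show is that the error constructor receives the lexer state at the moment of failure, which Default::default() cannot see.

```rust
use std::ops::Range;

#[derive(Debug, PartialEq)]
pub enum Token {
    A,
    B,
}

/// Hypothetical stand-in for a lexer (NOT the logos `Lexer` type).
pub struct MiniLexer<'s> {
    source: &'s str,
    start: usize,
    end: usize,
}

impl<'s> MiniLexer<'s> {
    pub fn new(source: &'s str) -> Self {
        Self { source, start: 0, end: 0 }
    }

    /// Byte range of the most recently consumed character.
    pub fn span(&self) -> Range<usize> {
        self.start..self.end
    }

    /// Text of the most recently consumed character.
    pub fn slice(&self) -> &'s str {
        &self.source[self.start..self.end]
    }

    /// Consumes one character. On unrecognised input, the error is built
    /// by `make_error`, which can inspect the lexer (span, slice).
    pub fn next_with<E>(
        &mut self,
        make_error: impl Fn(&Self) -> E,
    ) -> Option<Result<Token, E>> {
        let ch = self.source[self.end..].chars().next()?;
        self.start = self.end;
        self.end += ch.len_utf8();
        Some(match ch {
            'a' => Ok(Token::A),
            'b' => Ok(Token::B),
            _ => Err(make_error(&*self)),
        })
    }
}

/// Lexes the whole input, using `make_error` as the error constructor.
pub fn lex_all<E>(
    source: &str,
    make_error: impl Fn(&MiniLexer<'_>) -> E,
) -> Vec<Result<Token, E>> {
    let mut lexer = MiniLexer::new(source);
    let mut out = Vec::new();
    while let Some(item) = lexer.next_with(&make_error) {
        out.push(item);
    }
    out
}
```

Passing |_| String::default() as make_error reproduces the current behaviour (an empty error with no location), while passing |lex| format!("Syntax error at {:?}: unrecognised character '{}'", lex.span(), lex.slice()) yields the span-aware message from the examples above, without any catch-all variant or priority tweaks.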
@mysteriouslyseeing mysteriouslyseeing linked a pull request Dec 2, 2024 that will close this issue
@jeertmans jeertmans added the enhancement New feature or request label Dec 2, 2024

2 participants