Lexer post processing
CSLY lets you rework the lexer's output token stream to add, modify, or remove tokens.
A basic example is an expression parser that should accept 2 x and parse it as 2 * x.
There are two ways to do so:
- Tweak the parser, but you will have a hard time managing associativity and precedence.
- Insert implicit * tokens just after the lexing phase, so that the token stream looks like INT(2) TIMES IDENTIFIER(x) instead of INT(2) IDENTIFIER(x), and the parser does not have to handle the missing TIMES token.
A lexer post processor is simply a delegate:
List<Token> LexerPostProcess(List<Token> tokens)
The delegate takes the raw token stream produced by the lexer and returns a modified token stream.
The lexer post processor for the expression parser is defined in PostProcessedLexerParserBuilder.cs (b3b00/csly repository on GitHub, dev branch).
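As a rough illustration of the idea, the sketch below inserts an implicit TIMES token between two adjacent operand tokens. It is self-contained and does not use CSLY: SketchToken, the reduced FormulaToken enum, and the ImplicitTimes class are stand-ins assumed for this example. The choice of which token kinds may sit on each side of an implicit product is also an assumption. A real post processor would work on CSLY's own token class and the parser's actual token enum.

```csharp
using System.Collections.Generic;

// Stand-in token kinds for illustration only (hypothetical, reduced set).
public enum FormulaToken { INT, IDENTIFIER, TIMES, PLUS, LPAREN, RPAREN }

// Stand-in token type for illustration only; CSLY supplies its own token class.
public record SketchToken(FormulaToken Id, string Value);

public static class ImplicitTimes
{
    // Token kinds that may end the left operand of an implicit product (assumption).
    private static readonly HashSet<FormulaToken> MayLeft =
        new() { FormulaToken.INT, FormulaToken.IDENTIFIER };

    // Token kinds that may start the right operand of an implicit product (assumption).
    private static readonly HashSet<FormulaToken> MayRight =
        new() { FormulaToken.INT, FormulaToken.IDENTIFIER, FormulaToken.LPAREN };

    // Takes the raw token list and returns a new list with TIMES inserted
    // wherever two operand tokens are directly adjacent.
    public static List<SketchToken> PostProcess(List<SketchToken> tokens)
    {
        var result = new List<SketchToken>(tokens.Count);
        for (int i = 0; i < tokens.Count; i++)
        {
            bool needsTimes = i > 0
                && MayLeft.Contains(tokens[i - 1].Id)
                && MayRight.Contains(tokens[i].Id);
            if (needsTimes)
            {
                result.Add(new SketchToken(FormulaToken.TIMES, "*"));
            }
            result.Add(tokens[i]);
        }
        return result;
    }
}
```

Calling ImplicitTimes.PostProcess on the two tokens INT(2) IDENTIFIER(x) yields three tokens, INT(2) TIMES IDENTIFIER(x), while a stream such as INT(2) PLUS IDENTIFIER(x) comes back unchanged. Note that with these sets an identifier followed by an opening parenthesis is also treated as a product, which would need revisiting in a grammar that has function calls.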
A lexer post processor can be supplied when building either a lexer or a parser:
- parser :
builder.BuildParser(parserInstance, ParserType.EBNF_LL_RECURSIVE_DESCENT, $"{nameof(FormulaParser)}_expressions",
lexerPostProcess: postProcessFormula);
- lexer :
LexerBuilder.BuildLexer<FormulaToken>(lexerPostProcess: postProcessFormula);