Olivier Duhart edited this page Jan 2, 2020 · 37 revisions

preliminary

An EBNF parser is an extension of a BNF parser, so for a better understanding please read the BNF parser page first.

EBNF notation

repeater modifiers

You can use the following EBNF repetition modifiers:

  • '*' repeats the same terminal or non-terminal zero or more times
  • '+' repeats the same terminal or non-terminal one or more times

For repeated elements, the values passed to [Production] methods are:

  • List<TOut> for a repeated non-terminal
  • List<Token<TIn>> for a repeated terminal

       [Production("listElements: value additionalValue*")]
       public JSon listElements(JSon head, List<JSon> tail)
       {
           JList values = new JList(head);
           values.AddRange(tail);
           return values;
       }

See EBNFJsonParser.cs for a complete EBNF json parser.
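
The snippet above shows a repeated non-terminal. For a repeated terminal, the visitor receives a List<Token<TIn>> instead. The following is a minimal sketch, not taken from the CSLY sources: the IdToken enum, the IdListParser class and the ids rule are hypothetical names introduced for illustration, and it assumes the sly NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using sly.lexer;
using sly.parser.generator;

// Hypothetical token type, for illustration only.
public enum IdToken
{
    [Lexeme(GenericToken.Identifier)]
    ID = 1
}

// Hypothetical parser class, for illustration only.
public class IdListParser
{
    // ID+ matches one or more ID tokens; they arrive as a list of tokens.
    [Production("ids: ID+")]
    public string Ids(List<Token<IdToken>> ids)
    {
        return string.Join(",", ids.Select(token => token.Value));
    }
}

public static class Program
{
    public static void Main()
    {
        var builder = new ParserBuilder<IdToken, string>();
        // Rules using *, + or ? need the EBNF parser type.
        var build = builder.BuildParser(new IdListParser(), ParserType.EBNF_LL_RECURSIVE_DESCENT, "ids");
        if (build.IsError)
        {
            Console.WriteLine("the parser could not be built");
            return;
        }
        var result = build.Result.Parse("a b c");
        Console.WriteLine(result.IsError ? "parse error" : result.Result);
    }
}
```

The sketch builds the parser with the EBNF parser type because the rule uses the + modifier, which plain BNF rules do not support.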

option modifier

The '?' modifier marks a token or non-terminal as optional.

  • For a token, the Token<TIn> parameter has an IsEmpty property that is set to true when the matching token is absent.
  • For a non-terminal, the visitor method gets a ValueOption<TOut> instead of a TOut. The parameter can then be tested for emptiness with its IsNone property.

// optional token

   [Production("block: A B? C")]
   public AST listElements(Token<TIn> a, Token<TIn> b, Token<TIn> c)
   {
       if (b.IsEmpty) {
           // B is absent: do something useful
       }
       else {
           string bValue = b.Value;
           // B is present: do something else, still useful
       }
       // ... build and return the resulting AST node
   }

// optional non terminal

   [Production("root2 : a B? c ")]
   public string root2(Token<OptionTestToken> a, ValueOption<string> b, Token<OptionTestToken> c)
   {
       StringBuilder r = new StringBuilder();
       r.Append($"R(");
       r.Append(a.Value);
       r.Append(b.Match(v => $",{v}", () => ",<none>"));
       r.Append($",{c.Value}");
       r.Append($")");
       return r.ToString();
   }

groups / sub-rules

You can define groups (also known as sub-rules) in a production rule. A group is a sequence of terminals or non-terminals. Modifiers are not allowed within a group, except the discard modifier (see below), which is allowed on terminals. The matching method parameter for a group is a Group<TIn,TOut>, which is a list of Token<TIn> or TOut values listed in the same order as their corresponding clauses. Group<TIn,TOut> exposes methods that ease access to these values.

Groups can be repeated using the * or + modifier. In this case the value passed to the method is a List<Group<TIn,TOut>>.

Groups can also be made optional using the ? modifier. The value passed to the method is then a ValueOption<Group<TIn,TOut>>.

       [Production("listElements: value (COMMA [d] value)* ")]
       public JSon listElements(JSon head, List<Group<JsonToken,JSon>> tail)
       {
           JList values = new JList(head);
           values.AddRange(tail.Select((Group<JsonToken,JSon> group) => group.Value(0)).ToList<JSon>());
           return values;
       }

       [Production("rootOption : A ( SEMICOLON [d] A )? ")]
       public string rootOption(Token<GroupTestToken> a, ValueOption<Group<GroupTestToken, string>> option)
       {
           StringBuilder builder = new StringBuilder();
           builder.Append("R(");
           builder.Append(a.Value);
           var gg = option.Match(
               (Group<GroupTestToken, string> group) =>
               {
                   var aToken = group.Token(0).Value;
                   builder.Append($";{aToken}");
                   return group;
               },
               () =>
               {
                   builder.Append(";");
                   builder.Append("a");
                   var g = new Group<GroupTestToken, string>();
                   g.Add("<none>", "<none>");
                   return g;
               });
           builder.Append(")");
           return builder.ToString();
       }

alternate choice

In some cases you do not want to write many production rules that differ only by a single terminal or non-terminal clause. For these cases you can use the | operator. Alternate choices are grouped together between square brackets [ ... ], and a pipe | separates the choices:

public class AlternateChoiceTestTerminal
{
    [Production("choice : [ a | b | c]")]
    public string Choice(Token<OptionTestToken> c)
    {
        return c.Value;
    }
}

⚠️ Warning! A choice group can contain either terminals or non-terminals, but the two cannot be mixed.

The ?, + and * modifiers are allowed on a choice group:

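public class AlternateChoiceTestTerminal
{
    [Production("choice : [ a | b | c]*")]
    public string Choice(List<Token<OptionTestToken>> list)
    {
        // a repeated choice yields a list of tokens, one per match (needs System.Linq)
        return string.Join(",", list.Select(token => token.Value));
    }
}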
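
The choice examples above only use terminals. As a hedged sketch of the other case, the code below assumes that a choice between non-terminals hands the visitor a plain TOut value (here a string), by analogy with how a single non-terminal clause is passed. The LetterToken enum, the class name and the rule names are hypothetical and are not taken from the CSLY sources.

```csharp
using sly.lexer;
using sly.parser.generator;

// Hypothetical token type, for illustration only.
public enum LetterToken
{
    [Lexeme("a")] a = 1,
    [Lexeme("b")] b = 2
}

// Hypothetical parser class, for illustration only.
public class AlternateChoiceNonTerminalSketch
{
    // Choice between two non-terminals.
    // Assumption: the visitor receives the TOut value of whichever alternative matched.
    [Production("choice : [ first | second ]")]
    public string Choice(string value)
    {
        return value;
    }

    [Production("first : a")]
    public string First(Token<LetterToken> token)
    {
        return $"first({token.Value})";
    }

    [Production("second : b")]
    public string Second(Token<LetterToken> token)
    {
        return $"second({token.Value})";
    }
}
```

All alternatives in the group are non-terminals, which respects the rule that terminals and non-terminals cannot be mixed in a single choice.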

ignoring syntax sugar tokens

Sometimes tokens do not carry any semantic value. Their only purpose is to denote syntactic structure.

For example, in C-like languages, brackets ('{') only denote the beginning of blocks and do not add any other information. Their only use is to guide the syntax parser. CSLY therefore provides a way to drop these tokens from the visitor method parameters.

The [d] (d for discard) modifier marks a token as ignored. The [d] modifier only makes sense when applied to a token; if applied to a non-terminal, it is simply ignored.

Here is an example for a C block statement:

        [Production("block: LBRACKET [d] statement* RBRACKET [d]")]
        public AST listElements( List<AST> statements)
        {
            // any useful code
        }

under the hood, meta consideration on EBNF parsers

The EBNF notation has been implemented in CSLY using the BNF notation: the EBNF parser builder is itself built with the BNF parser builder. Incidentally, this makes the EBNF parser builder a good and complete example of a BNF parser: see RuleParser.cs.

The full grammar for an EBNF rule is given in EBNF rules grammar.