Can't get full parse tree without consolidations #180

JoelMahon · 2021-04-02T01:13:13Z

I'd like to parse with a flag (or something of that nature) that results in a FULL parse tree with no consolidations.

By consolidation I mean what occurs in this example:

default_rule = foo
foo = bar
bar = "fizz"

If you parse the string "fizz" with a grammar formed from this PEG, your node tree will not contain a single foo or default_rule node, as far as I can tell.

The output is this (if I got it right):

<Node matching "fizz">
    <Node called "bar" matching "fizz">

Other nodes may be missing as well. I'm less desperate to access those, but I think a flag should cover them too, either a separate one or the flag mentioned above.

foo could carry important semantic meaning that is lost. If a visitor defines visit_foo, it will never be called. That is the case in my program, where I want to highlight every foo in a certain colour but not bars, except indirectly when they are inside a foo.
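To make the highlighting use case concrete, here is a minimal sketch of what it would look like if the tree did keep one node per rule. It relies only on the `expr_name`, `start`, `end` and `children` attributes that parsimonious nodes expose; the function name `spans_for_rule` is made up for illustration and is not part of the library.

```python
def spans_for_rule(node, rule_name):
    """Return (start, end) offsets of every node produced by `rule_name`.

    `node` is anything shaped like a parsimonious Node: it must have
    `expr_name`, `start`, `end` and `children`. (Hypothetical helper.)
    """
    spans = []
    stack = [node]
    while stack:
        current = stack.pop()
        if current.expr_name == rule_name:
            spans.append((current.start, current.end))
            # Anything nested inside is already covered by this span.
            continue
        # Push children reversed so they are visited left to right.
        stack.extend(reversed(current.children))
    return spans
```

Called as `spans_for_rule(tree, "foo")`, this can only return spans when the tree actually contains nodes called "foo", which is the reason the missing nodes matter here.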

I tried to find where the code does this consolidation, but the closest I could find was NodeVisitor.lift_child. Overriding it seemed to have no effect, and I couldn't see it being used anywhere.

A workaround is this:

default_rule = foo ""
foo = bar ""
bar = "fizz"

Parsing "fizz", we get:

<Node matching "fizz">
    <Node called "default_rule" matching "fizz">
        <Node called "foo" matching "fizz">
            <Node called "bar" matching "fizz">
            <Node matching "">
        <Node matching "">
        <Node matching "">

I get the nodes I want, but unfortunately get some useless ones as well.
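One way to live with the workaround is to skip the padding nodes while walking the tree. The sketch below assumes a padding node is any unnamed node that matched the empty string, which also covers unnamed optional or repeated parts that matched nothing. `significant_nodes` is a hypothetical helper, not a parsimonious API, and it only reads the `expr_name`, `start`, `end` and `children` attributes of each node.

```python
def significant_nodes(node):
    """Yield nodes depth-first, skipping unnamed nodes that matched nothing.

    Hypothetical helper: `node` is anything shaped like a parsimonious Node.
    """
    is_padding = not node.expr_name and node.start == node.end
    if is_padding:
        # An empty, unnamed match has nothing useful below it.
        return
    yield node
    for child in node.children:
        yield from significant_nodes(child)
```

Iterating `significant_nodes(tree)` over the tree above would visit the root, default_rule, foo and bar nodes and pass over the three empty ones.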


createyourpersonalaccount · 2021-05-09T03:12:43Z

The docstring of Grammar mentions:

parsimonious/parsimonious/grammar.py

Lines 44 to 46 in 3da7e80

    
    * It does all kinds of whizzy space- and time-saving optimizations, like
      factoring up repeated subexpressions into a single object, which should
      increase cache hit ratio. [Is this implemented yet?]

I can't spot the exact place where that optimization takes place either. As noted in the docstring,

parsimonious/parsimonious/grammar.py

Lines 38 to 40 in 3da7e80

    
    You could also just construct a bunch of ``Expression`` objects yourself
    and stitch them together into a language, but using a ``Grammar`` has some
    important advantages:

which means you could build the parser from Expression objects yourself and solve this issue that way. However, there's a little hack to get exactly what you want with no work:

>>> g = Grammar(
... r"""
... foo = bar / tag_this
... bar = "fizz"
... tag_this = !"" ""   # Never matches, useful for ensuring rule shows up in tree
... """
... )
>>> print(g.parse("fizz"))
<Node called "foo" matching "fizz">
    <Node called "bar" matching "fizz">
