CPython's grammar syntax is based on the older LL(1) ("pgen") parser generator, and the same syntax was retained and extended for pegen because, apparently, GvR likes it (source). So ':' is equivalent to '=' and '|' is equivalent to '/'.
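For what it's worth, here is a rough sketch of that substitution using only the standard library. The function name `pegen_to_parsimonious` is made up for illustration and is not part of either project. It assumes one rule per line with the alternatives on that same line, and it skips quoted literals so that a token like ':' inside a rule is left alone. Grammar actions in braces, multi-line rules that begin each alternative with '|', and other pegen-specific extensions are not handled.

```python
import re

# Match a quoted literal first, so any ':' or '|' inside quotes is skipped,
# otherwise match one of the two operators that need rewriting.
_TOKEN = re.compile(r"""'[^']*'|"[^"]*"|[:|]""")


def pegen_to_parsimonious(rule):
    """Rewrite one simple, single-line pegen-style rule (hypothetical helper)."""
    seen_rule_colon = False

    def swap(match):
        nonlocal seen_rule_colon
        text = match.group(0)
        if text == ":" and not seen_rule_colon:
            # The first unquoted ':' separates the rule name from its body.
            seen_rule_colon = True
            return " ="
        if text == "|":
            # '|' separates alternatives; the other notation writes '/'.
            return "/"
        # Quoted literals (and any later ':') pass through unchanged.
        return text

    return _TOKEN.sub(swap, rule)


converted = pegen_to_parsimonious("slice: expr ':' expr | expr")
print(converted)
```

This only covers the two surface differences from the question; whether the converted rules behave as intended once loaded is a separate matter.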
More interesting is that pegen is not a scannerless PEG parser (e.g., note that NAME is not defined by the grammar). It must first tokenize the input, then it uses the PEG rules to parse the tokens. See https://docs.python.org/3/library/token.html for the valid tokens. If you want to parse Python character by character, you'll need to write rules for those tokens as well.
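To get a feel for what the grammar actually sees, you can dump the token stream with the standard-library `tokenize` module. This is only an illustration: the helper name `token_names` is invented here, and the `tokenize` module may differ in details from the tokenizer the parser uses internally.

```python
import io
import token
import tokenize


def token_names(source):
    """Return (token type name, token text) pairs for a source string."""
    readline = io.StringIO(source).readline
    return [
        (token.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
    ]


for name, text in token_names("x = 1\n"):
    print(name, repr(text))
```

Names such as NAME and NUMBER show up here as token types, which is why the grammar can refer to them without defining them.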
Python has a PEG grammar here:
https://github.com/python/cpython/blob/master/Grammar/python.gram
That grammar uses a slightly different format. I'm looking to parse it using parsimonious. My script massages the grammar above to something close to what this module expects. But two issues remain:
CPython uses ':' for rules and you seem to use '='
CPython uses '|' for alternatives and you seem to use '/'
Has anyone looked into reconciling these two and using the package to parse Python code itself?