forked from abiggerhammer/hammer
-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
h_cfgrammar_() leaks memory #182
Comments
I believe, tentatively, that the underlying problem here is that |
Andrea Shepard on Tue, Dec 06 2016:
While valgrinding unit tests and plugging leaks, I found that
`/core/grammar/end` leaks even if I add `h_cfgrammar_free(g)` to
`test_end()`. The cause is the calls to `h_new()` (rather than in the
grammar's arena) in `if (h_hashset_empty(g->nts))` branch of
`h_cfgrammar_()`. Unfortunately, these cannot simply be changed over
to `h_arena_malloc()` calls because the LL(k) backend frees the
grammar once it finishes compiling the parser, but apparently keeps
some of these pointers around and uses them during parsing. Further
investigation proceeding.
I've had a quick look and can shed some light:
The real issue is `h_desugar`, which uses `h_new`. The case in
`h_cfgrammar_` that you mention is just a special case where `h_desugar`
doesn't return the form we want. These objects (`HCFChoice`) are kept
because they represent the nonterminals[*]. To free this memory
properly, we should consider it a transfer of ownership from the grammar
to the table (backend data).
Longer version:
To generate the grammar, the parser is first converted to a
sum-of-products representation by `h_desugar`. The type `HCFChoice`
with tag HCF_CHOICE is the sum and `HCFSequence` is the product. The
base terms are represented by `HCFChoice` with the other tags
(`HCF_CHAR`, `HCF_CHARSET`, etc.).
The case you're looking at in `h_cfgrammar_` is the special case where
`h_desugar` returns a base term and we need to wrap that in a singleton
sequence and a singleton choice to give it the form expected of the
start symbol (nonterminal).
The function `h_desugar` is using `h_new` in the same style the
combinators use it: allocate once, free never. The implicit assumption
being that parsers are constructed and compiled once at the beginning of
the program and deallocated by termination of the program. Don't look at
me, I just got here.
I intended the `h_cfgrammar` routine to be potentially useful
stand-alone, and I don't think we want to burden a user of that function
with the detail of (the awkwardly-named) "desugaring", so I think it's
right to create these objects there. But when the `HCFGrammar` is
created only transiently as part of compiling a table, it can't free the
`HCFChoice`s with it, as discussed.
One simple solution would be to explicitly transfer ownership of the
symbols to the table and only free them in `h_cfgrammar_free` if we
still own them. Or instead of explicitly tracking the ownership,
distinguish between a full and a "half" free of the grammar; let
`h_cfgrammar_free` free everything and give the backend another function
to free only the grammar during compilation and the symbols later, in
its own destructor.
…-pesco
[*]: They contain, for instance, the function pointers for semantic
actions, predicates, and the "reshape" functions.
|
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
While valgrinding unit tests and plugging leaks, I found that
/core/grammar/end
leaks even if I addh_cfgrammar_free(g)
totest_end()
. The cause is the calls toh_new()
(rather than in the grammar's arena) inif (h_hashset_empty(g->nts))
branch ofh_cfgrammar_()
. Unfortunately, these cannot simply be changed over toh_arena_malloc()
calls because the LL(k) backend frees the grammar once it finishes compiling the parser, but apparently keeps some of these pointers around and uses them during parsing. Further investigation proceeding.The text was updated successfully, but these errors were encountered: