Hi Kaiyu! First, thanks again for your valuable work and for sharing the code! While reading the paper, I noticed that you predict the grammar rules all in one go with a single classification layer. This puzzled me because my understanding (at least for context-free grammars, CFGs) is that a rewrite rule is, in a sense, "Markovian": from the current symbol we derive the next symbols by applying one single rule, and crucially that choice is conditioned on the current non-terminal. However, that doesn't seem to be the case if we have a single classification layer over all rules.
Note, I don't mean that the GRU in your work doesn't take the non-terminal as input (it does); rather, I would have expected that, when we are on a non-terminal NT, the model would index only the allowed/valid rule weights for that non-terminal, which, I now think, would truly restrict the prediction to valid rules.
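Just to make the concern concrete, here is a minimal sketch (a toy grammar with hypothetical names of my own, not from your code) of the point that each non-terminal only admits a subset of the global rule list:

```python
# Hypothetical toy CFG. Rule indices refer to one global rule list,
# matching the setup of a single classification layer over all rules.
RULES = [
    ("S", ["NP", "VP"]),   # rule 0
    ("NP", ["Det", "N"]),  # rule 1
    ("NP", ["N"]),         # rule 2
    ("VP", ["V", "NP"]),   # rule 3
]

def valid_rule_indices(nonterminal):
    """Indices of rules whose left-hand side is `nonterminal`."""
    return [i for i, (lhs, _) in enumerate(RULES) if lhs == nonterminal]

print(valid_rule_indices("NP"))  # [1, 2]
```

So only rules 1 and 2 should ever be reachable from NP, regardless of what the classifier scores the other rules.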
Hi, sorry for the late reply! Yes, you're right: given a non-terminal symbol, only a subset of the production rules is valid. During testing, we always choose the rule with the highest predicted score among the valid rules. During training, we could either restrict the loss to the valid rules or compute it over all rules. We tried both and didn't observe a significant difference in the final performance.