Rule only works when commenting out unrelated rules? #80
The trace for when line 33 is commented out shows not just applying rule 3 (line 27), but applying output rule 3 (line 27)
Note also if I just don't include the last word, the rule hits fine.
So the lookahead is trying to figure out whether to keep branches alive in case more rules might apply. You have So the solution is probably for the lookahead to get smarter and for the last rule to change from The tricky part of this is whether I can fully do that without implementing FST subtraction in lttoolbox (or maybe I should just go ahead and do that...).
So if I understand correctly it's starting an analysis of Also, I can't change the last rule to My current workaround is to have a higher-level rewrite rule
IRC:
Is there a way to give some info in the trace when this applies? It's quite hard to debug when it happens. E.g. I have rules that do
and they work fine and then I add vcmp into the N rule so I can do
and it works fine, but then I notice the first rule stops working in certain contexts :( Turns out, if there's any verb in the rest of the sentence (doesn't have to be tagged
Information about what parses are getting discarded and why can be gotten from the
We're seeing this issue again in sme-smj, e.g. we have rules for Would it be possible to do a final pass after everything is done and just treat all the unmatched lexical units in isolation, so they're at least matched by some single-word rule?
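To make the "final pass" suggestion concrete, here is a minimal sketch of the idea in Python. It is not how apertium-recursive stores or processes lexical units: the dictionary shape, the `matched` flag, the `final_pass` function, and the rule table keyed by part-of-speech tag are all hypothetical, chosen only to illustrate the proposal.

```python
def final_pass(units, single_word_rules):
    """Apply a single-word rule to every lexical unit no earlier rule matched.

    units: list of (lexical_unit, matched) pairs (hypothetical shape).
    single_word_rules: dict mapping a POS tag to a function on one unit.
    """
    out = []
    for lu, matched in units:
        # Only units that were left unmatched are looked up in isolation.
        rule = None if matched else single_word_rules.get(lu["pos"])
        out.append(rule(lu) if rule else lu)
    return out


# Hypothetical data: the noun was left unmatched, the verb was not.
units = [
    ({"lemma": "dog", "pos": "n"}, False),
    ({"lemma": "run", "pos": "vblex"}, True),
]
rules = {"n": lambda lu: {**lu, "chunk": "NP"}}
result = final_pass(units, rules)
```

In this sketch the unmatched noun picks up the single-word rule, while the already-matched verb is passed through untouched.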
With sme-smj.rtx.zip:
– isn't this plain wrong? Or am I misunderstanding what "partial parses" means? (In 3, all words have at least one parent, while in branch 4 (which is chosen), the first word has no parent node.) EDIT: It seems the test is
so they're just equal.
Yeah, I think it's
So I noticed that simply changing the file to have weights on each rule made it choose the parse that has more parses, and when doing that across a real rule file for sme-smj, it removes some untranslated words from corpus runs. Is there a good reason not to have some "initial" weight for every rule, so it can favour parses that cover more words? (Will it then favour deeper trees as well?)
Yes, it will slightly favor deeper trees, but given how reduce-reduce conflicts are handled, those are favored already. Perhaps we could add another file-level directive to change the default weight to something positive, since that will indeed improve the situation in many cases.
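The effect of a positive default weight can be illustrated with a toy model in Python. This is only a sketch under assumptions, not apertium-recursive's actual scoring code: it assumes a branch's score is the sum of the weights of the rules it applied, and the `score` and `pick` helpers and the two example branches are hypothetical.

```python
def score(applied_rules, default_weight=0.0):
    """Sum rule weights; None means the rule has no explicit weight."""
    return sum(default_weight if w is None else w for w in applied_rules)


def pick(branches, default_weight):
    """Return the name of the best-scoring branch (first one wins a tie)."""
    return max(branches, key=lambda name: score(branches[name], default_weight))


# Hypothetical branches: one entry per applied rule, all unweighted.
branches = {
    "branch_4": [None, None],        # leaves the first word without a parent
    "branch_3": [None, None, None],  # every word is covered by some rule
}

tie_winner = pick(branches, default_weight=0.0)       # scores are equal
weighted_winner = pick(branches, default_weight=1.0)  # more rules, higher score
```

With a default of zero both branches score the same, so the choice comes down to tie-breaking. With any positive default the branch that applies more rules wins, which is also why deeper trees would be slightly favoured under this model.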
mitigates #80. We splice in the outputQueueReparsed instead of just replacing, in case the output rule changes the number of LUs output.
got:
expected:
HOWEVER: If I comment out either line 23 or line 33 (the ones marked !!!) then it strangely works. But the trace shows that those lines are not used (this is without commenting them out, where I get the bad result):
I'm probably missing something obvious but I can't see it?