-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
yield increase extraction issue #157
Comments
@IkeKobby try to make a dependency (syntax-based) rule with increase trigger with a new label eg YieldIncrease, variable:Yield and value:Quantity. If doesn't work, we can try it together in our Monday meeting. |
Alright sure. |
I think this issue can be closed now @maxaalexeeva right? |
@IkeKobby let's keep it open for a bit while I try to debug the issue with losing a yield amount and until we have this as a pr. |
@MihaiSurdeanu @kwalcock a quick question: other than habitus actions, is there anything in the extraction system that could filter out mentions (e.g., based on overlap of two mentions)? The issue in a nutshell: I modify a rule a bit (adding a negative lookbehind), and one mention in a sentence with two mentions stops showing up. When it's a chunk of text without the other mention, the extraction is extracted correctly. |
Alright sure. |
Not that I recall. But I think we added such an action a while ago (that removes overlapping mentions) to habitus. Do you see it? |
This code sounds suspicious: /** Keeps a belief mention only if it is not contained in another */
def keepLongestMentions(mentions: Seq[Mention]): Seq[Mention] = {
|
@MihaiSurdeanu @kwalcock thanks! I tried to disable those but the test in question still fails (described below)---although the extraction seems fine in the shell. Here's some more details on the issue: In this rule, when I add the negative lookbehind at the beginning, this test fails (the The negative lookbehind there is to make sure this test passes - there should only be a yield increase mention there, not yield amount. |
I can trace through the code and try to find out what is happening, but probably not until this afternoon. Thanks for the detailed links to specific lines of code in specific commits. It's greatly appreciated. |
@kwalcock thank you! It might be clear from the commits, but this is the name of the branch |
I haven't been able to find the problem yet. The sentence is
With the simplified rule
two mentions are found.
just the first is found.
the second is found. When it can start with either,
only the first is found again instead of both. |
@kwalcock hm... I experimented with adding things before yield (neg look behind, token with lemma not equal increase/decrease, and just any token) and all of those eliminated the second extraction. I thought it had to be something to do with some overlap I create by adding another token to the mention, but a) I can't find where it matters and b) negative look behind doesn't really add a token .. |
I haven't been able to find the problem, either. I don't think it's anything in the actions. I've done things like replace most words with "fill" and change the sentence length. The lookbehind at the beginning seems to work if that |
@kwalcock, ah, those ( |
@marcovzla, with all your knowledge of the inner workings of Odin, do you have a guess as to why this might not be working as intended? I edited the test file (TestVariableReaderLabelBased.scala) in the branch kwalcock/debugRule so that only the failing test runs, in case that helps. I'm about desperate enough to try to hot wire org.clulab.odin.impl.Result (or something similar) to include a record of how matches are found so that it can be inspected afterwards. I have a feeling you can point out something to which we'll have to say, "Well, duh, that should have been obvious." |
yield increase being extracted as a total yield currently in habitus; eg
..... yield increased for about 1.5 t ha-1 in the ... season after the application of XYZ fertilizer
. How do we best handle this yield increase from being extracted as total yield from sentences? @maxaalexeevaThe text was updated successfully, but these errors were encountered: