Normalization issues #642
Comments
Thanks @maxaalexeeva! I'll take a look at these soon.
@MihaiSurdeanu I started on these today. I will fiddle with them more tomorrow and will let you know if I run into issues.
@MihaiSurdeanu I have addressed several of the issues (#649). I will need your feedback on the unchecked items here. Thanks!
Thanks @maxaalexeeva!
Thanks!
@MihaiSurdeanu, will try! Thanks! Another thing came up: would you also look at the first issue? I have an example where very similar spans of text get different POS tags:
vs.
And because of that NNP on the last token, the rule below (the one I added the constraint to in order to solve the first issue) does not extract the complete unit (kg ha-1):
Any thoughts on what to do? Accept the incomplete unit? Or maybe you know a better solution for the first issue?
Hi Masha,
@MihaiSurdeanu, the kg part actually does get extracted even with the wrong POS; the issue seems to be the very unique token
Probably OK to ignore... This seems too specific.
in
) mistakenly labeled as measurement:
upd 1: attempted to add a constraint here with a negative lookahead (?! [entity = /B-LOC/]), but maybe entities are not available at this point in the pipeline?
upd 2: this seems to work (?![tag = /NNP|CD/]), but is there a better solution?
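To make the effect of the tag-based lookahead concrete, here is a minimal sketch in plain Python (not Odin, and not how the library evaluates rules). It assumes a token is a (word, POS tag) pair and that the tag pattern is applied as a regex search; `UNIT_WORDS`, `unit_matches`, and the sample sentences are hypothetical names and data made up for illustration.

```python
import re
from typing import List, Tuple

Token = Tuple[str, str]  # (word, POS tag)

# Hypothetical stand-in for whatever the real rule treats as a unit token.
UNIT_WORDS = {"kg", "in"}
# Same alternation as in (?![tag = /NNP|CD/]).
# Assumption: the pattern is applied as a regex search over the tag string.
BLOCKED_NEXT_TAG = re.compile(r"NNP|CD")


def unit_matches(tokens: List[Token]) -> List[int]:
    """Return indices of unit tokens that are NOT followed by an NNP or CD token."""
    hits = []
    for i, (word, _tag) in enumerate(tokens):
        if word not in UNIT_WORDS:
            continue
        # Negative lookahead: inspect the next token without consuming it.
        if i + 1 < len(tokens) and BLOCKED_NEXT_TAG.search(tokens[i + 1][1]):
            continue
        hits.append(i)
    return hits


# Invented examples: "kg" survives, "in" before a proper noun is rejected.
print(unit_matches([("10", "CD"), ("kg", "NN"), ("of", "IN"), ("seed", "NN")]))
print(unit_matches([("sown", "VBN"), ("in", "IN"), ("Dakar", "NNP")]))
```

One consequence visible in the sketch: any unit followed by a number or proper noun is dropped, so the constraint trades some recall for fewer false measurements.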
Probably similar to the three values in the next example. Maybe extract as one unit and split with an action? (Unsure if it will be easy to differentiate these from ranges, which will need to stay unsplit):
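A rough sketch of the "extract as one unit, then split" idea, in plain Python rather than as an actual Odin action. It assumes the multi-value case is an enumeration of numbers sharing one unit, and that ranges can be recognized by a few surface cues; `RANGE_CUE`, `split_values`, and the cue list are hypothetical and would need tuning against real examples.

```python
import re
from typing import List

# Hypothetical range cues; a span containing one is kept as a single mention.
RANGE_CUE = re.compile(r"\b(?:to|between|from)\b|\d\s*[-\u2013]\s*\d")
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def split_values(span: str, unit: str) -> List[str]:
    """Split an enumerated span such as '2, 3 and 4' into one value per unit.

    Spans that look like ranges are returned unsplit.
    """
    if RANGE_CUE.search(span):
        return [f"{span} {unit}"]
    return [f"{n} {unit}" for n in NUMBER.findall(span)]


print(split_values("2, 3 and 4", "kg"))       # enumeration: split
print(split_values("2-4", "kg"))              # range: kept whole
print(split_values("between 2 and 4", "kg"))  # range: kept whole
```

The weak point is the cue list: "2 and 4 kg" without "between" is treated as an enumeration here, which may or may not be the right reading.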
the
needs to be added to the vague year date rule (probably here: processors/main/src/main/resources/org/clulab/numeric/dates.yml, line 79 in 93c4dfe)