HDT contains a large number of trees that my scripts have identified as having a greater degree of non-projectivity than is typical in UD. After reviewing the data, I have some questions about how this treebank annotates coordination.
Here is a typical example, sentence hdt-s186616:
Die Taktrate soll von 33 auf 133 MHz steigen , die Busbreite von 32 auf 64 Bit .
The clock rate will increase from 33 to 133 MHz and the bus width from 32 to 64 bits.
[English translation via Google Translate]
The annotated arcs track the parallel structure of the construction, Taktrate:Busbreite :: 33:32 :: MHz:Bit.
This naturally creates highly non-projective trees, and the more parallel elements there are, the more non-projective the trees become. Elsewhere in UD, I believe this is avoided by using orphan relations and/or null elements. Do these trees in HDT therefore diverge from the UD guidelines, or am I misunderstanding something?
Disclaimer: I don't speak German.
The following dev/test sentences have been flagged by my scripts:
hdt-s108400 (in the dev set)
hdt-s117179 (in the test set)
The following training sentences have been flagged by my scripts:
hdt-s48421
hdt-s49251
hdt-s58870
hdt-s78636
hdt-s125751
hdt-s131804
hdt-s146911
hdt-s150634
hdt-s156335
hdt-s165292
hdt-s167495
hdt-s176414
hdt-s178065
hdt-s178067
hdt-s178068
hdt-s180686
hdt-s185183
hdt-s186616
hdt-s189013
hdt-s197213
hdt-s200957
My scripts only detect extreme non-projectivity, so this is likely not an exhaustive list.
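For anyone who wants to check other sentences, here is a minimal sketch of one common way to find non-projective arcs. It is not the script that produced the lists above. It assumes each sentence is given as a list of head indices in CoNLL-U style (1-based token ids, 0 for the root), and the function names and the toy tree are hypothetical, not taken from HDT. An arc counts as non-projective if some token lying between the head and the dependent is not dominated by the head.

```python
def is_dominated_by(heads, token, ancestor):
    """Return True if `ancestor` is `token` itself or one of its ancestors.

    `heads[i - 1]` is the head of token i; 0 marks the artificial root.
    """
    steps = 0
    # The step limit guards against malformed input that contains a cycle.
    while token != 0 and steps <= len(heads):
        if token == ancestor:
            return True
        token = heads[token - 1]
        steps += 1
    return False


def non_projective_arcs(heads):
    """Return (head, dependent) pairs for every non-projective arc."""
    arcs = []
    for dep, head in enumerate(heads, start=1):
        if head == 0:
            continue  # the root attachment spans no tokens
        lo, hi = sorted((head, dep))
        # The arc is projective only if the head dominates every token
        # strictly between the head and the dependent.
        if any(not is_dominated_by(heads, k, head) for k in range(lo + 1, hi)):
            arcs.append((head, dep))
    return arcs


# Hypothetical four-token tree: token 3 is the root, and the arc 4 -> 2
# crosses over the root, so it is non-projective.
toy_heads = [3, 4, 0, 3]
flagged = non_projective_arcs(toy_heads)
```

Counting the flagged arcs per sentence, or dividing by sentence length, gives a rough score that could be thresholded to find "extreme" cases; the threshold itself would be a judgment call.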
Some more examples:
hdt-s185183
Intel bietet dazu beispielsweise den i815E B-Step an , ALi den Aladdin Pro5T , SiS den SiS635T und VIA den Apollo Pro133T und Pro 266T .
hdt-s189013
Der Palm m500 soll ein Graustufen-Display haben , der Palm m505 dagegen ein reflektives Farbdisplay , über das bislang nur der Compaq iPaq H3630 verfügt .
Elsewhere in UD I believe this is avoided using orphan relations and/or null elements. Is it then the case that these trees in HDT diverge from the UD guidelines, or am I misunderstanding something?
You got it right. This is an annotation error (I looked at the first example only), and it should be analyzed as gapping, with the help of the orphan relation.
Thank you @dan-zeman for taking a look at this and the other issues I've been filing! It's really helpful to know that I'm not far off-track with languages I don't understand.
For the time being I won't rely on HDT for purposes that need the UD gapping analysis.