Words split across two lines are counted and POS-tagged as two words. For example, in BGU 1.2.13 the first part (ὀ) of the token ὀλίγη is POS-tagged as "u" (undefined? This tag does not appear in the list of codes on the Philologic website).
This is a known problem, unfortunately, and documented in the related forthcoming article. Tokenization of such fragmentary, highly marked-up texts poses a huge number of challenges.
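For illustration only, here is a minimal sketch of one way a word broken across lines could be rejoined before tagging. It assumes the source texts are EpiDoc-style XML in which a word-internal line break is marked as `<lb break="no"/>`; that is not stated in this thread and may not match the actual pipeline. The function name `tokenize_joining_line_breaks` and the sample fragment are hypothetical, not taken from the corpus.

```python
import xml.etree.ElementTree as ET


def tokenize_joining_line_breaks(xml_fragment):
    """Whitespace-tokenize a fragment, gluing words split by <lb break="no"/>.

    Assumption: word-internal line breaks carry break="no" (EpiDoc convention).
    """
    root = ET.fromstring(xml_fragment)
    buf = ""
    glue_next = False  # True right after a word-internal line break

    def add_text(text):
        nonlocal buf, glue_next
        if not text:
            return
        if glue_next:
            # Drop whitespace that follows the break so the halves touch.
            text = text.lstrip()
            if not text:
                return
            glue_next = False
        buf += text

    def walk(elem):
        nonlocal buf, glue_next
        add_text(elem.text)
        for child in elem:
            # Ignore any XML namespace prefix on the tag name.
            is_lb = child.tag.rsplit("}", 1)[-1] == "lb"
            if is_lb and child.get("break") == "no":
                buf = buf.rstrip()  # drop whitespace before the break
                glue_next = True
            elif is_lb:
                buf += " "  # an ordinary line break separates words
            else:
                walk(child)
            add_text(child.tail)

    walk(root)
    return buf.split()


# Invented sample: the word from the report split across two lines.
sample = '<ab><lb n="1"/>α ὀ\n<lb n="2" break="no"/>λίγη β</ab>'
print(tokenize_joining_line_breaks(sample))
```

This only handles the simplest case. Editorial markup inside the broken word (supplied, unclear, or abbreviated text) is walked through as plain text here, which is part of what makes tokenizing these texts hard.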