no deskewing/orientation in GT #30
Comments
Needs to be specified ASAP in OCR-D/spec. Send PR for page-wise and line-wise rotation to PAGE-XML for upcoming 2019 version.
Hi all, I might have mentioned in another thread, the full PAGE format collection has a dedicated XML format for this, as deskewing was seen as a pre-processing step that does not need to be reflected in the page content XML. But as it turned out, these other XML formats were never adopted much.
Hi @chris1010010, thanks for your quick feedback! Are you referring to the 2009 subschema I would like to do the PR myself, but looking more closely, I have trouble interpreting the existing BTW, we just have a discussion on
Hi @bertsky
Oh, I see. Thanks for the clarification! I mentioned the additive semantics in the new PR.
@tboenig can this be closed?
I don't know how this is supposed to work at all. Usually the images need no deskewing, but when they do, that information is missing in PAGE. (I would at least expect some `orientation` angle in the text regions. Or is `Baseline` the place to look for this information?)

E.g. in weigel_gnothi02_1618, page `phys_0001` needs to be rotated about `-2.0` degrees (clockwise). The effect is also pronounced in the GT annotation itself: it contains coordinates that effectively chop off parts of the glyphs in some corners, e.g. region `TextRegion_1479403414297_29` line `tl_1` (chopped "V"), region `TextRegion_1488379719413_342` line `tl_22` (chopped "durch ſein") and region `TextRegion_1488379733255_361` (chopped "ſein").
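
To give a rough feeling for why a skew of about 2 degrees is enough to chop glyphs at the corners of an axis-aligned line box, here is a minimal sketch. The line width of 1500 px is a made-up value for illustration, not measured from the GT, and `skew_drift` is a hypothetical helper, not part of any OCR-D tool.

```python
import math


def skew_drift(line_width_px: float, skew_degrees: float) -> float:
    """Vertical offset between the two ends of a text line skewed by the given angle."""
    # Only the magnitude matters here, so the sign convention of the angle is ignored.
    return line_width_px * math.tan(math.radians(abs(skew_degrees)))


# Assumption: a 1500 px wide line. The -2.0 degrees is the estimate from this issue.
drift = skew_drift(1500, -2.0)
print(f"{drift:.1f} px")
```

Under that assumed width the two ends of the line are offset by roughly 52 px, which is on the order of a full glyph height, so a rectangle fitted to one end will cut into the glyphs at the other.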