Workflow Guide font style annotation

This processor can determine the font style (e.g. italic, bold, underlined) and font family of text recognition results.

ocrd-tesserocr-fontshape can either use an existing segmentation or segment on demand. It can detect the following font attributes:

fontSize
fontFamily
bold
italic
underlined
monospace
serif
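To illustrate how these attributes might be checked after a run, here is a minimal sketch. It assumes the results are annotated as attributes of `TextStyle` elements in the output PAGE-XML; the fragment below is a hypothetical sample written for illustration (including the `pc:` namespace prefix and all attribute values), not real processor output.

```shell
#!/bin/sh
# Hypothetical PAGE-XML fragment, written to a temporary file for illustration only.
SAMPLE="$(mktemp)"
cat > "$SAMPLE" <<'EOF'
<pc:Word id="w1">
  <pc:TextStyle fontFamily="Times" fontSize="10.0" bold="false" italic="true" serif="true"/>
</pc:Word>
<pc:Word id="w2">
  <pc:TextStyle fontFamily="Courier" fontSize="9.0" monospace="true"/>
</pc:Word>
EOF

# Count the lines that carry an italic annotation (one TextStyle element per line here).
ITALIC_COUNT="$(grep -c 'italic="true"' "$SAMPLE")"
echo "italic words: $ITALIC_COUNT"

# Clean up the temporary sample file.
rm -f "$SAMPLE"
```

On real output, point the `grep` call at a PAGE-XML file of the output file group instead of the sample; a proper XML tool is more robust than `grep` if elements span several lines.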

Note: ocrd-tesserocr-fontshape only works with the old, pre-LSTM models. You can use the pre-installed osd model (which is purely rule-based), but there might be better alternatives for your language and script. The old models are still available from Tesseract's GitHub repository at the last revision before the LSTM models replaced them, usually under the same name. (For example, deu.traineddata used to be a rule-based model but is now an LSTM model, whereas deu-frak.traineddata is still only available as a rule-based model and was complemented by the new LSTM models frk.traineddata and script/Fraktur.traineddata.) If you need one of the models that was replaced completely, you should at least rename the old one (e.g. to deu3.traineddata) so that it does not clash with the LSTM model of the same name.
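The download-and-rename step from the note could be sketched as follows. Everything specific here is an assumption to adapt: that the tag `3.04.00` of the `tesseract-ocr/tessdata` repository is the last pre-LSTM revision, the default tessdata directory, and the helper names `old_model_url` and `old_model_target`, which are hypothetical. The script only prints the download command so that it can be reviewed before running.

```shell
#!/bin/sh
# Assumption: this tag of tesseract-ocr/tessdata still holds the pre-LSTM models.
TESSDATA_TAG="${TESSDATA_TAG:-3.04.00}"
# Assumption: where Tesseract looks for models; override via TESSDATA_PREFIX.
TESSDATA_DIR="${TESSDATA_PREFIX:-/usr/share/tesseract-ocr/4.00/tessdata}"

# Hypothetical helper: URL of the old model for a given language code.
old_model_url() {
    printf 'https://github.com/tesseract-ocr/tessdata/raw/%s/%s.traineddata\n' "$TESSDATA_TAG" "$1"
}

# Hypothetical helper: local file name with a "3" suffix, so the old model
# does not overwrite the LSTM model of the same name.
old_model_target() {
    printf '%s/%s3.traineddata\n' "$TESSDATA_DIR" "$1"
}

# Print the download command for review instead of executing it.
echo "wget -O $(old_model_target deu) $(old_model_url deu)"
```

With the model stored as deu3.traineddata, it would then be selected with `-P model deu3`.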

Available processors

| Processor | Parameter | Remarks | Call |
| --- | --- | --- | --- |
| ocrd-tesserocr-fontshape | `-P model osd -P padding 2` | Download other pre-LSTM models from GitHub | `ocrd-tesserocr-fontshape -I OCR-D-OCR -O OCR-D-OCR-FONT` |
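Combining the call and the parameters from the table gives a single command line. The sketch below assembles it from variables and prints it; the variable names are illustrative, and actually running the command assumes an OCR-D workspace (a directory with a METS file) that already contains the input file group.

```shell
#!/bin/sh
# File groups and parameter values taken from the table above.
INPUT_GRP=OCR-D-OCR
OUTPUT_GRP=OCR-D-OCR-FONT
MODEL=osd
PADDING=2

# Assemble the full call: input group, output group, then the two parameters.
CMD="ocrd-tesserocr-fontshape -I $INPUT_GRP -O $OUTPUT_GRP -P model $MODEL -P padding $PADDING"
echo "$CMD"

# To execute it inside an OCR-D workspace, uncomment the next line:
# $CMD
```

Keeping the values in variables makes it easy to swap in another pre-LSTM model or a different padding without retyping the call.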

Notes on parameter usage

E.g.

which parameters do you use with what values?
which parameters are insufficiently documented?
which aspects of a processor should be parameterizable but are not?

Notes on document-specific usage

E.g. which processors worked best with what material? -- feel free to post sample images here, too.
