Workflow Guide preprocessing

Optionally, you can start your workflow by enhancing your images, which can be vital for the subsequent binarization. In this step, the raw image is enhanced by, for example, grayscale conversion, brightness normalization, or noise filtering.

Note: ocrd-preprocess-image can be used to run arbitrary shell commands for preprocessing (original or derived) images, and can be seen as a generic OCR-D wrapper for many of the following workflow steps, provided a matching external tool exists. (The only restriction is that the tool must not change image size or the position/coordinates of its content.)
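To make the idea of a wrapped command more concrete, here is a minimal sketch of how a command template with the `@INFILE` and `@OUTFILE` placeholders could be expanded for a single image. The helper `expand_command` and the file names are hypothetical and only serve as illustration; the processor performs this substitution internally, and its actual behavior may differ in detail.

```shell
#!/bin/sh
# Illustrative only: expand_command is a hypothetical helper, not part of OCR-D.
# It replaces the @INFILE and @OUTFILE placeholders in a command template
# with concrete file paths and prints the resulting command line.
# Caveat: paths containing '|' or '&' would need escaping for sed.
expand_command() {
    template="$1"
    infile="$2"
    outfile="$3"
    printf '%s\n' "$template" |
        sed -e "s|@INFILE|$infile|g" -e "s|@OUTFILE|$outfile|g"
}

# Hypothetical input and output file names for one page image.
expand_command \
    "scribo-cli sauvola-ms-split '@INFILE' '@OUTFILE' --enable-negate-output" \
    page_0001.tif page_0001.bin.png
```

The sketch only prints the expanded command. Whatever tool is wrapped this way still has to respect the restriction above: the output image must have the same size and content coordinates as the input.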

Available processors

| Processor | Parameter | Remark | Call |
| --- | --- | --- | --- |
| ocrd-im6convert | `-P output-format image/tiff` | for `output-options` see IM Documentation | `ocrd-im6convert -I OCR-D-IMG -O OCR-D-ENH -P output-format image/tiff` |
| ocrd-preprocess-image | `-P input_feature_filter binarized` `-P output_feature_added binarized` `-P command "scribo-cli sauvola-ms-split '@INFILE' '@OUTFILE' --enable-negate-output"` | for parameters and command examples (presets) see the Readme | `ocrd-preprocess-image -I OCR-D-IMG -O OCR-D-PREP -P output_feature_added binarized -P command "scribo-cli sauvola-ms-split @INFILE @OUTFILE --enable-negate-output"` |
| ocrd-skimage-normalize | | | `ocrd-skimage-normalize -I OCR-D-IMG -O OCR-D-NORM` |
| ocrd-skimage-denoise-raw | | | `ocrd-skimage-denoise-raw -I OCR-D-IMG -O OCR-D-DENOISE` |
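The calls in the table each read from `OCR-D-IMG`. If you want to apply more than one enhancement, one possible approach is to chain them so that each step reads the file group written by the previous one. The following is a sketch under that assumption: the order (normalize, then denoise) is only an example, and `run_step`, `enhance`, and the `DRY_RUN` switch are hypothetical helpers introduced here, not part of OCR-D.

```shell
#!/bin/sh
# Sketch: chain two of the enhancement processors from the table.
# Assumption: the output file group of one step is used as the input
# file group of the next step.
# DRY_RUN=1 (the default here) only prints the commands;
# set DRY_RUN=0 to execute them inside an OCR-D workspace.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Normalize first, then denoise the normalized images.
enhance() {
    in_grp="$1"
    run_step ocrd-skimage-normalize -I "$in_grp" -O OCR-D-NORM
    run_step ocrd-skimage-denoise-raw -I OCR-D-NORM -O OCR-D-DENOISE
}

enhance OCR-D-IMG
```

Printing the commands first makes it easy to check the file group names before anything is written to the workspace.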

Notes on parameter usage

E.g.

which parameters do you use with what values?
which parameters are insufficiently documented?
which aspects of a processor should be parameterizable but are not?

Notes on document-specific usage

E.g. which processors worked best with what material? -- feel free to post sample images here, too.
