HocrConverter

Create PDFs and plain text from hOCR documents

Changes by C.Holtermann

The original script didn't work for me, so I made some changes to get it running.

My configuration is OCRopus 0.7 and Tesseract 3.02.02.

Included some aspects from the fork of https://github.com/zw/HocrConverter:

Some command line arguments:

For command line parsing and validation I use some external libraries:

In its current form, the script is mainly a way to understand the concept.

Maybe it's useful for others trying to understand OCR.
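To give an idea of the concept, here is a minimal sketch of the plain-text half: reading words and their bounding boxes out of an hOCR document. This is not the code in HocrConverter.py. The names `HocrWordParser`, `parse_bbox`, `extract_words` and the `SAMPLE` markup are made up for illustration, and the sketch assumes the OCR engine emits `ocrx_word` spans with a `bbox x0 y0 x1 y1` entry in the `title` attribute, with no spans nested inside a word.

```python
from html.parser import HTMLParser


def parse_bbox(title):
    """Return (x0, y0, x1, y1) from an hOCR title attribute, or None."""
    # hOCR properties are separated by ';', e.g. "bbox 10 20 90 50; x_wconf 95"
    for prop in title.split(";"):
        parts = prop.split()
        if len(parts) == 5 and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:])
    return None


class HocrWordParser(HTMLParser):
    """Hypothetical helper: collects (text, bbox) for every ocrx_word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._in_word = False
        self._bbox = None
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            self._in_word = True
            self._bbox = parse_bbox(attrs.get("title") or "")
            self._chunks = []

    def handle_data(self, data):
        # Only keep text that sits inside a word span
        if self._in_word:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        # Assumes no nested spans inside a word span
        if tag == "span" and self._in_word:
            text = "".join(self._chunks).strip()
            if text:
                self.words.append((text, self._bbox))
            self._in_word = False


def extract_words(hocr):
    """Parse an hOCR string and return a list of (text, bbox) tuples."""
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words


# Made-up sample markup, only for demonstration
SAMPLE = """<div class='ocr_page' title='bbox 0 0 600 800'>
<span class='ocr_line' title='bbox 10 20 300 50'>
<span class='ocrx_word' title='bbox 10 20 90 50; x_wconf 95'>Hello</span>
<span class='ocrx_word' title='bbox 100 20 200 50; x_wconf 91'>world</span>
</span></div>"""

if __name__ == "__main__":
    words = extract_words(SAMPLE)
    print(" ".join(text for text, _ in words))
```

The bounding boxes are what a PDF converter would need in order to place each word as invisible text over the scanned image; the plain-text output only needs the words themselves.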

Work in progress.
