no correction with ocrd-cis-postcorrect #51
Comments
Thanks for reporting. I am having a look.
It appears that both files are line-segmented. The post-correction needs word-segmented input.
thanks for your quick reply! I'll try it again with word segments and report back
I finally tried
From a quick glance I suspect problems with the profiling. Can you rerun the same command with
thx a lot for your quick reply! there's the log file
In order to run our post-correction, both our profiler and a matching language backend have to be installed on the system. The configuration variable

The other way is to use the profiler that is installed by this project's Dockerfile using

$ cd path/to/ocrd_cis # Change into the ocrd_cis directory.
$ sudo docker build -t ocrd_cis . # Build the ocrd_cis Docker image (this will take some time).
$ sudo docker run ocrd_cis /apps/profiler --help # Check the profiler command in the image.
$ echo 'Theyle' | sudo docker run -i ocrd_cis /apps/profiler \
--config /etc/profiler/languages/german.ini \
--sourceFormat TXT --sourceFile /dev/stdin --simpleOutput

Then you can write a shell script that executes

The third option is to run the post-correction directly from the built Docker image.

I see that these points are not very clear in the documentation for the post-correction. I will improve the documentation to make the configuration of the profiler clearer.
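The wrapper shell script mentioned above might look roughly like the following. This is only a sketch under assumptions: the function name `run_profiler` and the environment variables `DOCKER` and `OCRD_CIS_IMAGE` are hypothetical names chosen for illustration, and the image is assumed to have been built with the tag `ocrd_cis` as in the commands above. It simply forwards all arguments to `/apps/profiler` inside the container.

```shell
#!/bin/sh
# Hypothetical wrapper around the profiler in the ocrd_cis Docker image.
# Assumption: the image was built with `docker build -t ocrd_cis .`.

# Image tag to run; override with OCRD_CIS_IMAGE if you used another tag.
IMAGE="${OCRD_CIS_IMAGE:-ocrd_cis}"

# Docker command; set DOCKER="sudo docker" if your setup requires sudo.
DOCKER="${DOCKER:-docker}"

run_profiler() {
    # -i keeps stdin open so that --sourceFile /dev/stdin can read piped text;
    # --rm removes the container once the profiler has finished.
    # $DOCKER is left unquoted on purpose so "sudo docker" splits into two words.
    $DOCKER run --rm -i "$IMAGE" /apps/profiler "$@"
}
```

After sourcing this file, the test call from above would become `echo 'Theyle' | run_profiler --config /etc/profiler/languages/german.ini --sourceFormat TXT --sourceFile /dev/stdin --simpleOutput`. To use it as a standalone script instead, a final line `run_profiler "$@"` could be appended. How the post-correction is then pointed at such a script depends on its configuration, which is not shown here.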
And I forgot to mention that the error you are getting is due to a bad profiler configuration.
thanks for your help! I'm using a native installation of
If you use a native installation, you need to install the profiler as well. I have little experience with Python's installation setup, but maybe it is possible to install the profiler alongside ocrd_cis. Maybe @kba can help here.
I'm running ocrd-cis-postcorrect on the aligned OCR output of Calamari and Tesserocr. So far, the output seems to be completely identical to the input, even though there are quite a few differences between the results of the two OCR engines. See, e.g., the attached example: postcorrect.zip
How can I achieve some correction results?