Made a first attempt at converting the PDFs to text files with the command-line OCR tool tesseract 3.03. I first converted each PDF to a TIFF image, then ran the image through tesseract-ocr, but I get typos on every other line. Doubling the density of the TIFF made it worse. Going to try tesseract 3.04 (v3.03 is two years old), built from source, and then do more tweaking of the tesseract command line.

20160425: Running tesseract 3.04 now. Any error causes a segmentation fault, so troubleshooting the command line is challenging. Hit a snag with the training files, but I feel I may be close to a really accurate, machine-readable format from the command line.
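For reference, the PDF to TIFF to text pipeline described above could be scripted roughly as follows. This is only a sketch under assumptions: the notes do not say which tool produced the TIFF, so ImageMagick's `convert` is assumed here, and the 300 dpi density, the `eng` language, and the helper names (`build_convert_cmd`, `build_tesseract_cmd`, `ocr_pdf`) are illustrative, not taken from the original setup.

```python
import subprocess
import sys
from pathlib import Path


def build_convert_cmd(pdf_path, tiff_path, density=300):
    """Build an ImageMagick command that rasterizes a PDF to TIFF.

    Assumption: ImageMagick's `convert` is the rasterizer; the original
    notes do not name the tool. -density must come before the input file
    so it applies while the PDF is being rendered.
    """
    return ["convert", "-density", str(density), str(pdf_path),
            "-depth", "8", str(tiff_path)]


def build_tesseract_cmd(tiff_path, output_base, lang="eng"):
    """Build a tesseract command; tesseract appends .txt to output_base."""
    return ["tesseract", str(tiff_path), str(output_base), "-l", lang]


def ocr_pdf(pdf_path, density=300, lang="eng"):
    """Rasterize one PDF, OCR it, and return the path of the text file."""
    pdf_path = Path(pdf_path)
    tiff_path = pdf_path.with_suffix(".tiff")
    output_base = pdf_path.with_suffix("")
    # check=True raises CalledProcessError on any nonzero exit status,
    # which includes a process killed by a signal such as a segfault.
    subprocess.run(build_convert_cmd(pdf_path, tiff_path, density), check=True)
    subprocess.run(build_tesseract_cmd(tiff_path, output_base, lang), check=True)
    return Path(str(output_base) + ".txt")


if __name__ == "__main__":
    # Usage: python ocr_pdf.py file1.pdf file2.pdf ...
    for name in sys.argv[1:]:
        try:
            print(ocr_pdf(name))
        except subprocess.CalledProcessError as err:
            print(f"{name}: command {err.cmd[0]} failed with status {err.returncode}")
```

Keeping the command construction separate from the execution makes it easier to print and inspect the exact command line when a run crashes, and to try different densities without editing the pipeline itself.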
List of indexes and corresponding years