This Git repository provides an example of how to perform Optical Character Recognition (OCR) using the Python libraries pytesseract and OpenCV.
OCR is the process of electronically extracting text from images or scanned documents. In this repository, we use pytesseract and OpenCV to extract text from an image and save it to a text file.
Pytesseract is a Python wrapper for Google's Tesseract-OCR Engine, which can be used to recognize a wide variety of fonts and languages. OpenCV is an open-source computer vision and machine learning software library, which can be used for image processing tasks such as image enhancement, image segmentation, and more.
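The flow described above (read an image, clean it up with OpenCV, run it through pytesseract, save the text) might look roughly like the sketch below. This is an illustration, not necessarily the script shipped in this repository: the function names `default_output_path` and `image_to_text_file`, the grayscale-plus-Otsu preprocessing, and the default `lang="eng"` are all assumptions made for the example.

```python
from pathlib import Path


def default_output_path(image_path):
    """Hypothetical helper: put the .txt file next to the image, same base name."""
    return Path(image_path).with_suffix(".txt")


def image_to_text_file(image_path, output_path=None, lang="eng"):
    """Hypothetical helper: OCR one image and write the recognized text to a file."""
    # Imported here so this module still loads before the libraries are installed.
    import cv2
    import pytesseract

    image = cv2.imread(str(image_path))
    if image is None:
        # cv2.imread returns None instead of raising when the file cannot be read.
        raise FileNotFoundError(f"Could not read image: {image_path}")

    # Assumed preprocessing: grayscale, then Otsu thresholding to a black/white image.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Run Tesseract on the cleaned-up image and collect the recognized text.
    text = pytesseract.image_to_string(binary, lang=lang)

    out = Path(output_path) if output_path else default_output_path(image_path)
    out.write_text(text, encoding="utf-8")
    return out
```

Calling `image_to_text_file("page1.png")` would then write the recognized text to `page1.txt`. The preprocessing step is the part most likely to need tuning for a given set of scans.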
To use this repository, you'll need Python installed on your computer, along with the Tesseract OCR engine itself, since pytesseract is only a wrapper around it. You'll also need to install the following libraries:
- pytesseract
- opencv-python
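Both libraries can typically be installed with `pip install pytesseract opencv-python`. As a quick sanity check before running any OCR, a small script along these lines can report what is still missing. The function name `missing_dependencies` is made up for this sketch, and it assumes the Tesseract executable is named `tesseract` and is on your `PATH`.

```python
import importlib.util
import shutil


def missing_dependencies():
    """Hypothetical helper: list the required pieces that cannot be found."""
    missing = []

    # The pip package opencv-python is imported under the module name cv2.
    for package, module in (("pytesseract", "pytesseract"), ("opencv-python", "cv2")):
        if importlib.util.find_spec(module) is None:
            missing.append(package)

    # pytesseract calls the Tesseract executable, so it must be installed as well.
    # Assumption: the executable is called "tesseract" and is on PATH.
    if shutil.which("tesseract") is None:
        missing.append("tesseract (OCR engine executable)")

    return missing


if __name__ == "__main__":
    problems = missing_dependencies()
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("All dependencies found.")
```

If Tesseract is installed somewhere outside `PATH`, pytesseract can be pointed at it by setting `pytesseract.pytesseract.tesseract_cmd` to the full path of the executable.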
This project is used to extract information from chemistry data books.