OCR on Product Photos #17

Open
cod3monk opened this issue Nov 3, 2022 · 1 comment

@cod3monk

cod3monk commented Nov 3, 2022

Goal: make items easier to find without having to describe them manually in detail.

Idea:

  1. Run OCR on all photos and extract text
  2. Store extracts with photo
  3. Extend item search to include photo OCR-extracts
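
The three steps above can be sketched roughly as follows. This is only an illustration, not the project's implementation: it uses a plain sqlite3 table instead of the Django models, and the table, column, and function names (photo, ocr_text, run_ocr_on_photos, search_items) are all hypothetical. The OCR engine is passed in as a callable that takes an image path and returns text, so any engine that fits that shape (for example pytesseract.image_to_string) could be plugged in.

```python
import sqlite3

# Hypothetical schema: one row per photo, with the OCR extract stored next to it.
SCHEMA = """
CREATE TABLE IF NOT EXISTS photo (
    id INTEGER PRIMARY KEY,
    item_name TEXT NOT NULL,
    path TEXT NOT NULL,
    ocr_text TEXT NOT NULL DEFAULT ''
)
"""


def run_ocr_on_photos(conn, ocr, photo_ids=None):
    """Steps 1 and 2: run `ocr(path)` on photos and store the extract with each.

    With `photo_ids=None` all photos are processed (batch mode); passing a
    list of ids re-processes only those images.
    Returns the number of photos processed.
    """
    if photo_ids is None:
        rows = conn.execute("SELECT id, path FROM photo").fetchall()
    else:
        ids = list(photo_ids)
        marks = ",".join("?" * len(ids))
        rows = conn.execute(
            f"SELECT id, path FROM photo WHERE id IN ({marks})", ids
        ).fetchall()
    for photo_id, path in rows:
        conn.execute(
            "UPDATE photo SET ocr_text = ? WHERE id = ?",
            (ocr(path).strip(), photo_id),
        )
    conn.commit()
    return len(rows)


def search_items(conn, query):
    """Step 3: match the query against item names and OCR extracts."""
    pattern = f"%{query}%"
    rows = conn.execute(
        "SELECT DISTINCT item_name FROM photo "
        "WHERE item_name LIKE ? OR ocr_text LIKE ? ORDER BY item_name",
        (pattern, pattern),
    ).fetchall()
    return [name for (name,) in rows]
```

Keeping the engine behind a callable means the same function can serve a batch run and a per-image re-run, and a stub can stand in for the real engine in tests. Note that LIKE treats % and _ in the query as wildcards; a real search would need to escape them.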

Things to consider:

  • Allow this process to be run image-by-image, so that individual extracts can be improved in the future
  • The primary goal is to run this process in batch, possibly outside of Django; if the implementation can also run live on newly uploaded photos, that would be a nice feature
  • Test cases to determine the quality of the extracts would be good to have, e.g. comparing automatic extracts to manual extractions
  • Consider comparing multiple OCR systems
  • Also extract and store EANs or other barcodes present in photos
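
A minimal sketch of how the quality tests and the engine comparison from the list above might look, assuming a small set of photos with manual transcriptions. The similarity measure (difflib's SequenceMatcher ratio) is my choice for illustration, and all function names are hypothetical. The EAN part only validates 13-digit numbers found in the extracted text using the standard EAN-13 check digit; it does not decode the bars themselves, which would need a barcode library.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase and collapse whitespace so layout differences are not penalized."""
    return " ".join(text.lower().split())


def extract_quality(automatic, manual):
    """Similarity in [0, 1] between an OCR extract and a manual transcription."""
    return SequenceMatcher(None, normalize(automatic), normalize(manual)).ratio()


def compare_engines(engines, samples):
    """Mean quality per engine.

    `engines` maps a name to a callable(path) -> text;
    `samples` maps an image path to its manual transcription.
    """
    if not samples:
        raise ValueError("need at least one sample")
    return {
        name: sum(extract_quality(ocr(path), manual)
                  for path, manual in samples.items()) / len(samples)
        for name, ocr in engines.items()
    }


def is_valid_ean13(code):
    """Check the EAN-13 check digit (weights 1, 3, 1, 3, ... from the left)."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    digits = [int(c) for c in code]
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == digits[12]


def find_ean13(text):
    """Return 13-digit numbers in an OCR extract that pass the check digit."""
    return [c for c in re.findall(r"\b\d{13}\b", text) if is_valid_ean13(c)]
```

The check digit filters out most 13-digit strings that are not EANs (serial numbers, misread digits), so only plausible codes would be stored with the photo.
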
@danieloeh

So far, I have implemented a basic prototype of this feature that lets you run OCR on all images via python manage.py ocr. It uses pytesseract for the OCR. The result is shown below the description of each image in the "Update Item" view.

Feature branch: https://github.com/danieloeh/inventory_management/tree/feature/ocr
