OCR result contains words from other languages #251

Open
zhujun5164 opened this issue Nov 27, 2024 · 2 comments
@zhujun5164

Hi

I have set langs to 'zh' when using the OCR, but the recognition result contains Japanese words or words from other languages. How can I fix this, or limit the OCR result to my own word dict?

thx

@zhujun5164

code

from PIL import Image
from surya.ocr import run_ocr
from surya.model.detection.model import load_model as load_det_model, load_processor as load_det_processor
from surya.model.recognition.model import load_model as load_rec_model
from surya.model.recognition.processor import load_processor as load_rec_processor
from surya.recognition import batch_recognition

image = Image.open('text.png')
langs = ["en"] # Replace with your languages - optional but recommended
# The detection model and processor are loaded here but not used below
det_processor, det_model = load_det_processor(), load_det_model()
rec_model, rec_processor = load_rec_model(), load_rec_processor()

# Recognition is run on the full image; no detection step is applied
predictions = batch_recognition([image], [langs], rec_model, rec_processor)
print(predictions)

image
text

output
([' ସଙ୍କିତ '], [0.77392578125])
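
On the "limit the OCR result to my word dict" part of the question: nothing in this thread shows surya offering that, so the following is only a post-processing sketch using Python's standard `difflib`, run on the recognized text afterwards. `WORD_DICT`, `snap_to_dict` and the `cutoff` value are hypothetical names and settings chosen for illustration.

```python
import difflib

# Hypothetical word list standing in for your own dictionary.
WORD_DICT = ["invoice", "total", "balance"]


def snap_to_dict(word, vocabulary=WORD_DICT, cutoff=0.8):
    """Return the closest vocabulary entry, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# A near miss is mapped to the dictionary word; unrelated text is rejected.
print(snap_to_dict("lnvoice"))
print(snap_to_dict("xyz"))
```

A result of `None` can then be treated as "not in my dictionary" and sent for manual review instead of being accepted as-is.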

@EHadoux

EHadoux commented Dec 11, 2024

Well, I actually came here to report the same thing, but with English. I work with financial reports, and it often replaces the £ with "છ" or "רא" or "મ" or "દ".
In all fairness, I must say that the rest is great.
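
Until there is a fix on the model side, one possible workaround (an assumption, not something confirmed by the maintainers) is to flag letters from unexpected scripts after recognition. The sketch below uses only the standard `unicodedata` module. `ALLOWED_SCRIPTS` and `unexpected_chars` are hypothetical names, and the script prefixes are a rough guess based on Unicode character names. Note that it cannot tell Japanese kanji from Chinese characters, since both are named `CJK ...`; it only catches kana and unrelated scripts.

```python
import unicodedata

# Hypothetical mapping from language code to the Unicode character-name
# prefixes that are acceptable for that language (not part of surya).
ALLOWED_SCRIPTS = {
    "en": ("LATIN",),
    "zh": ("CJK", "LATIN"),
}


def unexpected_chars(text, lang):
    """Return the letters in `text` whose script is not expected for `lang`."""
    allowed = ALLOWED_SCRIPTS[lang]
    flagged = []
    for ch in text:
        # Only letters are checked; digits, punctuation, whitespace and
        # symbols such as the pound sign are shared across scripts.
        if not unicodedata.category(ch).startswith("L"):
            continue
        name = unicodedata.name(ch, "")
        if not name.startswith(allowed):
            flagged.append(ch)
    return flagged


# Output format as reported earlier in this thread: (texts, confidences).
texts, confidences = ["£100", "છ100"], [0.9, 0.9]
for text in texts:
    print(repr(text), unexpected_chars(text, "en"))
```

Lines with flagged characters could be re-run, replaced with a placeholder, or queued for manual checking, depending on how much a wrong currency symbol matters in the report.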
