Flipped text recognition prediction. #1455
Comments
Hi @decadance-dance 👋 Yeah, this depends on the crop orientation classifier, which isn't 100% robust at the moment.
@felixdittrich92, got it, thanks. BTW, maybe you know an easy way to work around it in my case. I want to get quads (4 pts) instead of rectangles (2 pts) as input of a detector, even if my page is straight.
Mh, could you explain this in a bit more detail? If your images contain only straight text, the rectification should not be a problem!? If we are talking about modifying the detector output in the middle of the pipeline, before it's passed to the recognition model, then #1449 could be helpful. (Note: the input and output signatures need to be the same, so converting from rect to quad within the same pipeline will not work.)
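Since the rect-to-quad conversion can't happen inside the pipeline, one option is to do it as a post-processing step on the final output. Below is a minimal sketch in plain Python, not docTR code: the helper names `rect_to_quad` and `rects_to_quads` are hypothetical, and it assumes each box is given as `(xmin, ymin, xmax, ymax)` and that the desired corner order is top-left, top-right, bottom-right, bottom-left.

```python
from typing import List, Sequence, Tuple

Rect = Tuple[float, float, float, float]  # assumed layout: (xmin, ymin, xmax, ymax)
Point = Tuple[float, float]
Quad = List[Point]                        # 4 corner points


def rect_to_quad(rect: Rect) -> Quad:
    """Expand a straight 2-point box into its 4 corners.

    Corner order (an assumption, adjust to your consumer):
    top-left, top-right, bottom-right, bottom-left.
    """
    xmin, ymin, xmax, ymax = rect
    return [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]


def rects_to_quads(rects: Sequence[Rect]) -> List[Quad]:
    """Convert every straight box of a page to a quad."""
    return [rect_to_quad(r) for r in rects]


if __name__ == "__main__":
    # Two made-up boxes in relative coordinates, for illustration only.
    boxes = [(0.1, 0.2, 0.5, 0.4), (0.0, 0.0, 1.0, 1.0)]
    for quad in rects_to_quads(boxes):
        print(quad)
```

This only changes the shape of the output. It does not touch the crop orientation classifier, so it only helps when the pages really are straight and the predictor can be run with straight boxes.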
@felixdittrich92 All my documents are straight. So I could use
@felixdittrich92
Facing the same issue: if the rotation is below 45 degrees, there is no real need for 90, 180, or 270 degree corrections, but I still want polygons as the output of text detection.
Bug description
When I set the option assume_straight_pages=False, some of the predictions may be turned upside down.
I tried db_resnet34 and db_resnet50 for detection, and master and parseq for recognition. I observed this bug for each pair.
Code snippet to reproduce the bug
Error traceback
Environment
Collecting environment information...
DocTR version: 0.8.0a0
TensorFlow version: N/A
PyTorch version: 2.1.0a0+4136153 (torchvision 0.16.0a0)
OpenCV version: 4.9.0
OS: Ubuntu 22.04.2 LTS
Python version: 3.10.6
Is CUDA available (TensorFlow): N/A
Is CUDA available (PyTorch): Yes
CUDA runtime version: 12.1.105
GPU models and configuration: GPU 0: NVIDIA A30
Nvidia driver version: 525.147.05
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.9.2
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.9.2
Deep Learning backend
is_tf_available: False
is_torch_available: True