Why is the order of extracting the contents in the table cells wrong? #1069
xielulu1994
started this conversation in
Ask for help with specific PDFs
Replies: 1 comment
-
Hi @xielulu1994, and thanks for your interest in ... page.extract_table({ "text_y_tolerance": 5 }) ... although you may have to try adjusting that.
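A minimal sketch of how one might try several values for that setting, assuming pdfplumber is installed. The helper names `table_settings` and `extract_with_tolerances`, the default page index, and the candidate tolerances are hypothetical choices for illustration; only `pdfplumber.open`, `pdf.pages`, `page.extract_table`, and the `text_y_tolerance` setting come from the library.

```python
def table_settings(y_tolerance):
    """Build a settings dict that overrides only text_y_tolerance."""
    return {"text_y_tolerance": y_tolerance}


def extract_with_tolerances(pdf_path, page_index=0, tolerances=(3, 5, 8)):
    """Extract the same table once per tolerance so results can be compared.

    Returns a dict mapping each tolerance to the table (a list of rows)
    that page.extract_table() produced with that setting.
    """
    # Imported lazily so table_settings() can be used without pdfplumber.
    import pdfplumber

    results = {}
    with pdfplumber.open(pdf_path) as pdf:
        page = pdf.pages[page_index]
        for tol in tolerances:
            results[tol] = page.extract_table(table_settings(tol))
    return results
```

Calling `extract_with_tolerances("your-file.pdf")` (the path is a placeholder) and comparing the cell text across the returned tables may show which value, if any, restores the expected order for a given PDF.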
-
Describe the bug
I use extract_table() to extract table content and find that the order of the extracted text within individual cells does not match the original text.
pdf table:
Code to reproduce the problem
PDF file
Please attach any PDFs necessary to reproduce the problem.
If you need to redact text in a sensitive PDF, you can run it through JoshData/pdf-redactor.