Replies: 1 comment
-
You cannot, except with your own code of course. PyMuPDF does not decide on block segmentation; that is a result of MuPDF's algorithms. The next MuPDF version, 1.25.0, will bring significant improvements here: with a new text extraction option, MuPDF can be asked to search for recognizable page layout segments, each of which will be turned into a block for PyMuPDF.
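To illustrate what "your own code" could look like, here is a minimal sketch that regroups blocks after extraction. It is not how MuPDF segments a page. The function name `merge_blocks` and the `max_gap` threshold are hypothetical, and the sketch assumes a simple single-column layout where blocks that are vertically close and horizontally overlapping belong together. The input tuples follow the `(x0, y0, x1, y1, text, block_no, block_type)` layout returned by `page.get_text("blocks")`.

```python
def merge_blocks(blocks, max_gap=5.0):
    """Merge text blocks that are vertically close and horizontally overlapping.

    Each block is a tuple starting with (x0, y0, x1, y1, text); any further
    fields (block number, block type) are ignored. `max_gap` is a hypothetical
    tuning parameter: the largest vertical distance, in points, at which two
    blocks are still joined.
    """
    merged = []
    # Process blocks top to bottom, then left to right.
    for x0, y0, x1, y1, text, *_ in sorted(blocks, key=lambda b: (b[1], b[0])):
        if merged:
            px0, py0, px1, py1, ptext = merged[-1]
            gap = y0 - py1                       # vertical distance to previous block
            overlaps = x0 < px1 and px0 < x1     # horizontal extents intersect
            if gap <= max_gap and overlaps:
                # Grow the previous block's rectangle and append the text.
                merged[-1] = (
                    min(px0, x0), min(py0, y0), max(px1, x1), max(py1, y1),
                    ptext.rstrip("\n") + "\n" + text,
                )
                continue
        merged.append((x0, y0, x1, y1, text))
    return merged
```

With PyMuPDF, the input would typically be the text blocks of a page, for example `[b for b in page.get_text("blocks") if b[6] == 0]` to skip image blocks. The right `max_gap` depends on the document, so expect to tune it against the output you liked in 1.24.6.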
-
Extraction of blocks in version 1.24.6 is perfect. How can I make version 1.24.13 work like 1.24.6? Thank you!
pdf1246-cut.pdf
pdf12413-cut.pdf