How to extract textbox text? #776

Answered by JorjMcKie
jiezhoujenny asked this question in Q&A
The method is `page.getTextbox(rect)`.
But in your example, some of the text pieces appear to be annotation or widget / field text rather than page text. That text is not included, because annotations (including fields / widgets) are not part of the page content. You can think of them as "dust" on a picture: they can be wiped away without changing the picture itself.

You can still extract annotation / field contents, just with a different set of methods: first check which fields / annotations lie inside the given rect, then extract the text from each of them.
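The two steps above could be sketched roughly as follows. This is only an illustration, not code from the answer: the helper names `rect_contains`, `text_in_rect` and `extract_from_file` are made up here, and "inside the rect" is assumed to mean full containment (you may prefer an intersection test). It relies on `page.annots()`, `annot.info["content"]`, `page.widgets()`, `widget.field_name` and `widget.field_value`, and it tries the newer spelling `page.get_textbox(rect)` before falling back to `page.getTextbox(rect)`.

```python
def rect_contains(outer, inner):
    """Return True if `inner` lies fully inside `outer`.

    Both arguments are 4-item sequences (x0, y0, x1, y1); fitz.Rect
    objects work too because they behave like sequences.
    """
    ox0, oy0, ox1, oy1 = outer
    ix0, iy0, ix1, iy1 = inner
    return ox0 <= ix0 and oy0 <= iy0 and ix1 <= ox1 and iy1 <= oy1


def text_in_rect(page, rect):
    """Collect page text, annotation text and field values inside `rect`."""
    # Step 0: the page text proper. Newer PyMuPDF versions name the
    # method get_textbox(); getTextbox() is the older camelCase name.
    get_textbox = getattr(page, "get_textbox", None) or page.getTextbox
    result = {"page": get_textbox(rect), "annots": [], "fields": []}

    # Step 1: annotations (widgets are not returned by annots()).
    for annot in page.annots():
        if rect_contains(rect, annot.rect):
            content = annot.info.get("content", "")
            if content:
                result["annots"].append(content)

    # Step 2: form fields / widgets.
    for widget in page.widgets():
        if rect_contains(rect, widget.rect):
            result["fields"].append((widget.field_name, widget.field_value))

    return result


def extract_from_file(path, rect, page_number=0):
    """Hypothetical convenience wrapper: open `path` and query one page."""
    import fitz  # PyMuPDF, imported lazily so the helpers above stay standalone

    doc = fitz.open(path)
    try:
        return text_in_rect(doc[page_number], fitz.Rect(rect))
    finally:
        doc.close()
```

For example, `extract_from_file("input.pdf", (50, 100, 300, 200))` (file name and coordinates are placeholders) would return a dict with the page text under `"page"`, annotation contents under `"annots"`, and `(field_name, field_value)` pairs under `"fields"`.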

Answer selected by jiezhoujenny
This discussion was converted from issue #776 on December 18, 2020 21:27.