I'm not great with coding and I don't know how to get pdfplumber to work. #1058
CookieSalesman
started this conversation in
Ask for help with specific PDFs
Replies: 1 comment
-
Hi @CookieSalesman, and best of luck with your journey learning Python. If you're looking to extract text, check out the Extracting text section of the documentation. For your particular use-case, this should print the text on the first page of your PDF:

    import pdfplumber

    # Raw string (r"...") so the backslashes in the Windows path are kept literally
    with pdfplumber.open(r"F:\Files\GPO-CDOC-116sdoc1-1.pdf") as pdf:
        first_page = pdf.pages[0]
        print(first_page.extract_text())

If the output doesn't appear exactly as you hoped, you may want to try adjusting the parameters to …
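To take the same idea beyond the first page, here is a minimal sketch that loops over every page and joins the text. The helper names `join_page_texts` and `extract_pdf_text` are hypothetical and not part of pdfplumber; the only pdfplumber calls used are `pdfplumber.open`, `pdf.pages`, and `page.extract_text()`. The `or ""` guard assumes a page with no extractable text may come back as `None` or an empty string, depending on the installed version.

```python
def join_page_texts(pages, separator="\n\n"):
    """Join the text of each page-like object that has an extract_text() method."""
    texts = []
    for page in pages:
        # Guard against pages with no extractable text (None or "").
        texts.append(page.extract_text() or "")
    return separator.join(texts)


def extract_pdf_text(path):
    """Open a PDF with pdfplumber and return the text of all its pages."""
    # Imported here so join_page_texts can be tried without pdfplumber installed.
    import pdfplumber

    with pdfplumber.open(path) as pdf:
        return join_page_texts(pdf.pages)


# Example call (path taken from this thread; note the raw string):
# print(extract_pdf_text(r"F:\Files\GPO-CDOC-116sdoc1-1.pdf"))
```

Keeping the joining logic separate from the file handling makes it easy to swap in a different separator or to process only a slice of `pdf.pages`.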
-
I finally managed to install pdfplumber (it's a miracle that it installed at all), and I fixed my Python installations.
Now I'm trying to actually USE pdfplumber, and I truly do not understand how.
I'm in the early days of learning Python and don't know much about what's going on.
This is what I tried and thought would work:
`import pdfplumber
with pdfplumber.open("F:\Files\GPO-CDOC-116sdoc1-1.pdf") as pdf:
    first_page = pdf.pages[0]
    print(first_page.chars[0])`
It gave me information about page one, it seems, but it didn't really spit out the data I was looking for.
For reference, these are the PDFs I'm looking to ingest.
The pdfplumber main page didn't seem to give me a way to extract text, and I really did try searching similar discussions. I would use that help if I could understand what they were saying xP
Thanks
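
A note on why the snippet above printed so little: `first_page.chars` is a list with one dictionary per character on the page, so `chars[0]` is the record for the first character only, not the page's text. Here is a minimal sketch of that, using hand-written stand-in dictionaries instead of a real PDF. The values are invented for illustration, and `chars_to_string` is a hypothetical helper; real pdfplumber character records carry a `"text"` key along with position and font fields.

```python
# Stand-in data shaped like entries of page.chars (values are invented).
sample_chars = [
    {"text": "H", "x0": 72.0, "top": 90.0},
    {"text": "i", "x0": 79.2, "top": 90.0},
    {"text": "!", "x0": 82.1, "top": 90.0},
]


def chars_to_string(chars):
    """Concatenate the "text" field of each character record, in list order."""
    return "".join(char["text"] for char in chars)


# chars[0] is only the first character's record:
print(sample_chars[0])
# Joining every record recovers the characters in order:
print(chars_to_string(sample_chars))
```

On a real page, `first_page.extract_text()` does this assembly for you and also deals with spacing and line breaks, so it is usually the better starting point.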