I'm not great with coding and I don't know how to get pdfplumber to work. #1058

CookieSalesman · 2023-12-09T07:54:56Z

I finally managed to install pdfplumber (it's a miracle that it installed at all). I fixed my Python instances.

Now I'm trying to actually USE pdfplumber. I truly do not understand.

I'm in the early days of learning Python and I hardly know what's going on.

This is what I tried and thought would work:

`import pdfplumber

with pdfplumber.open("F:\Files\GPO-CDOC-116sdoc1-1.pdf") as pdf:
    first_page = pdf.pages[0]
    print(first_page.chars[0])`

It gave me information about page one, it seems, but it didn't really spit out the data I was looking for.

For reference, these are the PDFs I'm looking to ingest.

It didn't seem like the pdfplumber main page gave me much of a way to extract text. And I really did try to search similar discussions; I would use that help if I could understand what they were saying xP

Thanks

jsvine · 2023-12-21T20:13:36Z

Maintainer

Hi @CookieSalesman, and best of luck with your journey learning Python. If you're looking to extract text, check out the Extracting text section of the documentation. For your particular use-case, this should print the text on the first page of your PDF:

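import pdfplumber

# The r prefix makes this a raw string, so the backslashes in the Windows path are taken literally.
with pdfplumber.open(r"F:\Files\GPO-CDOC-116sdoc1-1.pdf") as pdf:
    first_page = pdf.pages[0]  # pages are zero-indexed
    print(first_page.extract_text())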

If the output doesn't appear exactly as you hoped, you may want to try adjusting the parameters to .extract_text(...) that are documented in the link above. For instance, .extract_text(layout=True) will attempt to position the text as it appears on the page.
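
For context on the original attempt: `first_page.chars` is a list with one entry per character, so `first_page.chars[0]` prints the details of a single character rather than the page's text. Here is a minimal sketch of extending the snippet above from the first page to every page. The helper names `page_texts` and `print_pdf_text` are hypothetical (they are not part of pdfplumber), and the `or ""` fallback assumes that a page with no extractable text, such as a scanned image, may come back empty.

```python
def page_texts(pdf, layout=False):
    """Return one string per page of an already-open pdfplumber PDF.

    Hypothetical helper for illustration, not part of pdfplumber.
    """
    texts = []
    for page in pdf.pages:
        # Fall back to "" if a page yields no text (assumption: e.g. scanned pages).
        texts.append(page.extract_text(layout=layout) or "")
    return texts


def print_pdf_text(path, layout=False):
    """Open the PDF at `path` and print the text of each page in order."""
    # Imported here so that page_texts() can be read and tried without pdfplumber.
    import pdfplumber

    with pdfplumber.open(path) as pdf:
        for number, text in enumerate(page_texts(pdf, layout=layout), start=1):
            print(f"--- page {number} ---")
            print(text)
```

Calling `print_pdf_text(r"F:\Files\GPO-CDOC-116sdoc1-1.pdf")` should print each page in turn, and passing `layout=True` forwards that option to `.extract_text(...)` for every page.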
