Make pdf-lib more graceful like other pdf software #1407

Open
wants to merge 2 commits into
base: master


@emilsedgh emilsedgh commented Feb 26, 2023

Some broken fields may include annotations in their Kids array that can't be found on any page. Until now we've thrown an exception when dealing with such fields, but other PDF software appears to be more resilient and gracefully ignores them.

This commit ensures we'll do the same.

Fixes #967, #1281, #1349

What?

PDF fields can have Kids, each Kid being a widget annotation. In some rare cases (probably due to bad PDF generators), a PDFField has an annotation in its Kids that doesn't exist on any page. Therefore we don't know on which page it must be rendered.
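
To make the broken shape concrete, here is a minimal sketch of how such an orphaned Kid could be detected. The `Ref`, `Page`, and `Field` types and the `findOrphanedKids` helper are hypothetical stand-ins invented for illustration; they are not pdf-lib's classes or API.

```typescript
// Simplified stand-ins for illustration only; these are NOT pdf-lib's classes.
type Ref = string; // an indirect reference, written here as e.g. "12 0 R"

interface Page {
  ref: Ref;
  annots: Ref[]; // refs listed in the page's Annots array
}

interface Field {
  name: string;
  kids: Ref[]; // refs listed in the field's Kids array
}

// Returns the Kids of `field` that no page lists among its annotations.
function findOrphanedKids(field: Field, pages: Page[]): Ref[] {
  return field.kids.filter(
    (kid) => !pages.some((page) => page.annots.indexOf(kid) !== -1),
  );
}

const pages: Page[] = [
  { ref: "1 0 R", annots: ["10 0 R"] },
  { ref: "2 0 R", annots: [] },
];
const brokenField: Field = { name: "signature", kids: ["10 0 R", "11 0 R"] };

// Only "11 0 R" is reported: it is a Kid of the field but sits on no page.
console.log(findOrphanedKids(brokenField, pages));
```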

Why?

PDFForm.findPageForAnnotationRe is responsible for finding the page for a given annotation. Up until now it'd throw an exception if it couldn't find a page.

This means that pdf-lib would throw an error when trying to flatten such PDFs. So you can have a PDF file that opens fine in Chrome, Firefox/pdfjs, Acrobat Reader and even pdf-lib, yet flattening it with pdf-lib throws an exception (as seen in #967, #1281, #1349).

How?

This patch makes pdf-lib's flatten gracefully ignore such annotations and render the PDF the way other PDF readers do.
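
The difference between the old and new behaviour can be sketched roughly as follows. This is a simplified model, not the code in this patch: the types, `findPageForWidget`, `flattenWidgets`, and the `onMissingPage` parameter are all hypothetical names made up for the example. Only the error text is taken from the linked issue.

```typescript
// Simplified stand-ins for illustration only; these are NOT pdf-lib's classes.
type Ref = string;

interface Page {
  ref: Ref;
  annots: Ref[];
}

interface Field {
  name: string;
  kids: Ref[];
}

// Returns the page whose annotations include `widget`, or undefined.
function findPageForWidget(widget: Ref, pages: Page[]): Page | undefined {
  for (const page of pages) {
    if (page.annots.indexOf(widget) !== -1) return page;
  }
  return undefined;
}

// "throw" models the old behaviour, "skip" models the graceful one.
// Returns a description of each widget that would be drawn onto a page.
function flattenWidgets(
  fields: Field[],
  pages: Page[],
  onMissingPage: "throw" | "skip",
): string[] {
  const drawn: string[] = [];
  for (const field of fields) {
    for (const kid of field.kids) {
      const page = findPageForWidget(kid, pages);
      if (page === undefined) {
        if (onMissingPage === "throw") {
          throw new Error(`Could not find page for PDFRef ${kid}`);
        }
        continue; // orphaned widget: nothing to draw, move on
      }
      drawn.push(`${field.name}:${kid}@${page.ref}`);
    }
  }
  return drawn;
}

const pages: Page[] = [{ ref: "1 0 R", annots: ["10 0 R"] }];
const fields: Field[] = [{ name: "sig", kids: ["10 0 R", "11 0 R"] }];

// Graceful mode draws the one widget that has a page and ignores the orphan.
console.log(flattenWidgets(fields, pages, "skip"));

// Strict mode fails on the orphaned widget.
try {
  flattenWidgets(fields, pages, "throw");
} catch (e) {
  console.log(e instanceof Error ? e.message : String(e));
}
```

Skipping matches what the description asks for: an orphaned widget has no page to be drawn on, so leaving it out changes nothing a reader would see.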

Testing?

Checklist

  • I read CONTRIBUTING.md.
  • I read MAINTAINERSHIP.md#pull-requests.
  • I added/updated unit tests for my changes.
  • I added/updated integration tests for my changes.
  • I ran the integration tests.
  • I tested my changes in Node, Deno, and the browser.
  • I viewed documents produced with my changes in Adobe Acrobat, Foxit Reader, Firefox, and Chrome.
  • I added/updated doc comments for any new/modified public APIs.
  • My changes work for both new and existing PDF files.
  • I ran the linter on my changes.

@anodynos

@emilsedgh please check my comment here

@satyajitnayk

Hey @Hopding. It would be great if you could review this PR. It is an important fix & requested by many.

@caub

caub commented Nov 12, 2023

cf #1281 too

@jalbertsr

Is this planned to be merged? I'm currently using the alternative fork @visaright/pdf-lib only because of this issue.

Development

Successfully merging this pull request may close these issues.

Error when using flatten, Could not find page for PDFRef
5 participants