Segmentation Fault in add_redact_annot #4047

Closed
TheLastAurora opened this issue Nov 13, 2024 · 4 comments
Labels
fix developed release schedule to be determined Fixed in next release


@TheLastAurora

Description of the bug

Calling page.add_redact_annot() on a specific PDF page crashes the Python process with a segmentation fault ("segmentation fault (core dumped)").

How to reproduce the bug

import pymupdf

from pathlib import Path
import os

p = Path("./pdfs").glob("*")
out_dir = Path("./pdfs/out")
os.makedirs(out_dir, exist_ok=True)
files = [x for x in p if x.is_file()]


def replace_table_text(page: pymupdf.Page) -> pymupdf.Page:
    fontname = page.get_fonts()[0][3]
    if fontname not in pymupdf.Base14_fontnames:
        fontname = "Courier"
    hits = page.search_for("|")
    for rect in hits:
        page.add_redact_annot(
            rect, " ", fontname=fontname, align=pymupdf.TEXT_ALIGN_CENTER, fontsize=10
        )  # Segmentation Fault...
    page.apply_redactions()
    return page


doc = pymupdf.Document(files[0])
replace_table_text(doc[0])

file: test-1-24.pdf

PyMuPDF version

1.24.13

Operating system

Linux

Python version

3.12

@TheLastAurora
Author

I confirm that it works in PyMuPDF==1.24.5 and pymupdf4llm==0.0.16

@JorjMcKie
Collaborator

What has pymupdf4llm to do with this?

You are aware that you are trying to crack a walnut with a sledgehammer?
You just want to remove the "|" characters, right? Then do not try to write a space in the same place - in addition to requesting that that space should be centered in the available area 😂.

But you probably were just kidding and intended to fool the method with an impossible task.
So thanks for the opportunity to make the method more watertight!
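
For readers following along, the suggestion above (remove the "|" characters without writing a replacement space) could be sketched roughly as follows. This is a minimal sketch, not the maintainers' code: `remove_pipes` is a hypothetical helper name, and it assumes that plain redaction with PyMuPDF's default fill is acceptable for the table in question.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # only needed for the type hint below
    import pymupdf


def remove_pipes(page: pymupdf.Page, needle: str = "|") -> int:
    """Redact every occurrence of `needle` without any replacement text.

    `remove_pipes` is a hypothetical helper, not a PyMuPDF function.
    Returns the number of redaction annotations that were added.
    """
    hits = page.search_for(needle)
    for rect in hits:
        # No text, fontname, align or fontsize arguments: the covered
        # text is removed once the redactions are applied.
        page.add_redact_annot(rect)
    if hits:
        page.apply_redactions()
    return len(hits)


# Possible usage (file names are placeholders):
#   import pymupdf
#   doc = pymupdf.open("test-1-24.pdf")
#   remove_pipes(doc[0])
#   doc.save("test-1-24-redacted.pdf")
```

Leaving out the replacement text sidesteps the text-fitting step that the original call asked for (a centered space inside the very narrow rectangle of a "|").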

@TheLastAurora
Author

> What has pymupdf4llm to do with this?

Sorry, I'm using it in another part of the code, which works fine, so it has nothing to do with this issue. The problem is in PyMuPDF alone.

@JorjMcKie JorjMcKie added the fix developed release schedule to be determined label Nov 14, 2024
julian-smith-artifex-com added a commit that referenced this issue Nov 15, 2024
Also added assert that will catch the problem in future.
@julian-smith-artifex-com
Collaborator

Fixed in PyMuPDF-1.24.14.
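
If a script has to run in environments where the installed PyMuPDF may be older, a version guard along these lines might help. It is only a sketch: `FIXED_IN`, `parse_version`, `includes_fix` and `installed_pymupdf_version` are hypothetical names, and the check only asks whether the installed release is 1.24.14 or newer. Since 1.24.5 was reported as working earlier in this thread, the check is conservative for older releases.

```python
from __future__ import annotations

from importlib.metadata import PackageNotFoundError, version

FIXED_IN = (1, 24, 14)  # release named in this thread as containing the fix


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '1.24.13' into (1, 24, 13); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def includes_fix(installed: str) -> bool:
    """True if `installed` is at least the release that contains the fix."""
    return parse_version(installed) >= FIXED_IN


def installed_pymupdf_version() -> str | None:
    """Return the installed PyMuPDF version string, or None if not installed."""
    try:
        return version("PyMuPDF")
    except PackageNotFoundError:
        return None
```

A caller could, for example, skip the replacement-text variant of add_redact_annot() whenever `includes_fix(...)` is false for the installed version.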
