PyMuPDF-1.23.0 released
PyMuPDF-1.23.0 has been released.
Wheels for Windows, Linux and macOS, as well as the sdist, are available on pypi.org and can be installed in the usual way, for example:
python -m pip install --upgrade pymupdf
Changes in version 1.23.0 (2023-08-22)
-
Add method find_tables() to the Page object.
This allows locating tables on any supported document page, and extracting table content by cell.
-
New "rebased" implementation of PyMuPDF.
The rebased implementation is available as Python module fitz_new. It can be used as a drop-in replacement with import fitz_new as fitz.
-
Python-independent MuPDF libraries are now in a second wheel called PyMuPDFb that will be automatically installed by pip.
This is to save space on pypi.org: a full release only needs one PyMuPDFb wheel for each OS.
-
Bug fixes:
-
Other changes:
-
Dropped support for Python-3.7.
-
Fix for wrong page / annot /Contents cleaning.
We need to set pdf_filter_options::no_update to zero.
-
Added new function get_tessdata().
-
Cope with problematic /Annots arrays.
When copying page annotations in method Document.insert_pdf we previously did not check the validity of members of the /Annots array. For faulty members (like null or non-dictionary items) this could cause unnecessary exceptions. This fix implements more checks and skips such array items.
-
Additional annotation type checks.
We did not previously check the annotation type when getting or setting annotation border properties. This is now checked in accordance with MuPDF.
-
Increase fault tolerance.
Avoid exceptions in method insert_pdf() when source pages contain invalid items in the /Annots array.
-
Return empty border dict for non-applicable annots.
We were previously returning a non-empty border dictionary even for non-applicable annotation types. We now return the empty dictionary {} in these cases. This requires some corresponding changes in the annotation.update() method, namely for dashes and border width.
-
Restrict set_rect to applicable annot types.
We were insufficiently excluding non-applicable annotation types from the set_rect() method. We now let MuPDF catch unsupported annotations and return False in these cases.
-
Wrong fontsize computation in page.get_texttrace().
When computing the font size we were using the final text transformation matrix, where we should have taken span->trm instead. This is corrected here.
-
Updates to cope with changes to latest MuPDF.
pdf_lookup_anchor() has been removed.
-
Update fill_textbox to better respect rect.width.
The function norm_words in fill_textbox had a bug in its last loop, appending n+1 characters when actually measuring the width of n characters. This led to a bug in fill_textbox when writing a single word mostly composed of "wide" letters (M, m, W, w...), causing the written text to exceed the given rect. The fix was simply to replace n+1 by n.
-
Add script_focus and script_blur options to widget.
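The find_tables() method listed among the changes above could be used along the following lines. This is a minimal sketch, assuming that the object returned by page.find_tables() exposes a tables list whose entries offer an extract() method returning rows of cell content. The helper names tables_on_page and tables_in_document, and the pdf_path argument, are hypothetical and only serve the illustration.

```python
def tables_on_page(page):
    """Return the content of every table on a page as a list of row lists.

    Assumes `page` offers find_tables() and that the result has a
    `tables` list whose items provide extract().
    """
    finder = page.find_tables()
    return [table.extract() for table in finder.tables]


def tables_in_document(pdf_path):
    """Map page numbers to the tables found on each page (hypothetical helper)."""
    # PyMuPDF is imported here so tables_on_page() stays usable without it.
    import fitz

    doc = fitz.open(pdf_path)
    try:
        found = {}
        for page in doc:
            rows = tables_on_page(page)
            if rows:  # keep only pages where at least one table was located
                found[page.number] = rows
        return found
    finally:
        doc.close()
```

Calling tables_in_document with the path of a PDF would then give a dictionary keyed by page number, with one list of rows per detected table.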
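The off-by-one error described for fill_textbox can be illustrated with a simplified sketch. This is not the PyMuPDF source: split_word, its parameters, and the fixed-width measure used in the example are hypothetical stand-ins for norm_words and real font metrics. The point is only that the characters appended must be the same n characters whose width was measured.

```python
def split_word(word, max_width, text_width):
    """Split `word` into chunks whose measured width fits within max_width.

    `text_width` is any callable returning the width of a string
    (a hypothetical stand-in for real font metrics).
    """
    chunks = []
    while word:
        n = len(word)
        # Shrink the candidate prefix until it fits (always keep one character).
        while n > 1 and text_width(word[:n]) > max_width:
            n -= 1
        # Append exactly the n characters that were measured. Appending
        # word[:n + 1] here would overshoot by one character, which is
        # the kind of error the changelog entry describes.
        chunks.append(word[:n])
        word = word[n:]
    return chunks


def crude_width(text):
    """Crude metric for the example: every character is 10 units wide."""
    return 10 * len(text)


print(split_word("MMMMMMM", 30, crude_width))  # ['MMM', 'MMM', 'M']
```

With the n+1 variant, the first chunk in this example would be four characters (40 units) and would exceed the 30-unit limit.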