PyMuPDF-1.23.0 released · pymupdf/PyMuPDF

PyMuPDF-1.23.0 has been released.

Wheels for Windows, Linux and macOS, and the sdist, are available on pypi.org and can be installed in the usual way, for example:

python -m pip install --upgrade pymupdf

Changes in version 1.23.0 (2023-08-22)

Add method find_tables() to the Page object.

This allows locating tables on any supported document page, and
extracting table content by cell.
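
As a rough illustration of the new method, the sketch below collects the cell text of every table found on one page. The helper name extract_tables and its parameters are hypothetical and chosen only for this example; it assumes that Page.find_tables() returns a finder object whose tables attribute lists the detected tables, and that each table offers an extract() method returning rows of cell text.

```python
def extract_tables(pdf_path, page_number=0):
    """Return the tables on one page, each as a list of rows of cell text.

    Hypothetical helper for illustration; not part of PyMuPDF itself.
    """
    # Imported inside the function so this sketch can be loaded even
    # where PyMuPDF is not installed.
    import fitz  # PyMuPDF

    with fitz.open(pdf_path) as doc:
        page = doc[page_number]
        finder = page.find_tables()  # locate tables on this page
        # extract() yields one list per table row, one entry per cell
        return [table.extract() for table in finder.tables]
```

Calling extract_tables("report.pdf") on a file of your own (the file name here is a placeholder) should give one nested list per detected table, or an empty list if the page contains none.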

New "rebased" implementation of PyMuPDF.

The rebased implementation is available as Python module
fitz_new. It can be used as a drop-in replacement with import fitz_new as fitz.

Python-independent MuPDF libraries are now in a second wheel called
PyMuPDFb that will be automatically installed by pip.

This is to save space on pypi.org - a full release only needs one
PyMuPDFb wheel for each OS.

Bug fixes:
- Fixed #2542
- Fixed #2533
- Fixed #2537

Other changes:
- Dropped support for Python-3.7.
- Fix for wrong page / annot /Contents cleaning.
  
  We need to set pdf_filter_options::no_update to zero.
- Added new function get_tessdata().
- Cope with problematic /Annots arrays.
  
  When copying page annotations in method Document.insert_pdf we
  previously did not check the validity of members of the /Annots
  array. For faulty members (like null or non-dictionary items) this
  could cause unnecessary exceptions. This fix implements more checks
  and skips such array items.
- Additional annotation type checks.
  
  We did not previously check for annotation type when getting /
  setting annotation border properties. This is now checked in
  accordance with MuPDF.
- Increase fault tolerance.
  
  Avoid exceptions in method insert_pdf() when source pages contain
  invalid items in the /Annots array.
- Return empty border dict for applicable annots.
  
  We previously were returning a non-empty border dictionary even for
  non-applicable annotation types. We now return the empty dictionary
  {} in these cases. This requires some corresponding changes in the
  annotation .update() method, namely for dashes and border width.
- Restrict set_rect to applicable annot types.
  
  We were insufficiently excluding non-applicable annotation types
  from the set_rect() method. We now let MuPDF catch unsupported
  annotations and return False in these cases.
- Wrong fontsize computation in page.get_texttrace().
  
  When computing the font size we were using the final text
  transformation matrix, where we should have taken span->trm
  instead. This is corrected here.
- Updates to cope with changes to latest MuPDF.
  
  pdf_lookup_anchor() has been removed.
- Update fill_textbox to better respect rect.width.
  
  The function norm_words in fill_textbox had a bug in its last
  loop: it appended n+1 characters after measuring the width of only n
  characters. This caused a bug in fill_textbox when writing
  a single word mostly composed of "wide" letters (M, m, W, w, ...),
  where the written text exceeded the given rect.
  
  The fix was just to replace n+1 by n.
- Add script_focus and script_blur options to widget.
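
The off-by-one fixed in fill_textbox can be illustrated with a simplified sketch. This is not the PyMuPDF source: the function split_word, the helper char_width and its width table are hypothetical stand-ins (real code would take character widths from the font). The sketch only shows the principle of the fix, namely that the slice appended must be the same n characters whose width was measured.

```python
# Hypothetical character widths: "wide" letters count double.
WIDE_LETTERS = set("MmWw")


def char_width(ch):
    """Return an assumed width for one character (illustration only)."""
    return 2.0 if ch in WIDE_LETTERS else 1.0


def split_word(word, max_width):
    """Cut a word into pieces whose measured width is <= max_width."""
    pieces = []
    while word:
        n = len(word)
        # Shrink the prefix until its measured width fits. At least one
        # character is always kept so the loop makes progress.
        while n > 1 and sum(char_width(c) for c in word[:n]) > max_width:
            n -= 1
        # Append exactly the n characters that were measured. Slicing
        # word[:n + 1] here would reproduce the overflow described above.
        pieces.append(word[:n])
        word = word[n:]
    return pieces
```

With these assumed widths, split_word("MMMMM", 4.0) yields pieces of at most two letters each, so no piece is wider than the available 4.0 units.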
