Releases: Unstructured-IO/unstructured-inference

0.5.14

18 Aug 20:33
d162c56

  • Add TIFF test file and TIFF filetype to test_from_image_file in test_layout

0.5.13

17 Aug 04:08
de19ace

  • Fix extracted image elements being included in layout merge

0.5.12

16 Aug 21:14
ae73cf8

  • Fix a pdfminer error when using process_data_with_model

0.5.11

16 Aug 07:09
4b7276a

  • Add warning when chipper is used with < 300 DPI
  • Use None default for dpi so defaults can be properly handled upstream
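The two changes above can be pictured with a minimal sketch. It is not the library's code: `resolve_dpi`, `FALLBACK_DPI`, and the choice of fallback values are hypothetical names and numbers picked for illustration. Only the 300 DPI threshold and the `None` default come from the release note.

```python
import warnings
from typing import Optional

CHIPPER_MIN_DPI = 300  # threshold named in the release note
FALLBACK_DPI = 200     # hypothetical default, chosen only for this sketch


def resolve_dpi(model_name: str, dpi: Optional[int] = None) -> int:
    """Return the DPI to render with, warning if chipper gets a low value."""
    if dpi is None:
        # None means "the caller did not choose", so the default is decided here
        # rather than being baked into every function signature downstream.
        dpi = CHIPPER_MIN_DPI if model_name == "chipper" else FALLBACK_DPI
    if model_name == "chipper" and dpi < CHIPPER_MIN_DPI:
        warnings.warn(
            f"chipper is being used with {dpi} DPI, below {CHIPPER_MIN_DPI}",
            UserWarning,
        )
    return dpi
```

Passing `None` through every layer lets a single place decide the effective value, which is presumably what "handled upstream" refers to.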

0.5.10

11 Aug 08:18
15bbc56

  • Implement full-page OCR

0.5.9

10 Aug 00:09
203f7ab

  • Handle exceptions from Tesseract

0.5.8

09 Aug 19:42
9a53178

  • Add alternative architecture for detectron2 (but default is unchanged)
  • Updates:
    Library        From      To
    transformers   4.29.2    4.30.2
    opencv-python  4.7.0.72  4.8.0.74
    ipython        8.12.2    8.14.0
  • Cache named models that have been loaded
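As an illustration of what caching loaded models by name can look like, here is a minimal sketch. The names `_loaded_models` and `get_model`, and the `loader` callback, are hypothetical and are not taken from the library.

```python
from typing import Any, Callable, Dict

# Module-level cache keyed by model name (hypothetical structure).
_loaded_models: Dict[str, Any] = {}


def get_model(name: str, loader: Callable[[str], Any]) -> Any:
    """Load a model once per name and reuse the same object afterwards."""
    if name not in _loaded_models:
        # The expensive load happens only on the first request for this name.
        _loaded_models[name] = loader(name)
    return _loaded_models[name]
```

Repeated calls with the same name return the same object, so the loading cost is paid once per process.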

0.5.7

25 Jul 18:21
64870de
  • Hotfix to handle an issue storing images in a new directory when the PDF has no file extension

0.5.6

24 Jul 19:10
06c0057
  • Update the annotate and _get_image_array methods of PageLayout to get the image from the image_path property if the image property is None.
  • Add functionality to store pdf images for later use.
  • Add image_metadata property to PageLayout & set page.image to None to reduce memory usage.
  • Update DocumentLayout.from_file to open only one image.
  • Update load_pdf to return either Image objects or Image paths.
  • Warn users that Chipper is a beta model.
  • Expose control over DPI when converting a PDF to an image.
  • Update detectron2 version to avoid errors related to a deprecated PIL reference.

0.5.5

07 Jul 13:20
41cb7a7
  • Rename large model to chipper
  • Reduced memory usage when working on PDFs
  • Fix issue with table processing
  • Added execution providers for CUDA and TensorRT
  • Warning suppression for ONNX inference on empty pages.
  • Updates:
    Library      From     To
    ruff         0.0.270  0.0.276
    mypy         1.3.0    1.4.1
    onnxruntime  1.15.0   1.15.1
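For the CUDA and TensorRT execution providers mentioned in this release, the sketch below shows one way to choose providers from whatever is available. The provider name strings are ONNX Runtime's own. The helper `pick_providers` and the preference order are assumptions made for illustration, not the library's actual selection logic.

```python
from typing import List, Sequence

# Assumed preference order for this sketch: TensorRT, then CUDA, then CPU.
PREFERRED_PROVIDERS = [
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
]


def pick_providers(available: Sequence[str]) -> List[str]:
    """Keep the preferred providers that are available, in preference order."""
    chosen = [p for p in PREFERRED_PROVIDERS if p in available]
    # Fall back to CPU if nothing in the preference list is reported.
    return chosen or ["CPUExecutionProvider"]
```

In practice the `available` list could come from `onnxruntime.get_available_providers()`, and the result could be passed as the `providers` argument of `onnxruntime.InferenceSession`.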