Releases · Unstructured-IO/unstructured-inference · GitHub

18 Aug 20:33

Klaijan

0.5.14

Add TIFF test file and TIFF filetype to test_from_image_file in test_layout

17 Aug 04:08

cragwolfe

0.5.13

Fix extracted image elements being included in layout merge

16 Aug 21:14

qued

0.5.12

Fix a pdfminer error when using process_data_with_model

16 Aug 07:09

cragwolfe

0.5.11

Add warning when chipper is used with < 300 DPI
Use None default for dpi so defaults can be properly handled upstream
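
The two DPI items above can be illustrated with a minimal sketch. It assumes a hypothetical helper `resolve_dpi` and a hypothetical fallback value `DEFAULT_DPI`; neither name comes from the library, and only the 300 DPI threshold is taken from the release note. Passing `None` signals that the caller made no choice, so the default is applied in exactly one place.

```python
import warnings
from typing import Optional

DEFAULT_DPI = 200      # hypothetical fallback, not taken from the library
CHIPPER_MIN_DPI = 300  # threshold named in the release note


def resolve_dpi(dpi: Optional[int], model_name: Optional[str]) -> int:
    """Return the DPI to render at, warning when chipper gets < 300 DPI."""
    # None means "no explicit choice", so the default is decided here only.
    effective = DEFAULT_DPI if dpi is None else dpi
    if model_name == "chipper" and effective < CHIPPER_MIN_DPI:
        warnings.warn(
            f"chipper is being used with {effective} DPI; "
            f"{CHIPPER_MIN_DPI} DPI or higher is recommended.",
            UserWarning,
        )
    return effective
```

Calling `resolve_dpi(150, "chipper")` returns 150 and emits a `UserWarning`; `resolve_dpi(300, "chipper")` returns 300 silently.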

11 Aug 08:18

cragwolfe

0.5.10

Implement full-page OCR

10 Aug 00:09

cragwolfe

0.5.9

Handle exceptions from Tesseract

09 Aug 19:42

cragwolfe

0.5.8

Add alternative architecture for detectron2 (but default is unchanged)
Updates:

Library	From	To
transformers	4.29.2	4.30.2
opencv-python	4.7.0.72	4.8.0.74
ipython	8.12.2	8.14.0

Cache named models that have been loaded
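
Caching loaded models by name, as the last item describes, might look roughly like the sketch below. The names `_loaded_models`, `get_cached_model`, and the `loader` callback are hypothetical and only illustrate the general pattern; the library's actual implementation may differ.

```python
from typing import Callable, Dict

# Hypothetical module-level cache keyed by model name.
_loaded_models: Dict[str, object] = {}


def get_cached_model(name: str, loader: Callable[[str], object]) -> object:
    """Load a named model once and reuse the same instance afterwards."""
    if name not in _loaded_models:
        # The expensive load only happens the first time a name is requested.
        _loaded_models[name] = loader(name)
    return _loaded_models[name]
```

A second call with the same name returns the same object without invoking `loader` again, which avoids reloading weights for every document.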

25 Jul 18:21

rbiseck3

0.5.7

Hotfix: handle an issue with storing images in a new directory when the PDF has no file extension

24 Jul 19:10

rbiseck3

0.5.6

Update the annotate and _get_image_array methods of PageLayout to get the image from the image_path property if the image property is None.
Add functionality to store pdf images for later use.
Add image_metadata property to PageLayout & set page.image to None to reduce memory usage.
Update DocumentLayout.from_file to open only one image.
Update load_pdf to return either Image objects or Image paths.
Warn users that Chipper is a beta model.
Expose control over DPI when converting a PDF to an image.
Update detectron2 version to avoid errors related to a deprecated PIL reference.

07 Jul 13:20

benjats07

0.5.5

Rename large model to chipper
Reduced memory usage when working on PDFs
Fix issue with table processing
Added execution providers for CUDA and TensorRT
Warning suppression for ONNX inference on empty pages
Updates:

Library	From	To
ruff	0.0.270	0.0.276
mypy	1.3.0	1.4.1
onnxruntime	1.15.0	1.15.1
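
As an illustration of the CUDA and TensorRT execution-provider item in this release, the sketch below picks ONNX Runtime providers in a preferred order. The helper `choose_providers` and the preference order are assumptions, not the library's code; the provider names are the standard ONNX Runtime identifiers, and the list of available providers would typically come from `onnxruntime.get_available_providers()`.

```python
from typing import List, Sequence

# Assumed preference order: TensorRT first, then CUDA, then CPU as a fallback.
PREFERRED_PROVIDERS = (
    "TensorrtExecutionProvider",
    "CUDAExecutionProvider",
    "CPUExecutionProvider",
)


def choose_providers(available: Sequence[str]) -> List[str]:
    """Keep only the preferred providers that this machine reports as available."""
    return [p for p in PREFERRED_PROVIDERS if p in available]
```

The resulting list can be passed as the `providers` argument when creating an `onnxruntime.InferenceSession`, so a CPU-only machine falls back gracefully.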
