Releases: VikParuchuri/surya
Revert thread changes
The threading changes caused issues on certain devices, so they have been reverted. Will re-release after fixing.
Layout and text detection speedup
Postprocessing now overlaps with inference instead of running after it.
- 20% text detection speedup
- 30% layout speedup
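As a rough illustration of the idea (not surya's actual implementation), overlapping postprocessing with inference can be sketched with a thread pool: the main thread runs inference on the next batch while a worker postprocesses the previous one. `run_inference`, `postprocess`, and `pipeline` are hypothetical stand-ins, not surya functions.

```python
from concurrent.futures import ThreadPoolExecutor


def run_inference(batch):
    # Hypothetical stand-in for a model forward pass.
    return [x * 2 for x in batch]


def postprocess(raw):
    # Hypothetical stand-in for CPU-side postprocessing of model output.
    return sum(raw)


def pipeline(batches, max_workers=2):
    futures = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for batch in batches:
            # Inference runs on the main thread.
            raw = run_inference(batch)
            # Postprocessing is handed to a worker thread, so it can run
            # while the next batch is going through inference.
            futures.append(pool.submit(postprocess, raw))
    # Collect results in submission order.
    return [f.result() for f in futures]


results = pipeline([[1, 2], [3, 4], [5]])
print(results)
```

In a sketch like this, threads only give a real overlap when the inference call releases the GIL, which GPU-backed framework calls typically do. Whether it helps on a given device is something to measure.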
Fix table recognition bug
There was an issue where columns were not being detected properly. It is now fixed.
Fix bug with MPS and PyTorch 2.5
This caused the table recognition and OCR models to crash on MPS. The bug is now fixed.
Misc bugfixes
- Fix issue with loading from folders
- Bump pdftext version
- Fix transformers warning
v0.6.3
Bump minimum Python version to 3.10, update other packages.
Refactor cell assignment
- Move cell assignment logic into tabled, a separate library I'm creating
- Improve cell extraction from PDFs
Minor bugfixes
- Small bugfix after the table recognition release
Table recognition model release!
- Add a new table recognition model that detects rows/columns and cells
- Add benchmarks for accuracy and speed (seems to be very accurate relative to the current state-of-the-art open model)
- Improve memory efficiency of layout and text detection (hopefully no more memory leaks)
- Improve resolution handling for layout, text detection, and OCR, which should improve accuracy quite a bit
OCR v2
A new version of the OCR model with a custom architecture.
- 20% faster
- Automatic language detection, with support for optional language hints
- Better accuracy on old/noisy documents
- Basic English handwriting support (to be improved soon)