Releases: VikParuchuri/surya
Faster text detection + layout
Switched model architecture for the text detection and layout models:
- 30% faster on GPU
- 4x faster on CPU
- 12x faster on MPS (M-series Macs)
Accuracy should be about the same or slightly better, based on my benchmarks.
v0.4.14: Merge pull request #141 from VikParuchuri/dev
A new transformers version added a kwarg to the donut embeddings. Surya now handles and ignores that kwarg, and is slightly future-proofed in case this happens again.
Minor bugfixes
- Fix rotation and copy bugs
Fix image bugs
- Fix bugs with RGBA images
- Fix assert bug
- Add back in thumbnail method for resizing
- Slightly optimize segformer code
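The RGBA fix above presumably amounts to flattening the alpha channel before inference, since a model expecting 3-channel RGB input can misbehave on 4-channel data. A minimal stdlib sketch of that conversion (the white-background default is my assumption, not necessarily what surya does):

```python
def rgba_to_rgb(pixel, background=(255, 255, 255)):
    """Composite one RGBA pixel onto an opaque background.

    `background` defaults to white; the background surya actually
    composites onto is an assumption here, not from the release notes.
    """
    r, g, b, a = pixel
    alpha = a / 255.0
    return tuple(
        round(alpha * channel + (1.0 - alpha) * bg)
        for channel, bg in zip((r, g, b), background)
    )

# Fully opaque pixels pass through unchanged.
assert rgba_to_rgb((10, 20, 30, 255)) == (10, 20, 30)
# Fully transparent pixels collapse to the background color.
assert rgba_to_rgb((10, 20, 30, 0)) == (255, 255, 255)
```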
Change image resize
- Switch image resizing from cv2 to PIL - cv2 caused benchmark regressions
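The backend swap matters because different libraries use different resampling kernels by default, so identical inputs can produce different pixels after downscaling, which in turn moves model benchmark numbers. A stdlib 1-D sketch of that effect (toy filters, not the actual cv2/PIL code paths):

```python
def resize_nearest(row, new_len):
    """Nearest-neighbor downsample of a 1-D pixel row."""
    scale = len(row) / new_len
    return [row[int(i * scale)] for i in range(new_len)]

def resize_box(row, new_len):
    """Box-filter (averaging) downsample of the same row."""
    scale = len(row) / new_len
    out = []
    for i in range(new_len):
        start, stop = int(i * scale), int((i + 1) * scale)
        chunk = row[start:stop]
        out.append(round(sum(chunk) / len(chunk)))
    return out

row = [0, 255, 0, 255, 0, 255, 0, 255]
# The two filters disagree on the downsampled pixels, which is why
# swapping resize backends can shift benchmark results.
assert resize_nearest(row, 4) == [0, 0, 0, 0]
assert resize_box(row, 4) == [128, 128, 128, 128]
```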
OCR speedups
- Speed up base OCR model ~15-20%, and reduce memory usage by ~25% (can do higher batch sizes)
- Add static cache for compilation - torch.compile will result in another 15% speedup
- Other optimizations, like faster image resizing
- Bugfixes, like enabling different length language inputs for OCR (batching different docs with different languages together)
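The static cache mentioned above refers to preallocating the decoder's key/value cache at a fixed maximum length, so tensor shapes stay constant across decode steps and torch.compile can specialize the step once instead of recompiling as the cache grows. A torch-free sketch of the idea (the class and sizes are illustrative, not surya's implementation):

```python
class StaticKVCache:
    """Fixed-capacity cache: its shape never changes, only a write cursor.

    Constant shapes are what let a tracing compiler (e.g. torch.compile)
    compile the decode step a single time. Sizes here are illustrative.
    """

    def __init__(self, max_len, dim):
        self.keys = [[0.0] * dim for _ in range(max_len)]  # preallocated
        self.length = 0  # write cursor; capacity never changes

    def append(self, key_vec):
        if self.length >= len(self.keys):
            raise IndexError("static cache capacity exceeded")
        self.keys[self.length] = key_vec  # in-place write, no reallocation
        self.length += 1

    def valid(self):
        """Entries written so far (what attention would actually read)."""
        return self.keys[: self.length]

cache = StaticKVCache(max_len=4, dim=2)
cache.append([1.0, 2.0])
cache.append([3.0, 4.0])
assert len(cache.keys) == 4  # capacity fixed up front
assert cache.valid() == [[1.0, 2.0], [3.0, 4.0]]
```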
Processor improvements
- Remove unneeded format conversions
- Fix bug where only one color channel was used for OCR - results should be better now
- Speed up layout/text detection a bit
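The one-channel bug above matters because a single channel can erase contrast entirely: for example, pure red text on a white background is invisible in the red channel alone. A stdlib sketch of why using all channels helps (the luminance combine is illustrative; the actual fix may simply pass full RGB through):

```python
def r_channel_only(pixel):
    """Buggy path: keep only the red channel (the kind of thing the fix removed)."""
    return pixel[0]

def luminance(pixel):
    """Fixed path: combine all three channels (ITU-R BT.601 weights)."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)

white = (255, 255, 255)
red_text = (255, 0, 0)
# In the red channel alone, red text is indistinguishable from a white
# background; combining all channels preserves the contrast.
assert r_channel_only(red_text) == r_channel_only(white)
assert luminance(red_text) != luminance(white)
```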
OCR speedup
Cut OCR time in half. Combined with the previous release's ~15-20% speedup, OCR should now take about 40% of the time it did before.
Significant speedup for layout, line detection
- Improve CPU postprocessing for line detection and layout - cut postprocessing time to 1/3 of original
- Unpin transformers version after investigating model performance
This should result in a ~2x speedup for layout and text detection, most noticeable on GPU. I haven't fully benchmarked it, though.
Bug fixes
- Fix memory leak with layout and text detection models and large batch sizes
- Improve ordering model generation slightly