
pokotylo/ocrlayout

 
 


OCRLAYOUT Library

Provides the ability to get more meaningful text out of common OCR outputs. It manipulates the bounding boxes of lines to rebuild a page layout that approximates the human reading experience.

Problem Statement

Current OCR engine responses focus on text recall. Ocrlayout tries to go a step further by re-ordering the lines of text so that the output approaches human reading behavior.

When an image contains a lot of textual information, it becomes relevant to assemble the lines into meaningful blocks of text, which enables better downstream scenarios.

Another way to see it is as clustering the lines of text based on their positions/coordinates in the original content.
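To make the clustering idea concrete, here is a minimal sketch. It is not the algorithm ocrlayout implements; the `Line` class, the `cluster_lines` function and both thresholds are hypothetical names and values chosen for illustration. It greedily attaches a line to an existing block when the line sits just below the block's last line and is roughly left-aligned with it.

```python
from dataclasses import dataclass


@dataclass
class Line:
    """A line of OCR text with an axis-aligned bounding box (hypothetical type)."""
    text: str
    left: float
    top: float
    right: float
    bottom: float


def cluster_lines(lines, max_gap_ratio=0.6, max_indent=20.0):
    """Group lines into blocks by vertical proximity and left alignment.

    Illustrative heuristic only: the thresholds are assumptions, not values
    taken from ocrlayout.
    """
    blocks = []
    # Visit lines top to bottom, then left to right.
    for line in sorted(lines, key=lambda l: (l.top, l.left)):
        height = line.bottom - line.top
        for block in blocks:
            last = block[-1]
            gap = line.top - last.bottom
            close_below = -0.5 * height <= gap <= max_gap_ratio * height
            aligned = abs(line.left - last.left) <= max_indent
            if close_below and aligned:
                block.append(line)
                break
        else:
            # No existing block accepted the line, so it starts a new one.
            blocks.append([line])
    return blocks


# Two columns plus a detached footer line.
sample = [
    Line("A1", 0, 0, 100, 10),
    Line("B1", 200, 0, 300, 10),
    Line("A2", 0, 12, 100, 22),
    Line("B2", 200, 12, 300, 22),
    Line("C", 0, 100, 100, 110),
]
block_texts = [" ".join(l.text for l in block) for block in cluster_lines(sample)]
```

With a plain top-to-bottom sort, the two columns would be interleaved (A1, B1, A2, B2). Grouping by position keeps each column together, which is the kind of reading order the library aims for.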

More meaningful output for what?

  • Text Analytics: apply Text Analytics features such as Key Phrases or Entity Extraction with more confidence in the outcome.
  • Accessibility: any infographic comes alive, going beyond what alt text alone can offer.
  • Read Aloud feature: it becomes easier to build solutions that read an image aloud, enriching the verbal narrative of visual information.
  • Machine Translation: get more accurate MT output, as more context is retained.
  • Sentence/Paragraph Classification: for scanned images such as contracts, a more meaningful textual output lets you classify text at a granular level in terms of risk, personal clauses, or conditions.

OCR Engines Output Support

Today, BBoxHelper supports the output of:

AZURE

GOOGLE

AWS - Detect Document Text

Our goal here is not to compare the Azure, Google, and AWS Computer Vision APIs, but to provide a consistent way to output OCR text for further processing, regardless of the underlying OCR engine.
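As a rough illustration of what "consistent output regardless of engine" can mean, the sketch below maps one line from each engine into a single common shape. This is not ocrlayout's API: the functions `from_azure`, `from_google` and `from_aws` are hypothetical, and the input dictionaries are simplified assumptions about each service's JSON (an 8-number `boundingBox` for Azure, `boundingPoly.vertices` for Google, and a page-relative `Geometry.BoundingBox` for AWS).

```python
def _box(text, xs, ys):
    """Build the common record: text plus an axis-aligned bounding box."""
    return {"text": text, "left": min(xs), "top": min(ys),
            "right": max(xs), "bottom": max(ys)}


def from_azure(line):
    # Assumed shape: {"text": ..., "boundingBox": [x1, y1, x2, y2, x3, y3, x4, y4]}
    coords = line["boundingBox"]
    return _box(line["text"], coords[0::2], coords[1::2])


def from_google(annotation):
    # Assumed shape: {"description": ..., "boundingPoly": {"vertices": [{"x": .., "y": ..}]}}
    # Zero-valued coordinates may be omitted in the JSON, hence the default of 0.
    vertices = annotation["boundingPoly"]["vertices"]
    xs = [v.get("x", 0) for v in vertices]
    ys = [v.get("y", 0) for v in vertices]
    return _box(annotation["description"], xs, ys)


def from_aws(block, page_width, page_height):
    # Assumed shape: {"Text": ..., "Geometry": {"BoundingBox": {Left, Top, Width, Height}}}
    # with values expressed as ratios of the page size, so they are scaled to pixels.
    bb = block["Geometry"]["BoundingBox"]
    xs = [bb["Left"] * page_width, (bb["Left"] + bb["Width"]) * page_width]
    ys = [bb["Top"] * page_height, (bb["Top"] + bb["Height"]) * page_height]
    return _box(block["Text"], xs, ys)


azure_line = from_azure(
    {"text": "Hello", "boundingBox": [10, 20, 110, 20, 110, 40, 10, 40]})
google_line = from_google(
    {"description": "Hello",
     "boundingPoly": {"vertices": [{"y": 20}, {"x": 110, "y": 20},
                                   {"x": 110, "y": 40}, {"y": 40}]}})
aws_line = from_aws(
    {"Text": "Hello",
     "Geometry": {"BoundingBox": {"Left": 0.25, "Top": 0.25,
                                  "Width": 0.5, "Height": 0.25}}},
    page_width=1000, page_height=500)
```

Once every engine's lines share the same fields, any layout or re-ordering step can be written once and reused across engines.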

BBoxHelper - Get Started

More information on getting started can be found in the documentation of this repository.

pip install ocrlayout

Recipes

If you need more hands-on examples of how to use this library, check out our recipes.

Known Limitations

More information on known limitations.

Upcoming improvements

Contributing

This project welcomes contributions and suggestions.

This project has adopted the Microsoft Open Source Code of Conduct.

For more information see the Code of Conduct FAQ.

Disclaimer

THIS CODE IS PROVIDED AS IS WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING ANY IMPLIED WARRANTIES OF FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR NON-INFRINGEMENT.
