This repository contains the source code for the software, which provides a user-friendly interface for preprocessing text images and a fast, accurate OCR process.

Madushan98/Ocrt-ToolkIt

OCR Toolkit

A software application that uses Optical Character Recognition (OCR) combined with image pre-processing to extract text from images accurately. The tool applies techniques such as skew correction, rotation, and noise removal to enhance images before performing OCR, which makes text extraction more reliable.
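
The pre-processing idea described above can be illustrated with a small, dependency-free sketch. It is not the project's implementation (which is not shown here and would normally rely on an imaging library); the function names `median_filter`, `rotate_90_clockwise`, and `binarize` are hypothetical, and the image is assumed to be a grayscale grid of 0-255 values stored as a list of rows. Skew correction is left out because it needs interpolation that is better handled by a real imaging library.

```python
from statistics import median_low


def median_filter(image):
    """Noise removal: replace each pixel by the median of its 3x3 neighbourhood."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Collect only the neighbours that fall inside the image bounds.
            window = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            # median_low always returns a value that occurs in the window.
            row.append(median_low(window))
        result.append(row)
    return result


def rotate_90_clockwise(image):
    """Rotation: turn the image a quarter turn clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def binarize(image, threshold=128):
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[255 if pixel >= threshold else 0 for pixel in row] for row in image]


if __name__ == "__main__":
    # A white 3x3 page with a single dark speck in the middle.
    page = [[255, 255, 255], [255, 0, 255], [255, 255, 255]]
    cleaned = binarize(median_filter(page))
    print(cleaned)
```

Running the steps before OCR in this order (denoise, then binarize) is one common choice; the order used by this project is not documented here.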

Features

  • Fast and accurate OCR processing
  • User-friendly interface
  • Convert scanned PDFs into searchable and editable documents
  • Supports multiple languages
  • Ability to batch process multiple PDFs at once

Getting Started

These instructions will help you set up the OCR Toolkit on your local machine for development and testing purposes.

Prerequisites

  • Python with pip (for the backend)
  • Node.js with npm or yarn (for the frontend)

Installation

  1. Clone the repository:

  2. Install the dependencies:

  • backend: install the packages listed in requirment.txt
    pip install -r requirment.txt
  • frontend
    npm install

  3. Run the application:

  • backend
    uvicorn main:app --reload
  • frontend
    npm run dev
    # or
    yarn dev
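
The backend command `uvicorn main:app --reload` tells uvicorn to import a module named `main` and serve the ASGI callable named `app` from it, restarting the server when source files change. The project's actual `main.py` is not shown here and probably uses a web framework, so the following is only a framework-free sketch of what such a callable looks like. The response text and the `call_app` helper are assumptions made for illustration.

```python
import asyncio


async def app(scope, receive, send):
    """Minimal ASGI application: answers every HTTP request with plain text."""
    if scope["type"] != "http":
        # This sketch ignores non-HTTP scopes such as lifespan or websocket.
        return
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"OCR backend is running"})


async def call_app():
    """Hypothetical helper: call `app` directly, without starting a server."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": "GET", "path": "/"}, receive, send)
    return sent


if __name__ == "__main__":
    for message in asyncio.run(call_app()):
        print(message)
```

Saved as `main.py`, a file like this could be served with the same `uvicorn main:app --reload` command, which is a quick way to check that uvicorn itself is installed correctly before debugging the real backend.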

Contributing

We welcome contributions to this project. If you have an idea for a feature or a bug fix, please open a pull request.

License

This project is licensed under the MIT License.

