Document Image Classifier

A simple image classifier, built on the Inception model, that predicts whether a supplied image is a document or not.

Getting Started

Install the Python dependencies using pip:

pip install -r requirements.txt

Training

Add training images to the training/images directory as shown below. Because the classifier retrains an existing Inception model rather than training from scratch, a relatively small data set (~100 images for each category) can be enough to achieve great results.

├── training
│   ├── images
│   │   ├── documents [your training images]
│   │   └── random    [your training images]
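Before retraining, it can help to confirm that each category folder actually holds enough images. The following is a minimal sketch and not part of this repository: the function names `count_training_images` and `small_categories` are hypothetical, and it assumes that training images carry a `.jpg` or `.jpeg` extension.

```python
from pathlib import Path

# Assumption: training images are JPEG files with one of these extensions.
JPEG_SUFFIXES = {".jpg", ".jpeg"}


def count_training_images(root="training/images"):
    """Return {category_name: number_of_jpeg_files} for each subdirectory of root."""
    counts = {}
    root_path = Path(root)
    if not root_path.is_dir():
        return counts
    for category in sorted(p for p in root_path.iterdir() if p.is_dir()):
        counts[category.name] = sum(
            1
            for f in category.iterdir()
            if f.is_file() and f.suffix.lower() in JPEG_SUFFIXES
        )
    return counts


def small_categories(counts, minimum=100):
    """Return the sorted names of categories holding fewer than `minimum` images."""
    return sorted(name for name, n in counts.items() if n < minimum)


if __name__ == "__main__":
    counts = count_training_images()
    for name, n in counts.items():
        print(f"{name}: {n} images")
    for name in small_categories(counts):
        print(f"warning: '{name}' has fewer than 100 images")
```

The threshold of 100 mirrors the rough figure given above; it is a rule of thumb, not a hard requirement.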

Once the training images are in place, start retraining the Inception model:

scripts/training.sh

Predictions

Note that this image classifier currently only works with JPEG images.

$ python src/prediction.py <YOUR_IMAGE_URL>
> document (score = 0.99978)
> random (score = 0.00022)
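Since only JPEG images are supported, a quick check of the file signature can catch unsupported input before it reaches the classifier. This is a minimal sketch, not code from this repository: the helper names `looks_like_jpeg` and `file_looks_like_jpeg` are hypothetical. It relies on the fact that JPEG data begins with the bytes `FF D8 FF`.

```python
# JPEG data starts with the start-of-image marker (FF D8),
# followed by the first byte of the next marker (FF).
JPEG_MAGIC = b"\xff\xd8\xff"


def looks_like_jpeg(data: bytes) -> bool:
    """Return True if the given bytes start with the JPEG signature."""
    return data[:3] == JPEG_MAGIC


def file_looks_like_jpeg(path) -> bool:
    """Read the first three bytes of a local file and check the signature."""
    with open(path, "rb") as fh:
        return looks_like_jpeg(fh.read(3))
```

For an image fetched from a URL, `looks_like_jpeg` can be applied to the downloaded bytes. The check only inspects the header, so a truncated or corrupt JPEG would still pass.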

REST API (development only)

For development purposes, you can run a simple REST endpoint to serve predictions. For serious production use, something like TensorFlow Serving is highly recommended.

export FLASK_APP="api.py"
export FLASK_DEBUG=1

flask run
