Zero Shot Image Classifier is a tiny web app built with Flask and Hugging Face that lets anyone instantly classify images against their own set of labels. I built it to learn about computer vision and to see how robust CLIP is at classifying arbitrary images with arbitrary labels. A good friend of mine demo'd something like this to me a while back, before CLIP was cool, and it blew my mind; I wanted to see if I could quickly build something that worked! The app is powered by CLIP, runs entirely locally, and is straightforward to set up.
- Dynamic Label Input: Users can specify their own set of class labels for image classification. Labels should be separated by commas (see the sketch after this list).
- Image Upload: Easy upload interface for users to provide images for classification. By easy I mean bare bones.
- Immediate Results: Classification results are displayed instantly on the same page after image submission. You won't be waiting long if you have a reasonable CPU on your machine. For context, I built and ran this on a 2020 M1 MacBook Pro with 16 GB of memory.
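
Under the hood, the heavy lifting is a single CLIP forward pass via Hugging Face `transformers`. Here is a minimal sketch of that step, assuming the `openai/clip-vit-base-patch32` checkpoint and a hypothetical `classify()` helper; the actual code in `app.py` may differ:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Checkpoint choice is an assumption; any CLIP model on the Hub works the same way.
MODEL_NAME = "openai/clip-vit-base-patch32"
model = CLIPModel.from_pretrained(MODEL_NAME)
processor = CLIPProcessor.from_pretrained(MODEL_NAME)

def classify(image_file, labels):
    """Score an image (path or file-like object) against user-supplied labels."""
    image = Image.open(image_file).convert("RGB")
    inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    # logits_per_image holds image-text similarity; softmax turns it into probabilities
    probs = outputs.logits_per_image.softmax(dim=1)[0]
    return {label: float(p) for label, p in zip(labels, probs)}

# Labels arrive from the form as one comma-separated string ("example.jpg" is illustrative)
labels = [l.strip() for l in "lip balm, lipstick, lip gloss".split(",") if l.strip()]
print(classify("example.jpg", labels))
```

Because CLIP simply scores the image against whatever text you give it, nothing needs to be retrained when the label set changes.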
- Python 3.x
- Clone the repository and install packages. I recommend doing this inside a virtual environment.
```bash
git clone https://github.com/ezishiri/Zero-Shot-Image-Classifier.git
cd Zero-Shot-Image-Classifier
python3 -m venv env
source env/bin/activate
pip install flask torch transformers Pillow
```
Run the Application
```bash
python3 app.py
```
Access the web application at http://127.0.0.1:5000/.
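
If you're curious how the pieces fit together, a Flask route for this kind of app roughly looks like the sketch below. The route, form field names, and template name are illustrative assumptions and may not match what `app.py` actually defines; `classify()` is the helper sketched earlier.

```python
from flask import Flask, render_template, request

app = Flask(__name__)

@app.route("/", methods=["GET", "POST"])
def index():
    results = None
    if request.method == "POST":
        image_file = request.files["image"]  # uploaded image (field name is an assumption)
        labels = [l.strip() for l in request.form["labels"].split(",") if l.strip()]
        results = classify(image_file, labels)  # CLIP scoring as sketched above
    # template name is an assumption; results render on the same page as the form
    return render_template("index.html", results=results)

if __name__ == "__main__":
    app.run(debug=True)  # serves on http://127.0.0.1:5000/ by default
```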
The screenshots below demonstrate that CLIP can accurately distinguish between extremely similar categories: it classifies the example image as 'glossier lip gloss' when the candidate labels include lip balm, lipstick, lip gloss, lip liner, and glossier lip gloss, among others.
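
With the hypothetical `classify()` helper from the sketch above, that fine-grained comparison boils down to a single call (the filename is made up):

```python
labels = ["lip balm", "lipstick", "lip gloss", "lip liner", "glossier lip gloss"]
scores = classify("lip_product.jpg", labels)  # filename is illustrative
print(max(scores, key=scores.get))  # the example image scores highest on 'glossier lip gloss'
```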
All credit to OpenAI, and Isaac who opened me up to the cool things you can do with AI!