Image Processing CLI Tool

This tool sends images from a specified directory or a single file to a local API for analysis. You can supply a custom prompt and choose a vision-capable model to generate the image descriptions.

Features

  • Encodes and processes images using a local API.
  • Supports various image formats (e.g., JPG, PNG, BMP, TIFF, HEIC).
  • Generates image descriptions using user-defined or default prompts.
  • Logs detailed progress with categorized feedback.
  • Saves results to timestamped output files.
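
The encode-and-describe step listed above can be sketched as follows. This is a minimal illustration, not the tool's actual code: it assumes the local server is Ollama (suggested by the repository name and the port 11434) and uses Ollama's `/api/generate` endpoint, which accepts base64-encoded images. The names `build_payload` and `describe_image` and the default prompt are hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: the local server speaks the Ollama HTTP API.
API_URL = "http://localhost:11434/api/generate"


def build_payload(image_bytes: bytes, prompt: str, model: str = "llava:latest") -> dict:
    """Build the JSON request body, with the image base64-encoded."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single complete JSON response
    }


def describe_image(path: str, prompt: str = "Describe this image.") -> str:
    """Send one image to the local server and return the generated text."""
    with open(path, "rb") as f:
        payload = build_payload(f.read(), prompt)
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Calling `describe_image("data/photo.jpg")` would require the server to be running with the chosen model installed; `build_payload` can be inspected on its own without any server.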

CLI Showcase

[Image: CLI showcase screenshot]

Installation

  1. Clone the repository:
     git clone https://github.com/tristan-mcinnis/Ollama-Image-Processing-CLI-Tool.git
     cd Ollama-Image-Processing-CLI-Tool
  2. Install the required dependencies:
     pip install -r requirements.txt
  3. Ensure your local API server is running at http://localhost:11434.
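
One way to confirm the server is reachable before running the script is a quick HTTP check. The sketch below uses only the standard library; the helper name `server_is_up` is hypothetical, and it assumes the server answers a plain GET on its base URL with status 200 (an Ollama server does).

```python
import urllib.request


def server_is_up(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if the server answers a GET on its base URL with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts, and HTTP error responses,
        # all of which are OSError subclasses in urllib.
        return False
```

If `server_is_up()` returns False, start the local server and try again.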

Usage

Run the script:

python main.py

Configuration

  1. The default directory for images is ./data. Create this directory and add your images before running the script.

  2. Select a vision-capable model and customize the prompt at runtime.
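
As an illustration of how images might be collected from ./data, here is a small sketch. The extension set is taken from the formats listed under Features (with common spelling variants added), and the function name `find_images` is hypothetical; the tool's own discovery logic may differ.

```python
from pathlib import Path
from typing import List

# Assumption: formats from the Features list, plus common spelling variants.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".tif", ".heic"}


def find_images(target: str = "./data") -> List[Path]:
    """Return image files for a directory, or a one-item list for a single file."""
    path = Path(target)
    if path.is_file():
        return [path] if path.suffix.lower() in IMAGE_EXTENSIONS else []
    if not path.is_dir():
        return []  # nothing to process if the target does not exist
    return sorted(
        p for p in path.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

The extension match is case-insensitive, so `photo.PNG` and `photo.png` are both picked up.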

Supported Models

Ensure you have a vision-capable model such as llava:latest installed on your local server.
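
To check which models are installed, you could query the server directly. The sketch below assumes the server is Ollama, whose `/api/tags` endpoint returns a JSON object with a `models` list; the helper names are hypothetical.

```python
import json
import urllib.request
from typing import List


def installed_models(tags_json: str) -> List[str]:
    """Extract model names from an /api/tags response body."""
    data = json.loads(tags_json)
    return [model["name"] for model in data.get("models", [])]


def fetch_installed_models(base_url: str = "http://localhost:11434") -> List[str]:
    """Ask the local server which models it has installed."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as response:
        return installed_models(response.read().decode("utf-8"))
```

If `"llava:latest"` is not in the list returned by `fetch_installed_models()`, install it on the server first.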

Output

Processed results are saved in the ./outputs directory with timestamped filenames.
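
A timestamped output path can be built along these lines. The `results_` prefix, the `.txt` extension, and the timestamp format are assumptions for illustration; the filenames the tool actually writes may look different.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def output_path(output_dir: str = "./outputs", now: Optional[datetime] = None) -> Path:
    """Create the output directory if needed and return a timestamped file path."""
    now = now or datetime.now()
    directory = Path(output_dir)
    directory.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: results_YYYYMMDD_HHMMSS.txt
    return directory / f"results_{now.strftime('%Y%m%d_%H%M%S')}.txt"
```

Putting the timestamp in year-month-day order keeps the files sorted chronologically in a directory listing.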

Requirements

  • Python 3.8+
  • Local API server running at http://localhost:11434

License

MIT License