This tool processes images from a specified directory or file by sending them to a local API for image analysis. Users can supply custom prompts and choose a vision-capable model to generate image descriptions.
- Encodes and processes images using a local API.
- Supports various image formats (e.g., JPG, PNG, BMP, TIFF, HEIC).
- Generates image descriptions using user-defined or default prompts.
- Logs detailed progress with categorized feedback.
- Saves results to timestamped output files.
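The encode-and-describe flow listed above can be sketched as follows. This is a minimal illustration, not the code in main.py. It assumes the local server is Ollama (as the repository name and port suggest) and that images are posted to its `/api/generate` endpoint. The function names `build_payload` and `describe_image` and the default prompt are hypothetical.

```python
import base64
import json
import urllib.request

# Assumption: the local API is Ollama's generate endpoint on the default port.
API_URL = "http://localhost:11434/api/generate"


def build_payload(image_bytes: bytes, prompt: str, model: str = "llava:latest") -> dict:
    """Build a non-streaming request body carrying one base64-encoded image."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def describe_image(path: str, prompt: str = "Describe this image.") -> str:
    """Send one image file to the local server and return the generated text."""
    with open(path, "rb") as f:
        payload = build_payload(f.read(), prompt)
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With the server running, `describe_image("./data/photo.jpg")` would return the model's description as a string; the file name here is only an example.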
- Clone the repository:
git clone https://github.com/tristan-mcinnis/Ollama-Image-Processing-CLI-Tool.git
cd Ollama-Image-Processing-CLI-Tool
- Install the required dependencies:
pip install -r requirements.txt
- Ensure your local API server is running at http://localhost:11434.
Run the script:
python main.py
- The default directory for images is ./data. Create this directory and add your images before running the script.
- Select a vision-capable model and customize the prompt during runtime.
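As a rough sketch of how images might be collected from the default directory, the helper below filters files by extension. The name `find_images` and the exact extension set are assumptions based on the formats listed earlier; the script itself may accept more or fewer.

```python
from pathlib import Path

# Assumed extension set, derived from the formats this README mentions.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tiff", ".heic"}


def find_images(directory: str = "./data") -> list:
    """Return image files directly inside `directory`, sorted by path."""
    root = Path(directory)
    if not root.is_dir():
        # A missing directory yields no images rather than an exception.
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )
```

The extension check is case-insensitive, so `photo.JPG` and `photo.jpg` are both picked up.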
Ensure you have a vision-capable model like llava:latest installed on your local server.
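One way to confirm that a model is installed before running the script is to query the server for its model list. The sketch below assumes the server is Ollama and uses its `/api/tags` endpoint; `fetch_tags` and `model_names` are hypothetical helper names.

```python
import json
import urllib.request


def model_names(tags: dict) -> list:
    """Extract model names from a parsed /api/tags response."""
    return [m["name"] for m in tags.get("models", [])]


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Fetch the list of locally installed models from the server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as response:
        return json.loads(response.read())
```

With the server running, `"llava:latest" in model_names(fetch_tags())` reports whether that model is available. If the server is not reachable, `fetch_tags` raises `urllib.error.URLError`.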
Processed results are saved in the ./outputs directory with timestamped filenames.
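A timestamped output path could be built along these lines. The `results_` prefix, the `.txt` extension, and the timestamp layout are assumptions made for illustration; the script's actual naming scheme may differ.

```python
from datetime import datetime
from pathlib import Path


def output_path(outputs_dir: str = "./outputs", now: datetime = None) -> Path:
    """Return a timestamped file path inside `outputs_dir`, creating the directory."""
    now = now or datetime.now()
    directory = Path(outputs_dir)
    directory.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: results_YYYYMMDD_HHMMSS.txt
    return directory / f"results_{now:%Y%m%d_%H%M%S}.txt"
```

Including seconds in the timestamp keeps consecutive runs from overwriting each other's results.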
- Python 3.8+
- Local API server running at http://localhost:11434
MIT License