[Feat] Ollama Image API Support #11

reyna-abhyankar · 2024-12-06T05:37:33Z

We currently support the OpenAI Vision API, in which messages look like this:

messages=[
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What’s in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
          },
        },
      ],
    }
  ],

However, Ollama supports only local image paths or Base64-encoded images, and seems to break unless queried like so:

messages=[
    {
      'role': 'user',
      'content': "What's in this image?",
      'images': [path],
    }
  ],
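For the Base64 route, a minimal sketch of building a message in the flat shape above might look like the following. The helper names `encode_image` and `build_ollama_message` are hypothetical (they are not part of Ollama or LiteLLM), and the sketch assumes the `images` list accepts plain Base64 strings with no `data:` prefix.

```python
import base64
from pathlib import Path


def encode_image(path):
    """Read a local image file and return its bytes as a Base64 string."""
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def build_ollama_message(prompt, image_paths):
    """Build a flat user message (hypothetical helper, not a library API).

    Assumption: each entry in `images` is a bare Base64 string.
    """
    return {
        "role": "user",
        "content": prompt,
        "images": [encode_image(p) for p in image_paths],
    }
```

Encoding up front avoids depending on the server being able to read the same filesystem path as the client.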

PR 5208 has been merged in Ollama and should resolve the issue of content being an array instead of a string. However, PR 6680 is currently open for LiteLLM to fix the exact unmarshalling error referenced in #10, so the problem could be the way LiteLLM queries Ollama. If that PR gets merged, we might not need to do anything. Otherwise, we could implement essentially the same fix on our end: flatten the content array, add an images key, and potentially raise a more explicit error for web images.
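A rough sketch of what that fallback could look like on our end is below. It is not the code from either PR: `flatten_openai_message` is a hypothetical name, and the sketch assumes text parts can be joined with newlines, that `data:` URLs carry a Base64 payload after the comma, and that any other non-web URL is a local path.

```python
def flatten_openai_message(message):
    """Convert an OpenAI Vision-style message to the flat Ollama-style shape.

    Hypothetical helper; the joining and URL-handling rules are assumptions.
    """
    content = message.get("content")
    if isinstance(content, str):
        return dict(message)  # already flat, return a shallow copy

    text_parts = []
    images = []
    for part in content:
        kind = part.get("type")
        if kind == "text":
            text_parts.append(part["text"])
        elif kind == "image_url":
            url = part["image_url"]["url"]
            if url.startswith(("http://", "https://")):
                # Explicit error instead of an opaque unmarshalling failure
                raise ValueError(
                    "Web image URLs are not supported for Ollama models; "
                    "pass a local path or a Base64-encoded image instead."
                )
            if url.startswith("data:"):
                # Keep only the Base64 payload after the comma
                header, _, payload = url.partition(",")
                if ";base64" not in header:
                    raise ValueError("Only Base64 data URLs are supported.")
                images.append(payload)
            else:
                images.append(url)  # assumed to be a local path
        else:
            raise ValueError(f"Unsupported content part type: {kind!r}")

    flat = {"role": message["role"], "content": "\n".join(text_parts)}
    if images:
        flat["images"] = images
    return flat
```

Messages whose content is already a string pass through unchanged, so the helper could be applied to every message before it is sent.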

reyna-abhyankar added the local-model label Dec 6, 2024

reyna-abhyankar self-assigned this Dec 6, 2024

reyna-abhyankar mentioned this issue Dec 6, 2024

[Fix] Add ollama message API support for text #12

Merged

3 tasks

reyna-abhyankar added the frontend label Jan 8, 2025
