[Bug]: IBM Granite 3.1 tool parser fails #11402

K-Mistele · 2024-12-22T02:03:00Z

Your current environment

Not relevant - running with docker vllm/vllm-openai:v0.6.5

Model Input Dumps

No response

🐛 Describe the bug

The granite tool parser (--tool-call-parser granite) does not seem to be working for IBM Granite 3.1.

Note that this is not related to the existing streaming-related bugs: stream is set to false, and temperature=0 has also been set to maximize reproducibility.

vLLM Run configuration via docker-compose:

  vllm-9-granite31:
    image: vllm/vllm-openai:v0.6.5
    entrypoint: [
      "vllm", "serve", "ibm-granite/granite-3.1-8b-instruct",
      "--enable-auto-tool-choice", "--enable-chunked-prefill", "--enable-prefix-caching",
      "--gpu-memory-utilization", "0.98",
      "--max-model-len", "32768",
      "--tool-call-parser", "granite",
      #"--max-num-batched-tokens", "8096",
      #"--max-num-seqs", "256",
      "--num-scheduler-steps", "1"
    ]

Note that both --enable-auto-tool-choice and --tool-call-parser granite are set. Per the docs on IBM Granite tool calling, this should work.

Example request and response:

{
  "model": "ibm-granite/granite-3.1-8b-instruct",
  "messages":[
    {
      "role": "user",
      "content": "What is the weather in Dallas Texas?"
    }
  ],
  "stream": false,
  "temperature": 0,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {
              "type": "string",
              "description": "The city to find the weather for, e.g. 'San Francisco'"
            },
            "state": {
              "type": "string",
              "description": "the two-letter abbreviation for the state that the city is in, e.g. 'CA' which would mean 'California'"
            },
            "unit": {
              "type": "string",
              "description": "The unit to fetch the temperature in",
              "enum": [
                "celsius",
                "fahrenheit"
              ]
            }
          },
          "required": ["city", "state"]
        }
      }
    }
  ]
}
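For anyone reproducing this, the request above can be sent with a few lines of Python. This is a minimal sketch using only the standard library; the base URL `http://localhost:8000` and the helper names `build_chat_request` and `send_chat_request` are assumptions for illustration and are not part of the original report.

```python
import json
import urllib.request

# Assumption: the server from the docker-compose service is reachable here.
BASE_URL = "http://localhost:8000"

# The same request body as in the report.
PAYLOAD = {
    "model": "ibm-granite/granite-3.1-8b-instruct",
    "messages": [
        {"role": "user", "content": "What is the weather in Dallas Texas?"}
    ],
    "stream": False,
    "temperature": 0,
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {
                            "type": "string",
                            "description": "The city to find the weather for, e.g. 'San Francisco'",
                        },
                        "state": {
                            "type": "string",
                            "description": "the two-letter abbreviation for the state that the city is in, e.g. 'CA' which would mean 'California'",
                        },
                        "unit": {
                            "type": "string",
                            "description": "The unit to fetch the temperature in",
                            "enum": ["celsius", "fahrenheit"],
                        },
                    },
                    "required": ["city", "state"],
                },
            },
        }
    ],
}


def build_chat_request(payload: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions route."""
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(payload: dict, base_url: str = BASE_URL, timeout: float = 60.0) -> dict:
    """Send the request and return the decoded JSON response (needs a running server)."""
    request = build_chat_request(payload, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, `send_chat_request(PAYLOAD)["choices"][0]["message"]` is the object to inspect: the bug shows up as `tool_calls` being null while `content` holds the raw `<tool_call>` text.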

Output:

{
  "id": "chatcmpl-bea4219d4bf84b08b62c1bedd9de29a4",
  "created": 1734832567,
  "model": "ibm-granite/granite-3.1-8b-instruct",
  "object": "chat.completion",
  "system_fingerprint": null,
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "<tool_call>[{\"arguments\": {\"city\": \"Dallas\", \"state\": \"TX\", \"unit\": \"celsius\"}, \"name\": \"get_current_weather\"}]",
        "role": "assistant",
        "tool_calls": null,
        "function_call": null
      }
    }
  ],
  "usage": {
    "completion_tokens": 41,
    "prompt_tokens": 343,
    "total_tokens": 384,
    "completion_tokens_details": null,
    "prompt_tokens_details": null
  },
  "service_tier": null,
  "prompt_logprobs": null
}

Expected output: a valid tool call completion, for example:

{
  "id": "chatcmpl-d4715468ea864d41ad79072727bc61aa",
  "created": 1734832681,
  "model": "NousResearch/Hermes-3-Llama-3.1-8B",
  "object": "chat.completion",
  "system_fingerprint": null,
  "choices": [
    {
      "finish_reason": "tool_calls",
      "index": 0,
      "message": {
        "content": null,
        "role": "assistant",
        "tool_calls": [
          {
            "function": {
              "arguments": "{\"city\": \"Dallas\", \"state\": \"TX\", \"unit\": \"fahrenheit\"}",
              "name": "get_current_weather"
            },
            "id": "chatcmpl-tool-e33dff3a0d5445caac4ef3895f89ac24",
            "type": "function"
          }
        ],
        "function_call": null
      }
    }
  ],
  "usage": {
    "completion_tokens": 34,
    "prompt_tokens": 424,
    "total_tokens": 458,
    "completion_tokens_details": null,
    "prompt_tokens_details": null
  },
  "service_tier": null,
  "prompt_logprobs": null
}
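To make the gap between the two responses concrete, here is a minimal sketch of the conversion the parser is expected to perform: turning the raw `<tool_call>[...]` text from `content` into an OpenAI-style `tool_calls` list. This is not vLLM's granite parser; the function name `extract_tool_calls_sketch` and the generated id format are assumptions for illustration only.

```python
import json
import uuid

TOOL_CALL_PREFIX = "<tool_call>"


def extract_tool_calls_sketch(content: str):
    """Return a list of OpenAI-style tool calls, or None if content is not a tool call."""
    text = content.strip()
    if not text.startswith(TOOL_CALL_PREFIX):
        return None
    try:
        # Everything after the prefix is expected to be a JSON array of calls.
        raw_calls = json.loads(text[len(TOOL_CALL_PREFIX):])
    except json.JSONDecodeError:
        return None
    if not isinstance(raw_calls, list):
        return None

    tool_calls = []
    for call in raw_calls:
        if not isinstance(call, dict) or "name" not in call:
            return None
        tool_calls.append(
            {
                # Hypothetical id format, modeled on the expected output above.
                "id": f"chatcmpl-tool-{uuid.uuid4().hex}",
                "type": "function",
                "function": {
                    "name": call["name"],
                    # The OpenAI format carries arguments as a JSON string.
                    "arguments": json.dumps(call.get("arguments", {})),
                },
            }
        )
    return tool_calls


# The content string from the failing response in this report.
content = (
    '<tool_call>[{"arguments": {"city": "Dallas", "state": "TX", '
    '"unit": "celsius"}, "name": "get_current_weather"}]'
)
calls = extract_tool_calls_sketch(content)
```

A client stuck on v0.6.5 could apply something like this as a temporary workaround, though the server-side parser is the right place for the fix.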

Seems possibly related to #11039, #11069, and #11307

cc @maxdebayser @tjohnson31415

yumc2573 · 2024-12-23T10:31:56Z

+1
I am using the qwen2.5-32b-instruct model.
Output:
"choices": [
  {
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "<tool_call>\n{"name": "recommendProduceListToUser", "arguments": {"items": [{"linkId": "1001", "name": "xx", "price": 3.5}, {"linkId": "1002", "name": "xx", "price": 3.5}, {"linkId": "1003"name": "xx", "price": 4.5}, {"linkId": "1004", "name": "xx", "price": 6.5}, {"linkId": "1005", "name": "xx", "price": 7}]}\n</tool_call>",
      "tool_calls": []
    },
    "finish_reason": "stop"
  }
]
Server log:
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] Error in extracting tool call from response.
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] Traceback (most recent call last):
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/tool_parsers/hermes_tool_parser.py", line 85, in extract_tool_calls
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] json.loads(match[0] if match[0] else match[1])
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] File "/usr/lib/python3.12/json/__init__.py", line 346, in loads
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] return _default_decoder.decode(s)
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] ^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] File "/usr/lib/python3.12/json/decoder.py", line 338, in decode
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] obj, end = self.raw_decode(s, idx=_w(s, 0).end())
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] File "/usr/lib/python3.12/json/decoder.py", line 354, in raw_decode
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] obj, end = self.scan_once(s, idx)
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] ^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 2 column 2 (char 2)
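As an aid to reading the last log line: the position in a `JSONDecodeError` is a zero-based character offset, so "line 2 column 2 (char 2)" means the decoder failed immediately after an opening brace that followed a leading newline. The sketch below reproduces the same position with a hypothetical input; that string is an assumption chosen only to show how the position is reported, and it is not the actual text the parser received.

```python
import json

# Hypothetical input, NOT the real model output: a newline, an opening brace,
# then a character that is not the double quote that must start a key.
hypothetical_text = '\n{{"name": "f"}'

position = None
try:
    json.loads(hypothetical_text)
except json.JSONDecodeError as err:
    # lineno and colno are 1-based; pos is the 0-based character offset.
    position = (err.lineno, err.colno, err.pos)

print(position)
```

Logging the exact string passed to `json.loads` (for example with `repr()`) would show what the decoder actually saw at that offset.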

tjohnson31415 · 2024-12-23T17:06:31Z

@K-Mistele Thanks for trying out the new model and looking into this!

The fix for parsing the <tool_call> token came in PR #11307, which you linked to. That PR was merged the day after the v0.6.5 release of vLLM, so you would need to use the latest code. I just checked your example request and it worked for me on the latest code.

K-Mistele · 2024-12-23T21:26:34Z

    image: vllm/vllm-openai:v0.6.5

"You will need to use the latest code"

I.e., build from main? Or is there a -post release?

K-Mistele added the bug label on Dec 22, 2024.