[Bug]: IBM Granite 3.1 tool parser fails #11402

Open
1 task done
K-Mistele opened this issue Dec 22, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@K-Mistele
Contributor

Your current environment

Not relevant - running with docker vllm/vllm-openai:v0.6.5

Model Input Dumps

No response

🐛 Describe the bug

The granite tool parser (--tool-call-parser granite) does not seem to be working for IBM Granite 3.1.

Note that this is not related to the existing streaming-related bugs: stream is set to false, and temperature=0 has also been set to maximize reproducibility.

vLLM Run configuration via docker-compose:

  vllm-9-granite31:
    image: vllm/vllm-openai:v0.6.5
    entrypoint: [
      "vllm", "serve", "ibm-granite/granite-3.1-8b-instruct",
      "--enable-auto-tool-choice", "--enable-chunked-prefill", "--enable-prefix-caching",
      "--gpu-memory-utilization", "0.98",
      "--max-model-len", "32768",
      "--tool-call-parser", "granite",
      #"--max-num-batched-tokens", "8096",
      #"--max-num-seqs", "256",
      "--num-scheduler-steps", "1"
    ]

Note that both --enable-auto-tool-choice and --tool-call-parser granite are set. Per the docs on IBM Granite tool calling, this should work.

Example request and response:

{
  "model": "ibm-granite/granite-3.1-8b-instruct",
  "messages":[
    {
      "role": "user",
      "content": "What is the weather in Dallas Texas?"
    }
  ],
  "stream": false,
  "temperature": 0,
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {
              "type": "string",
              "description": "The city to find the weather for, e.g. 'San Francisco'"
            },
            "state": {
              "type": "string",
              "description": "the two-letter abbreviation for the state that the city is in, e.g. 'CA' which would mean 'California'"
            },
            "unit": {
              "type": "string",
              "description": "The unit to fetch the temperature in",
              "enum": [
                "celsius",
                "fahrenheit"
              ]
            }
          },
          "required": ["city", "state"]
        }
      }
    }
  ]
}

Output:

{
  "id": "chatcmpl-bea4219d4bf84b08b62c1bedd9de29a4",
  "created": 1734832567,
  "model": "ibm-granite/granite-3.1-8b-instruct",
  "object": "chat.completion",
  "system_fingerprint": null,
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "<tool_call>[{\"arguments\": {\"city\": \"Dallas\", \"state\": \"TX\", \"unit\": \"celsius\"}, \"name\": \"get_current_weather\"}]",
        "role": "assistant",
        "tool_calls": null,
        "function_call": null
      }
    }
  ],
  "usage": {
    "completion_tokens": 41,
    "prompt_tokens": 343,
    "total_tokens": 384,
    "completion_tokens_details": null,
    "prompt_tokens_details": null
  },
  "service_tier": null,
  "prompt_logprobs": null
}

Expected output: a valid tool call completion, for example:

{
  "id": "chatcmpl-d4715468ea864d41ad79072727bc61aa",
  "created": 1734832681,
  "model": "NousResearch/Hermes-3-Llama-3.1-8B",
  "object": "chat.completion",
  "system_fingerprint": null,
  "choices": [
    {
      "finish_reason": "tool_calls",
      "index": 0,
      "message": {
        "content": null,
        "role": "assistant",
        "tool_calls": [
          {
            "function": {
              "arguments": "{\"city\": \"Dallas\", \"state\": \"TX\", \"unit\": \"fahrenheit\"}",
              "name": "get_current_weather"
            },
            "id": "chatcmpl-tool-e33dff3a0d5445caac4ef3895f89ac24",
            "type": "function"
          }
        ],
        "function_call": null
      }
    }
  ],
  "usage": {
    "completion_tokens": 34,
    "prompt_tokens": 424,
    "total_tokens": 458,
    "completion_tokens_details": null,
    "prompt_tokens_details": null
  },
  "service_tier": null,
  "prompt_logprobs": null
}
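
For anyone stuck on v0.6.5 in the meantime, the difference between the two responses above can be papered over on the client side. The following is only a minimal sketch, not how the vLLM granite parser works: it assumes the raw content is the `<tool_call>` marker followed by a JSON list, exactly as in the Granite response above. The helper name `extract_tool_calls` and the `call-` id format are made up for illustration.

```python
import json
import uuid

# Marker observed at the start of "content" in the Granite 3.1 response above.
TOOL_CALL_PREFIX = "<tool_call>"


def extract_tool_calls(message: dict) -> list:
    """Return OpenAI-style tool calls, recovering them from `content` if needed.

    Hypothetical client-side helper; not part of vLLM.
    """
    # If the server already parsed the tool calls, use them as-is.
    if message.get("tool_calls"):
        return message["tool_calls"]

    content = (message.get("content") or "").strip()
    if not content.startswith(TOOL_CALL_PREFIX):
        return []

    try:
        raw_calls = json.loads(content[len(TOOL_CALL_PREFIX):])
    except json.JSONDecodeError:
        return []

    # Accept a single object as well as a list of objects.
    if isinstance(raw_calls, dict):
        raw_calls = [raw_calls]
    if not isinstance(raw_calls, list):
        return []

    return [
        {
            "id": f"call-{uuid.uuid4().hex}",  # assumed id format, illustration only
            "type": "function",
            "function": {
                "name": call["name"],
                # The OpenAI format carries arguments as a JSON string.
                "arguments": json.dumps(call.get("arguments", {})),
            },
        }
        for call in raw_calls
        if isinstance(call, dict) and "name" in call
    ]


# The assistant message from the failing response above.
granite_message = {
    "content": '<tool_call>[{"arguments": {"city": "Dallas", "state": "TX", '
               '"unit": "celsius"}, "name": "get_current_weather"}]',
    "role": "assistant",
    "tool_calls": None,
}

calls = extract_tool_calls(granite_message)
print(calls[0]["function"]["name"])
```

This only covers the non-streaming case shown here; a server-side fix is still the right answer.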

Seems possibly related to #11039, #11069, and #11307

cc @maxdebayser @tjohnson31415

@yumc2573

+1
I am using the model qwen2.5-32b-instruct.
Output:
"choices": [
  {
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "<tool_call>\n{"name": "recommendProduceListToUser", "arguments": {"items": [{"linkId": "1001", "name": "xx", "price": 3.5}, {"linkId": "1002", "name": "xx", "price": 3.5}, {"linkId": "1003"name": "xx", "price": 4.5}, {"linkId": "1004", "name": "xx", "price": 6.5}, {"linkId": "1005", "name": "xx", "price": 7}]}\n</tool_call>",
      "tool_calls": []
    },
    "finish_reason": "stop"
  }
]
Server log:
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] Error in extracting tool call from response.
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] Traceback (most recent call last):
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/tool_parsers/hermes_tool_parser.py", line 85, in extract_tool_calls
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] json.loads(match[0] if match[0] else match[1])
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] File "/usr/lib/python3.12/json/__init__.py", line 346, in loads
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] return _default_decoder.decode(s)
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] ^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] File "/usr/lib/python3.12/json/decoder.py", line 338, in decode
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] obj, end = self.raw_decode(s, idx=_w(s, 0).end())
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] File "/usr/lib/python3.12/json/decoder.py", line 354, in raw_decode
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] obj, end = self.scan_once(s, idx)
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] ^^^^^^^^^^^^^^^^^^^^^^
ERROR 12-23 18:04:26 hermes_tool_parser.py:107] json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 2 column 2 (char 2)
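
The position in that last line narrows things down a little: "line 2 column 2 (char 2)" means the text handed to json.loads began with a newline and an opening brace, and the very next character was not a double quote. The raw model output is not shown in this thread, so the payload below is purely hypothetical; a doubled opening brace is just one input that reproduces the same position.

```python
import json

# Hypothetical payload (an assumption, not the actual model output):
# a newline, then a doubled opening brace where a quoted key should start.
payload = '\n{{"name": "recommendProduceListToUser", "arguments": {}}}'

error = None
try:
    json.loads(payload)
except json.JSONDecodeError as exc:
    error = exc
    # Report where the decoder gave up: line, column, absolute offset.
    print(exc.lineno, exc.colno, exc.pos)
```

Logging the exact string that is passed to json.loads on the server would confirm or rule this out.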

@tjohnson31415
Contributor

@K-Mistele Thanks for trying out the new model and looking into this!

The fix for parsing the <tool_call> token came in PR #11307, which you linked to. That PR was merged the day after the v0.6.5 release of vLLM, so you would need to use the latest code. I just checked with your example request, and it worked for me using latest.

@K-Mistele
Contributor Author

    image: vllm/vllm-openai:v0.6.5

You will need to use the latest code
I.e., build from main? Or is there a -post release?
