Commit: Update docs
Signed-off-by: DarkLight1337 <[email protected]>
DarkLight1337 committed Dec 16, 2024
1 parent 3fb8b52 commit b4e5eb9
Showing 3 changed files with 29 additions and 23 deletions.
docs/source/serving/openai_compatible_server.md (10 changes: 5 additions & 5 deletions)
@@ -34,11 +34,6 @@ We currently support the following OpenAI APIs:
   - *Note: `suffix` parameter is not supported.*
 - [Chat Completions API](#chat-api) (`/v1/chat/completions`)
   - Only applicable to [text generation models](../models/generative_models.rst) (`--task generate`) with a [chat template](#chat-template).
-  - [Vision](https://platform.openai.com/docs/guides/vision)-related parameters are supported; see [Multimodal Inputs](../usage/multimodal_inputs.rst).
-    - *Note: `image_url.detail` parameter is not supported.*
-  - We support two audio content types.
-    - Support `input_audio` content type as defined [here](https://platform.openai.com/docs/guides/audio?audio-generation-quickstart-example=audio-in).
-    - Support `audio_url` content type for audio files. Refer to [here](https://github.com/vllm-project/vllm/tree/main/vllm/entrypoints/chat_utils.py#L51) for the exact schema.
   - *Note: `parallel_tool_calls` and `user` parameters are ignored.*
 - [Embeddings API](#embeddings-api) (`/v1/embeddings`)
   - Only applicable to [embedding models](../models/pooling_models.rst) (`--task embed`).
@@ -209,6 +204,11 @@ The following extra parameters are supported:

 Refer to [OpenAI's API reference](https://platform.openai.com/docs/api-reference/chat) for more details.
 
+We support both [Vision](https://platform.openai.com/docs/guides/vision)- and
+[Audio](https://platform.openai.com/docs/guides/audio?audio-generation-quickstart-example=audio-in)-related parameters;
+see our [Multimodal Inputs](../usage/multimodal_inputs.rst) guide for more information.
+- *Note: `image_url.detail` parameter is not supported.*
+
 #### Extra parameters
 
 The following [sampling parameters (click through to see documentation)](../dev/sampling_params.rst) are supported.
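Note: a minimal client-side request exercising these Vision parameters might look like the following sketch. The server URL and model name are placeholder assumptions (not part of this commit), and `image_url.detail` is omitted because the docs above note it is unsupported.

```python
from openai import OpenAI

# Assumed local vLLM deployment; adjust the URL and model to your setup.
client = OpenAI(api_key="EMPTY", base_url="http://localhost:8000/v1")

chat_completion = client.chat.completions.create(
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            # Only `url` is set; vLLM does not support `image_url.detail`.
            {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}},
        ],
    }],
    model="llava-hf/llava-1.5-7b-hf",  # assumed vision-capable model
    max_completion_tokens=64,
)
print(chat_completion.choices[0].message.content)
```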
docs/source/usage/multimodal_inputs.rst (4 changes: 4 additions & 0 deletions)
@@ -376,6 +376,10 @@ Then, you can use the OpenAI client as follows:
     result = chat_completion_from_base64.choices[0].message.content
     print("Chat completion output from input audio:", result)
 
+Alternatively, you can pass :code:`audio_url`, which is the audio counterpart of :code:`image_url` for image input:
+
+.. code-block:: python
+
     chat_completion_from_url = client.chat.completions.create(
         messages=[{
             "role": "user",
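Note: the added code block is truncated in this view. Judging from the `audio_url` payload in the example script changed below, it presumably continues along these lines, reusing the guide's `client`, `model`, and `audio_url` names:

```python
chat_completion_from_url = client.chat.completions.create(
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this audio?"},
            {
                "type": "audio_url",
                # Any format supported by librosa is supported
                "audio_url": {"url": audio_url},
            },
        ],
    }],
    model=model,
    max_completion_tokens=64,
)
result = chat_completion_from_url.choices[0].message.content
print("Chat completion output from audio url:", result)
```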
examples/openai_chat_completion_client_for_multimodal.py (38 changes: 20 additions & 18 deletions)
@@ -153,11 +153,11 @@ def run_multi_image() -> None:

 # Audio input inference
 def run_audio() -> None:
-    # Any format supported by librosa is supported
     audio_url = AudioAsset("winning_call").url
+    audio_base64 = encode_base64_content_from_url(audio_url)
 
-    # Use audio url in the payload
-    chat_completion_from_url = client.chat.completions.create(
+    # OpenAI-compatible schema (`input_audio`)
+    chat_completion_from_base64 = client.chat.completions.create(
         messages=[{
             "role":
             "user",
@@ -167,9 +167,11 @@ def run_audio() -> None:
"text": "What's in this audio?"
},
{
"type": "audio_url",
"audio_url": {
"url": audio_url
"type": "input_audio",
"input_audio": {
# Any format supported by librosa is supported
"data": audio_base64,
"format": "wav"
},
},
],
@@ -178,11 +180,11 @@ def run_audio() -> None:
         max_completion_tokens=64,
     )
 
-    result = chat_completion_from_url.choices[0].message.content
-    print("Chat completion output from audio url:", result)
+    result = chat_completion_from_base64.choices[0].message.content
+    print("Chat completion output from input audio:", result)
 
-    audio_base64 = encode_base64_content_from_url(audio_url)
-    chat_completion_from_base64 = client.chat.completions.create(
+    # HTTP URL
+    chat_completion_from_url = client.chat.completions.create(
         messages=[{
             "role":
             "user",
@@ -195,7 +197,7 @@ def run_audio() -> None:
"type": "audio_url",
"audio_url": {
# Any format supported by librosa is supported
"url": f"data:audio/ogg;base64,{audio_base64}"
"url": audio_url
},
},
],
@@ -204,9 +206,10 @@ def run_audio() -> None:
         max_completion_tokens=64,
     )
 
-    result = chat_completion_from_base64.choices[0].message.content
-    print("Chat completion output from base64 encoded audio:", result)
+    result = chat_completion_from_url.choices[0].message.content
+    print("Chat completion output from audio url:", result)
 
+    # base64 URL
    chat_completion_from_base64 = client.chat.completions.create(
         messages=[{
             "role":
@@ -217,11 +220,10 @@ def run_audio() -> None:
"text": "What's in this audio?"
},
{
"type": "input_audio",
"input_audio": {
"type": "audio_url",
"audio_url": {
# Any format supported by librosa is supported
"data": audio_base64,
"format": "wav"
"url": f"data:audio/ogg;base64,{audio_base64}"
},
},
],
@@ -231,7 +233,7 @@ def run_audio() -> None:
     )
 
     result = chat_completion_from_base64.choices[0].message.content
-    print("Chat completion output from input audio:", result)
+    print("Chat completion output from base64 encoded audio:", result)
 
 
 example_function_map = {
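Note: `encode_base64_content_from_url` is used throughout `run_audio`, but its body lies outside the hunks shown here. A plausible sketch of such a helper (an assumption, not the file's actual implementation) would be:

```python
import base64

import requests


def encode_base64_content_from_url(content_url: str) -> str:
    """Fetch remote content and return it base64-encoded as a UTF-8 string."""
    response = requests.get(content_url)  # download the asset over HTTP
    response.raise_for_status()
    return base64.b64encode(response.content).decode("utf-8")
```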
