[VLM] Merged multimodal processor for Qwen2-Audio #11303

DarkLight1337 · 2024-12-18T16:11:30Z

Implement #10114 (merging the input processor and input mapper) for Qwen2-Audio, and consolidate the code shared between Ultravox and Qwen2-Audio processing.

The example script runs successfully and produces results identical to those of the previous version of the model.

Signed-off-by: DarkLight1337 <[email protected]>

github-actions · 2024-12-18T16:11:42Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of the following:

Add the ready label to the PR
Enable auto-merge

🚀


Isotr0py

LGTM!

Merged multimodal processor for Qwen2-Audio

0164939

Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337 added the ready label (ONLY add when PR is ready to merge/full CI is needed) Dec 18, 2024

DarkLight1337 requested a review from ywang96 as a code owner December 18, 2024 16:11

DarkLight1337 requested a review from Isotr0py December 18, 2024 16:11

This was referenced Dec 18, 2024

[RFC]: Multi-modality Support Refactoring #4194

Open

[RFC]: Merge input processor and input mapper for multi-modal models #10114

Open

Fix error when input list is empty

5fed009

Signed-off-by: DarkLight1337 <[email protected]>

DarkLight1337 force-pushed the qwen2-audio-mm-processor branch from 5ca5bac to 5fed009 December 18, 2024 16:35

DarkLight1337 added 6 commits December 18, 2024 16:35

Fix mypy

c0fdda5

Signed-off-by: DarkLight1337 <[email protected]>

Update input definitions

51282c2

Signed-off-by: DarkLight1337 <[email protected]>

Fix kwargs not being passed to processor resulting in warnings

a4325b3

Signed-off-by: DarkLight1337 <[email protected]>

Simplify code

b3f534e

Signed-off-by: DarkLight1337 <[email protected]>

Update

f548efb

Signed-off-by: DarkLight1337 <[email protected]>

Merge branch 'main' into qwen2-audio-mm-processor

79f0023

Isotr0py approved these changes Dec 19, 2024


DarkLight1337 enabled auto-merge (squash) December 19, 2024 05:53

DarkLight1337 merged commit 6142ef0 into vllm-project:main Dec 19, 2024
55 checks passed

DarkLight1337 deleted the qwen2-audio-mm-processor branch December 19, 2024 07:07
