Ultravox Support for LoRA #11253
base: main
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can do one of these:
Hi folks, I'm working with @petersalas on this. The PR is not complete, but I wanted to start the discussion since I have some open questions and need some help from the vLLM community.
supported_lora_modules = [
    'up_proj', 'down_proj', 'gate_proj', 'v_proj', 'o_proj', 'k_proj',
    'q_proj'
]
@jeejeelee since you have worked on LoRAs extensively in vLLM, I want to get your input. Ultravox uses Llama for its language model and loads vLLM's LlamaForCausalLM, which supports LoRA. Given that, what should happen in Ultravox to support LoRA?
As I was adding the Llama 3.1 supported modules here, I found that vLLM wasn't loading all of v_proj, k_proj, etc. Is that because LlamaForCausalLM supports a packed module for qkv_proj? Also, I didn't find packed_modules in the original Llama 3.1 model. Why does vLLM define those in LlamaForCausalLM?
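For reference, this is roughly the kind of class-level LoRA metadata vLLM's LlamaForCausalLM carries - a sketch only, not copied from the vLLM source, and the exact lists may differ by version:

```python
# Sketch of the LoRA-related class attributes on vLLM's LlamaForCausalLM.
# Not the actual source; contents may differ by vLLM version.
class LlamaLoRAMetadataSketch:
    # q/k/v and gate/up projections are fused into packed linear layers in
    # vLLM, so LoRA targets the packed names rather than the individual ones.
    packed_modules_mapping = {
        "qkv_proj": ["q_proj", "k_proj", "v_proj"],
        "gate_up_proj": ["gate_proj", "up_proj"],
    }
    supported_lora_modules = [
        "qkv_proj",
        "o_proj",
        "gate_up_proj",
        "down_proj",
        "embed_tokens",
        "lm_head",
    ]
```

The packed names explain why the individual v_proj/k_proj/q_proj entries are not loaded one by one.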
Also, do I need to use LLMWrapper when initializing the language model to align the prefix so that vLLM can load PEFT LoRAs? Like this here: https://github.com/vllm-project/vllm/pull/7199/files#diff-7b8a4e258637b7c94389c745c449c52137d33cf92957f3e5bcb18a0ee204b21bR807
> Given that, what should happen in Ultravox to support LoRA?

Treat it as an independent model - you need to add LoRA-related static variables to it.

> Why does vLLM define those in LlamaForCausalLM?

Because these are static variables in the model interface.

> Also, do I need to use LLMWrapper?

LLMWrapper has been removed, so this probably doesn't need to be added. I'll figure it out later.
Since Ultravox can be run with Mixtral, should I also add its supported LoRA modules?
How does one figure out what the supported LoRA modules are in LlamaForCausalLM, given that upstream Llama has no packed modules? Just trying to understand how it all works.
Ohh, I think I have all that code if you check my PR. But vLLM isn't activating the module because the fully qualified module names don't match, as per my previous comment. Does that make sense?
Treat Ultravox as an independent model: since its fully qualified module names differ from LLaMA's, you cannot use LLaMA's LoRA directly.
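To illustrate the mismatch, the same attention projection ends up with different fully qualified names in the two models - the layer index and the exact `language_model` prefix below are assumptions based on this thread:

```python
# Illustrative only: a Llama-keyed adapter entry never matches an Ultravox
# module because the language model is nested under a "language_model" prefix.
LLAMA_MODULE = "model.layers.0.self_attn.qkv_proj"
ULTRAVOX_MODULE = "language_model.model.layers.0.self_attn.qkv_proj"
assert LLAMA_MODULE != ULTRAVOX_MODULE  # so the LoRA weight is silently skipped
```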
Aren't the qualified module names specific to vLLM? AFAICT I have checked PEFT and I can't find this limitation. I'm writing up the code for PEFT to verify that. Maybe I'm missing something, but it does seem odd to me that I can't apply an existing Llama LoRA in Ultravox.
@jeejeelee you were right. After further troubleshooting, PEFT doesn't apply a Llama LoRA on Ultravox either - it ignores missing modules silently, except in the most recent version, which logs a warning when the LoRA references modules the model doesn't have. So I ended up creating a transformed LoRA for Ultravox for testing, and added code in vLLM to log a warning if a LoRA module is missing from a model.
I have cleaned up the PR and added a test case which should give enough coverage. I'm checking a few more things, but the PR should be ready shortly.
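A minimal sketch of the kind of missing-module warning described here; the function name and logging style are illustrative, not the actual vLLM or PEFT code:

```python
import logging

import torch.nn as nn

logger = logging.getLogger(__name__)


def warn_on_missing_lora_modules(model: nn.Module,
                                 adapter_module_names: set) -> None:
    """Warn about adapter modules that have no counterpart in the model."""
    model_module_names = {name for name, _ in model.named_modules()}
    missing = adapter_module_names - model_module_names
    if missing:
        logger.warning(
            "LoRA adapter targets %d module(s) not present in the model "
            "(e.g. %s); they will be ignored.",
            len(missing), sorted(missing)[:3])
```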
@jeejeelee AFAICT LlamaForCausalLM is used when loading Mistral models. Is that correct? If yes, this PR should also cover Mistral LoRA modules, since the PR includes the applicable LlamaForCausalLM LoRA modules... right?
This pull request has merge conflicts that must be resolved before it can be merged.
@pytest.fixture(scope="session")
def llama3_1_8b_ultravox_chess_lora():
    # ultravox chess lora is the result of transforming the chess llama lora above
    return snapshot_download(repo_id="thedebugger11/ultravox-chess-lora")
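A hedged sketch of how a test might exercise this fixture with vLLM's LoRA support; the model name, LoRA rank, and prompt below are assumptions for illustration, not taken from the PR:

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest


def test_ultravox_chess_lora(llama3_1_8b_ultravox_chess_lora):
    # Assumed Ultravox checkpoint; max_lora_rank must be >= the adapter's rank.
    llm = LLM(
        model="fixie-ai/ultravox-v0_3",
        enable_lora=True,
        max_lora_rank=16,
    )
    outputs = llm.generate(
        ["What is a strong reply to 1. e4?"],
        SamplingParams(max_tokens=64),
        lora_request=LoRARequest("chess", 1, llama3_1_8b_ultravox_chess_lora),
    )
    assert outputs and outputs[0].outputs[0].text
```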
generated using this script: https://github.com/thedebugger/ultravox-lora-scipts/blob/main/peft_ultravox_transform.py
This is done to align the module names in the safetensors adapter file: base_model.model -> base_model.model.language_model
So if anyone wants to use a Llama LoRA with Ultravox, they can do so after applying this transformation.
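A minimal sketch of that key-renaming transformation (the linked peft_ultravox_transform.py is the actual script); file paths are illustrative, and the adapter_config.json target module names may also need the same prefixing:

```python
from safetensors.torch import load_file, save_file

OLD_PREFIX = "base_model.model."
NEW_PREFIX = "base_model.model.language_model."


def transform_adapter(src: str = "adapter_model.safetensors",
                      dst: str = "ultravox_adapter_model.safetensors") -> None:
    # Rewrite every Llama-prefixed weight key to the Ultravox prefix.
    tensors = load_file(src)
    renamed = {
        (NEW_PREFIX + k[len(OLD_PREFIX):]) if k.startswith(OLD_PREFIX) else k: v
        for k, v in tensors.items()
    }
    save_file(renamed, dst)
```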
Force-pushed from 771484d to 64a664f
Signed-off-by: Sumit Vij <[email protected]>
WIP: lora tests
Minor tweaks
Moar fixes
Temp changes
Cleanup
Add more debugging logs and packed modules
Signed-off-by: Sumit Vij <[email protected]>
Remove stale comment
Add llama lora modules
Add llama test case
Add test case and log warning on missing lora modules
Rollback unwanted changes and format fixes
Signed-off-by: Sumit Vij <[email protected]>
Force-pushed from 64a664f to 3f5996c
Can you refer to #10022 to minimize the changes?
The changes are in line with #10022, except for the test case and other minor logging changes. Do you have any concerns with any particular change?
It looks like there are issues with both the added tests and the logs. We should only modify the Ultravox script, following the changes made in #10022.
What is/are the issue(s)? Maybe I'm missing something, but the tests are passing.
@jeejeelee please let me know what your concerns are - happy to address them. Having a test case was super helpful to make sure LoRA works as expected with Llama and Ultravox.
TODOs: