
Ultravox Support for LoRA #11253

Open
thedebugger wants to merge 3 commits into vllm-project:main from thedebugger:svij-ultravox-lora-dec-16

Conversation

thedebugger commented Dec 17, 2024

TODOs:

  • Check how to support LoRA if the Mistral language model is used


👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which starts a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

  • Add ready label to the PR
  • Enable auto-merge.

🚀

thedebugger commented Dec 17, 2024

Hi folks, I'm working with @petersalas on this. The PR is not complete, but I wanted to start the discussion since I have some open questions and need some help from the vLLM community.

supported_lora_modules = [
    'up_proj', 'down_proj', 'gate_proj', 'v_proj', 'o_proj', 'k_proj',
    'q_proj'
]
thedebugger (Author) commented:

@jeejeelee, since you have worked extensively on LoRAs in vLLM, I want to get your input. Ultravox uses Llama for its language model and loads vLLM's LlamaForCausalLM, which supports LoRA. Given that, what should happen in Ultravox to support LoRA?

As I was adding the Llama 3.1 supported modules here, I found that vLLM wasn't loading all of v_proj, k_proj, etc. Is that because LlamaForCausalLM supports the packed module qkv_proj? Also, I didn't find packed modules in the original Llama 3.1 model. Why does vLLM define those in LlamaForCausalLM?
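
For context, vLLM declares packed modules as static variables on the model class. In LlamaForCausalLM the mapping looks roughly like this (a sketch, not the exact source):

```python
# Sketch of vLLM's packed-module declaration for Llama (approximate):
# the separate q/k/v and gate/up projections are fused into single
# packed layers, so LoRA targets the packed names rather than the
# original Hugging Face module names.
packed_modules_mapping = {
    "qkv_proj": ["q_proj", "k_proj", "v_proj"],
    "gate_up_proj": ["gate_proj", "up_proj"],
}
```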

thedebugger (Author) commented:

Also, do I need to use LLMWrapper when initializing the language model to align the prefix so that vLLM can load PEFT LoRAs? Like this: https://github.com/vllm-project/vllm/pull/7199/files#diff-7b8a4e258637b7c94389c745c449c52137d33cf92957f3e5bcb18a0ee204b21bR807

jeejeelee (Collaborator) commented:

> Given that, what should happen in Ultravox to support LoRA?

Treat it as an independent model: you need to add the LoRA-related static variables to it.

> Why does vLLM define those in LlamaForCausalLM?

Because these are static variables in the model interface.

> Also, do I need to use LLMWrapper

LLMWrapper has been removed, so this probably doesn't need to be added. I'll figure it out later.
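
Concretely, the "LoRA-related static variables" are class attributes on the model, together with the SupportsLoRA interface. A rough sketch of what that could look like on UltravoxModel (illustrative only; the module names assume the Llama backbone and may differ from the final diff):

```python
from torch import nn

from vllm.model_executor.models.interfaces import SupportsLoRA, SupportsMultiModal


class UltravoxModel(nn.Module, SupportsMultiModal, SupportsLoRA):
    # Packed projections of the underlying Llama language model.
    packed_modules_mapping = {
        "qkv_proj": ["q_proj", "k_proj", "v_proj"],
        "gate_up_proj": ["gate_proj", "up_proj"],
    }
    # Modules that LoRA adapters may target (sketch; not the final list).
    supported_lora_modules = ["qkv_proj", "o_proj", "gate_up_proj", "down_proj"]
    embedding_modules = {}
    embedding_padding_modules = []
```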

thedebugger commented Dec 18, 2024

Since Ultravox can be run with Mixtral, should I also add its supported LoRA modules?

thedebugger (Author) commented:

How does one figure out the supported LoRA modules in LlamaForCausalLM, given that upstream Llama has no packed modules? Just trying to understand how it all works.
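
One way to check, given the static-variable convention: the LoRA contract is declared on the model class itself, not derived from the upstream checkpoint, so it can be inspected directly (assuming vLLM's model classes as of this PR):

```python
from vllm.model_executor.models.llama import LlamaForCausalLM

# The LoRA contract lives on the class, which is why upstream Llama
# checkpoints have no notion of packed modules.
print(LlamaForCausalLM.supported_lora_modules)
print(LlamaForCausalLM.packed_modules_mapping)
```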

thedebugger (Author) commented:

Ohh, I think I have all that code if you check my PR. But vLLM isn't activating the module because the fully qualified module names don't match, as per my previous comment. Does that make sense?

jeejeelee (Collaborator) commented:

Treat Ultravox as an independent model: because its fully qualified module names differ from Llama's, you cannot use Llama's LoRA directly.

thedebugger (Author) commented:

Aren't the qualified module names specific to vLLM? AFAICT PEFT doesn't have this limitation; I checked and couldn't find it, and I'm writing code with PEFT to verify that. Maybe I'm missing something, but it seems odd that I can't apply an existing Llama LoRA to Ultravox.

thedebugger (Author) commented:

@jeejeelee, you were right. After further troubleshooting, PEFT doesn't apply a Llama LoRA to Ultravox either: it silently ignores missing modules (until a recent version, which logs a warning if the LoRA references modules missing from the model). So I ended up creating a transformed LoRA for Ultravox for testing, and added code in vLLM to log a warning when a LoRA module is missing from the model; a sketch follows below.

I have cleaned up the PR and added a test case which should give enough coverage. I'm checking a few more things, but the PR should be ready shortly.
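
For illustration only (not the exact diff in this PR), the missing-module warning could look roughly like this; the helper and its arguments are hypothetical:

```python
import logging

logger = logging.getLogger(__name__)


def warn_on_missing_lora_modules(lora_module_names, model_module_names):
    # Hypothetical helper: adapters that target modules the model does not
    # contain used to be ignored silently; surface them instead.
    missing = set(lora_module_names) - set(model_module_names)
    if missing:
        logger.warning(
            "LoRA adapter targets modules not present in the model: %s",
            sorted(missing))
```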

thedebugger commented Jan 1, 2025

@jeejeelee, AFAICT LlamaForCausalLM is used when loading Mistral models. Is that correct? If so, this PR should also cover Mistral LoRA modules, since it includes the applicable LlamaForCausalLM LoRA modules, right?
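
For reference, vLLM's model registry maps Mistral checkpoints onto the Llama implementation, so they would inherit its LoRA static variables as well. Roughly (an approximation of the registry entries, not the exact source):

```python
# Approximate sketch of vLLM's architecture-to-implementation registry:
# both architectures resolve to the same Llama module and class.
_MODELS = {
    "LlamaForCausalLM": ("llama", "LlamaForCausalLM"),
    "MistralForCausalLM": ("llama", "LlamaForCausalLM"),
}
```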

(Two review threads on vllm/model_executor/models/ultravox.py, both outdated and resolved.)
mergify bot commented Dec 31, 2024

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @thedebugger.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

mergify bot added the needs-rebase label on Dec 31, 2024
@pytest.fixture(scope="session")
def llama3_1_8b_ultravox_chess_lora():
    # The Ultravox chess LoRA is the result of transforming the chess
    # Llama LoRA above.
    return snapshot_download(repo_id="thedebugger11/ultravox-chess-lora")
thedebugger (Author) commented:

This is done to align the module names in the safetensors adapter file: base_model.model -> base_model.model.language_model.

thedebugger (Author) commented:

So if anyone wants to use a Llama LoRA with Ultravox, they can do so after applying this transformation.
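
A minimal sketch of that transformation, assuming a standard PEFT adapter layout (the paths are illustrative; the prefix change matches the comment above):

```python
from safetensors.torch import load_file, save_file

# Remap a Llama LoRA adapter's weight keys so they resolve against
# Ultravox's nested language model:
#   base_model.model.* -> base_model.model.language_model.*
tensors = load_file("llama-lora/adapter_model.safetensors")
remapped = {
    key.replace("base_model.model.", "base_model.model.language_model.", 1): value
    for key, value in tensors.items()
}
save_file(remapped, "ultravox-lora/adapter_model.safetensors")
```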

thedebugger force-pushed the svij-ultravox-lora-dec-16 branch 2 times, most recently from 771484d to 64a664f, on December 31, 2024 at 18:49
mergify bot removed the needs-rebase label on Dec 31, 2024
thedebugger changed the title from "WIP: Ultravox Support for LoRA" to "Ultravox Support for LoRA" on Dec 31, 2024
thedebugger marked this pull request as ready for review on December 31, 2024 at 18:52
WIP: lora tests

Minor tweaks

Moar fixes

Temp changes

Cleanup

Add more debugging logs and packed modules

Signed-off-by: Sumit Vij <[email protected]>
Remove stale comment

Add llama lora modules

Add llama test case

Add test case and log warning on missing lora modules

Rollback unwanted changes and format fixes

Signed-off-by: Sumit Vij <[email protected]>
thedebugger force-pushed the svij-ultravox-lora-dec-16 branch from 64a664f to 3f5996c on January 1, 2025 at 17:31
jeejeelee (Collaborator) commented:

Can you refer to #10022 to minimize the changes?

thedebugger (Author) commented:

> Can you refer to #10022 to minimize the changes?

The changes are in line with #10022, except for the test case and other minor logging changes. Do you have concerns with any particular change?

jeejeelee (Collaborator) commented:

It looks like there are issues with both the added tests and the logs. We should only modify the Ultravox script, following the changes made in #10022.

thedebugger (Author) commented:

What are the issues? Maybe I'm missing something, but the tests are passing.

thedebugger (Author) commented:

@jeejeelee, please let me know what your concerns are; I'm happy to address them. Having a test case was super helpful for making sure LoRA works as expected with Llama and Ultravox.

Labels: None yet
Projects: None yet
3 participants