[Model] Support for fairseq2 Llama #11442
base: main
Conversation
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
Signed-off-by: Martin Gleize <[email protected]>
Force-pushed from aa73146 to 25ce34c
Could you add a simple example to the weight loader tests so we can confirm this won't break in the future?
Thank you for the advice, much appreciated. Just so that I'm clear, the best way to do what you said is to add an HF model here: https://github.com/vllm-project/vllm/blob/main/tests/weight_loading/models.txt?
Thanks for the contribution! As @robertgshaw2-neuralmagic mentioned, it would be great if you can add an example Hugging Face model repo in https://github.com/vllm-project/vllm/blob/main/tests/weight_loading/models.txt.
I'm also curious whether it's worth adding Fairseq2LlamaForCausalLM to https://github.com/vllm-project/vllm/blob/main/docs/source/models/supported_models.md#list-of-text-only-language-models, but I'll leave it to your judgement!
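For illustration, assuming that file keeps its comma-separated layout of quantization method, Hugging Face repo, and revision, an unquantized fairseq2 entry might look like the line below (the repo name is a placeholder, not a real checkpoint):

None, example-org/fairseq2-llama-3.2-1b, main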
def _init_model(self, vllm_config: VllmConfig, prefix: str = ""):
    return LlamaModel(vllm_config=vllm_config, prefix=prefix)
Is there a reason why you needed to redefine this?
vllm/vllm/model_executor/models/llama.py
Lines 553 to 554 in 8c3230d
def _init_model(self, vllm_config: VllmConfig, prefix: str = ""):
    return LlamaModel(vllm_config=vllm_config, prefix=prefix)
Good point; I had used this as a debugging hook but forgot to remove it.
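For context, a minimal sketch of what the class looks like once the override is dropped, assuming Fairseq2LlamaForCausalLM subclasses LlamaForCausalLM as the diff suggests:

from vllm.model_executor.models.llama import LlamaForCausalLM

class Fairseq2LlamaForCausalLM(LlamaForCausalLM):
    # No _init_model override: the inherited LlamaForCausalLM._init_model
    # already constructs LlamaModel(vllm_config=..., prefix=...), so only
    # the fairseq2-specific weight loading needs to live in this subclass.
    pass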
Signed-off-by: Martin Gleize <[email protected]>
This PR adds support for all Llama models produced with fairseq2. Its checkpoint format is mostly the same as the original Llama checkpoint format, up to layer renaming.
The PR provides a model with custom weight loading and adds Fairseq2LlamaForCausalLM to the model registry.
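As a rough illustration of what "mostly the same, up to layer renaming" implies for the loader, the sketch below remaps fairseq2-style parameter names onto the Llama naming scheme before handing the tensors to the stock Llama loader. The prefix table and helper names are assumptions made for illustration, not the PR's actual mapping.

from typing import Iterable, Tuple

import torch

# Illustrative fairseq2 -> Llama name prefixes; the real table may differ
# in both keys and coverage.
_FS2_TO_LLAMA = {
    "decoder.layers.": "model.layers.",
    "decoder_frontend.embed.": "model.embed_tokens.",
    "decoder.layer_norm.": "model.norm.",
    "final_proj.": "lm_head.",
}


def rename_fairseq2_key(name: str) -> str:
    """Map one fairseq2 parameter name onto the Llama naming scheme."""
    for src, dst in _FS2_TO_LLAMA.items():
        if name.startswith(src):
            return dst + name[len(src):]
    return name


def remap_weights(
    weights: Iterable[Tuple[str, torch.Tensor]],
) -> Iterable[Tuple[str, torch.Tensor]]:
    """Yield (renamed_key, tensor) pairs ready for the stock Llama loader."""
    for name, tensor in weights:
        yield rename_fairseq2_key(name), tensor

A Fairseq2LlamaForCausalLM.load_weights could then simply feed the remapped iterator to the parent class's loader.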