[LLM] Model Parameter Support List #8663

Open
DrownFish19 opened this issue Jun 26, 2024 · 2 comments
DrownFish19 (Collaborator) commented Jun 26, 2024

Model Parameter Support

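Hello everyone. The PaddleNLP team has compiled detailed information on the parameters (weights) available for each model here, for your convenience.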

Model Parameters

Base Models

Model 0.5B 1~2B 3~4B 6~8B 13~14B 30~32B 50~60B 65~72B 110B >110B
LLaMA
LLaMA2
LLaMA3
LLaMA3.1
Baichuan
Baichuan2
Bloom
ChatGLM
ChatGLM2
ChatGLM3
Gemma
Mistral
Mixtral
OPT
Qwen
Qwen1.5
Qwen2
Yuan2

Chat Models

Model 0.5B 1~2B 3~4B 6~8B 13~14B 30~32B 50~60B 65~72B 110B >110B
LLaMA
LLaMA2
LLaMA3
LLaMA3.1
Baichuan
Baichuan2
Bloom
ChatGLM
ChatGLM2
ChatGLM3
Gemma
Mistral
Mixtral
OPT
Qwen
Qwen1.5
Qwen2
Yuan2
Model series: Model names
LLaMA facebook/llama-7b, facebook/llama-13b, facebook/llama-30b, facebook/llama-65b
Llama2 meta-llama/Llama-2-7b, meta-llama/Llama-2-7b-chat, meta-llama/Llama-2-13b, meta-llama/Llama-2-13b-chat, meta-llama/Llama-2-70b, meta-llama/Llama-2-70b-chat
Llama3 meta-llama/Meta-Llama-3-8B, meta-llama/Meta-Llama-3-8B-Instruct, meta-llama/Meta-Llama-3-70B, meta-llama/Meta-Llama-3-70B-Instruct
Llama3.1 meta-llama/Meta-Llama-3.1-8B, meta-llama/Meta-Llama-3.1-8B-Instruct, meta-llama/Meta-Llama-3.1-70B, meta-llama/Meta-Llama-3.1-70B-Instruct, meta-llama/Meta-Llama-3.1-405B, meta-llama/Meta-Llama-3.1-405B-Instruct, meta-llama/Llama-Guard-3-8B
Baichuan baichuan-inc/Baichuan-7B, baichuan-inc/Baichuan-13B-Base, baichuan-inc/Baichuan-13B-Chat
Baichuan2 baichuan-inc/Baichuan2-7B-Base, baichuan-inc/Baichuan2-7B-Chat, baichuan-inc/Baichuan2-13B-Base, baichuan-inc/Baichuan2-13B-Chat
BLOOM bigscience/bloom-560m, bigscience/bloom-560m-bf16, bigscience/bloom-1b1, bigscience/bloom-3b, bigscience/bloom-7b1, bigscience/bloomz-560m, bigscience/bloomz-1b1, bigscience/bloomz-3b, bigscience/bloomz-7b1-mt, bigscience/bloomz-7b1-p3, bigscience/bloomz-7b1, bellegroup/belle-7b-2m
ChatGLM THUDM/chatglm-6b, THUDM/chatglm-6b-v1.1
ChatGLM2 THUDM/chatglm2-6b
ChatGLM3 THUDM/chatglm3-6b
Gemma google/gemma-7b, google/gemma-7b-it, google/gemma-2b, google/gemma-2b-it
Mistral mistralai/Mistral-7B-Instruct-v0.3, mistralai/Mistral-7B-v0.1
Mixtral mistralai/Mixtral-8x7B-Instruct-v0.1
OPT facebook/opt-125m, facebook/opt-350m, facebook/opt-1.3b, facebook/opt-2.7b, facebook/opt-6.7b, facebook/opt-13b, facebook/opt-30b, facebook/opt-66b, facebook/opt-iml-1.3b, facebook/opt-iml-max-1.3b
Qwen qwen/qwen-7b, qwen/qwen-7b-chat, qwen/qwen-14b, qwen/qwen-14b-chat, qwen/qwen-72b, qwen/qwen-72b-chat
Qwen1.5 Qwen/Qwen1.5-0.5B, Qwen/Qwen1.5-0.5B-Chat, Qwen/Qwen1.5-1.8B, Qwen/Qwen1.5-1.8B-Chat, Qwen/Qwen1.5-4B, Qwen/Qwen1.5-4B-Chat, Qwen/Qwen1.5-7B, Qwen/Qwen1.5-7B-Chat, Qwen/Qwen1.5-14B, Qwen/Qwen1.5-14B-Chat, Qwen/Qwen1.5-32B, Qwen/Qwen1.5-32B-Chat, Qwen/Qwen1.5-72B, Qwen/Qwen1.5-72B-Chat, Qwen/Qwen1.5-110B, Qwen/Qwen1.5-110B-Chat, Qwen/Qwen1.5-MoE-A2.7B, Qwen/Qwen1.5-MoE-A2.7B-Chat
Qwen2 Qwen/Qwen2-0.5B, Qwen/Qwen2-0.5B-Instruct, Qwen/Qwen2-1.5B, Qwen/Qwen2-1.5B-Instruct, Qwen/Qwen2-7B, Qwen/Qwen2-7B-Instruct, Qwen/Qwen2-72B, Qwen/Qwen2-72B-Instruct, Qwen/Qwen2-57B-A14B, Qwen/Qwen2-57B-A14B-Instruct
Qwen2-Math Qwen/Qwen2-Math-1.5B, Qwen/Qwen2-Math-1.5B-Instruct, Qwen/Qwen2-Math-7B, Qwen/Qwen2-Math-7B-Instruct, Qwen/Qwen2-Math-72B, Qwen/Qwen2-Math-72B-Instruct, Qwen/Qwen2-Math-RM-72B
Qwen2.5 Qwen/Qwen2.5-0.5B, Qwen/Qwen2.5-0.5B-Instruct, Qwen/Qwen2.5-1.5B, Qwen/Qwen2.5-1.5B-Instruct, Qwen/Qwen2.5-3B, Qwen/Qwen2.5-3B-Instruct, Qwen/Qwen2.5-7B, Qwen/Qwen2.5-7B-Instruct, Qwen/Qwen2.5-14B, Qwen/Qwen2.5-14B-Instruct, Qwen/Qwen2.5-32B, Qwen/Qwen2.5-32B-Instruct, Qwen/Qwen2.5-72B, Qwen/Qwen2.5-72B-Instruct
Qwen2.5-Math Qwen/Qwen2.5-Math-1.5B, Qwen/Qwen2.5-Math-1.5B-Instruct, Qwen/Qwen2.5-Math-7B, Qwen/Qwen2.5-Math-7B-Instruct, Qwen/Qwen2.5-Math-72B, Qwen/Qwen2.5-Math-72B-Instruct, Qwen/Qwen2.5-Math-RM-72B
Qwen2.5-Coder Qwen/Qwen2.5-Coder-1.5B, Qwen/Qwen2.5-Coder-1.5B-Instruct, Qwen/Qwen2.5-Coder-7B, Qwen/Qwen2.5-Coder-7B-Instruct
Yuan2 IEITYuan/Yuan2-2B, IEITYuan/Yuan2-51B, IEITYuan/Yuan2-102B
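
The list above mixes base and chat checkpoints in a single row per series. As a rough illustration only, the sketch below splits model names into the two groups by their name suffix. The marker list (`-chat`, `-instruct`, `-it`) and the helper names `is_chat_model` and `split_by_kind` are assumptions made for this example, not part of PaddleNLP, and the heuristic will miss series such as ChatGLM whose names carry no such suffix.

```python
# Heuristic markers for chat / instruction-tuned checkpoints. This is an
# assumption drawn from the names in the list above, not an official rule.
CHAT_MARKERS = ("-chat", "-instruct", "-it")


def is_chat_model(name: str) -> bool:
    """Return True if the model name ends with, or contains, a chat marker."""
    lowered = name.lower()
    return any(
        # Match a trailing marker ("...-7b-chat") or one followed by a
        # version tag ("...-Instruct-v0.3").
        lowered.endswith(marker) or f"{marker}-" in lowered
        for marker in CHAT_MARKERS
    )


def split_by_kind(names):
    """Group model names into 'base' and 'chat' lists, preserving order."""
    groups = {"base": [], "chat": []}
    for name in names:
        groups["chat" if is_chat_model(name) else "base"].append(name)
    return groups


if __name__ == "__main__":
    sample = [
        "meta-llama/Llama-2-7b",
        "meta-llama/Llama-2-7b-chat",
        "mistralai/Mistral-7B-Instruct-v0.3",
        "google/gemma-2b-it",
        "facebook/opt-iml-1.3b",
    ]
    print(split_by_kind(sample))
```

Because the split relies purely on naming conventions, treat its output as a starting point and check it against each model's own documentation.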
DrownFish19 added the "others" and "unknown issue type" labels Jun 26, 2024
DrownFish19 changed the title from "Support Model List" to "[LLM] Model Support List" Jun 26, 2024
DrownFish19 assigned DrownFish19 and unassigned KB-Ding Jun 26, 2024
jzhang533 (Collaborator) commented

The links in the table are broken.

DrownFish19 added the "LLM" label and removed the "others" and "unknown issue type" labels Jul 11, 2024
ZHUI pinned this issue Jul 11, 2024
DrownFish19 changed the title from "[LLM] Model Support List" to "[LLM] Model Parameter Support List" Jul 11, 2024
This issue is stale because it has been open for 60 days with no activity.

github-actions bot added the "stale" label Nov 20, 2024