Incorrect inference results when running Yuan2.0 with the vLLM inference framework with tensor parallelism (tp) enabled and enforce_eager=True #8

Open
Sakurafwsv opened this issue Jul 8, 2024 · 1 comment

@Sakurafwsv

The inference script is as follows:

from vllm import LLM, SamplingParams
import time
from transformers import LlamaTokenizer

# Load the Yuan2.0 tokenizer and register its special tokens (used below only to count generated tokens).
tokenizer = LlamaTokenizer.from_pretrained('/yuan-2b-hf/', add_eos_token=False, add_bos_token=False, eos_token='<eod>')
tokenizer.add_tokens(['<sep>', '<pad>', '<mask>', '<predict>', '<FIM_SUFFIX>', '<FIM_PREFIX>', '<FIM_MIDDLE>','<commit_before>','<commit_msg>','<commit_after>','<jupyter_start>','<jupyter_text>','<jupyter_code>','<jupyter_output>','<empty_output>'], special_tokens=True)

# Prompts: "Qingdao travel recommendations?" and "How long is the Yangtze River?"
prompts = ["青岛旅游推荐?","长江有多长?"]
# Effectively greedy decoding (top_k=1), capped at 300 new tokens, stopping at <eod>.
sampling_params = SamplingParams(max_tokens=300, temperature=1, top_p=0, top_k=1, min_p=0.0, length_penalty=1.0, repetition_penalty=1.0, stop="<eod>")

# Settings under which the problem appears: 4-way tensor parallelism with enforce_eager=True.
llm = LLM(model="/yuan-2b-hf/", trust_remote_code=True, enforce_eager=True, tensor_parallel_size=4, gpu_memory_utilization=0.8, disable_custom_all_reduce=True, max_num_seqs=2)

# Time the whole batch.
start_time = time.time()
outputs = llm.generate(prompts, sampling_params)
end_time = time.time()
total_tokens = 0
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    # Re-tokenize the generated text to count its tokens.
    num_tokens = len(tokenizer.encode(generated_text, return_tensors="pt")[0])
    total_tokens += num_tokens
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

print("inference_time:", (end_time - start_time))
print("total_tokens:", total_tokens)

The output is as follows:

Prompt: '青岛旅游推荐?', Generated text: ' 青岛旅游推荐如下:\n1.\n- 青岛旅游推荐青岛旅游。青岛旅游。青岛旅游推荐?\n青岛旅游。青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2旅游推荐?\n2.青岛旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2.青岛旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2\n2.青岛旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2\n'
Prompt: '长江有多长?', Generated text: ' 长江的长度约为6,3,3,即6,即6,即长江有6,即6,即6,即长江的长度为6,即6,即长江的 6,即长江的 6,即6,即长江的 6,即长江的 6,即6,即6,即长江的 6,即6,即6,即长江的 6,即6,即长江的6,即6,即长江的 6,即6,即长江的 6,即6,即长江的,即6,即,即,即6,即长江的,即6,即6,即,即6,即6,即,即,即6,即6,即,即6,即,即6,即6,即,即6,即,即6,即6,即,即6,即6,即,即6,即,即6,即6,即,即6,即6,即6,即,即6,即,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6'
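To put a number on how degenerate the outputs above are, and to compare runs with different tensor_parallel_size or enforce_eager settings, a repetition score can help. The following is only a minimal sketch: `repeated_ngram_ratio` is a hypothetical helper written for this issue, not part of vLLM or the Yuan2.0 code, and the choice of character 4-grams is an assumption.

```python
from collections import Counter


def repeated_ngram_ratio(text: str, n: int = 4) -> float:
    """Return the fraction of character n-grams that repeat an earlier n-gram.

    Hypothetical helper for this issue; 0.0 means no n-gram occurs twice,
    values close to 1.0 mean the text is almost entirely repetition.
    """
    if len(text) < n:
        return 0.0
    # Slide a window of n characters over the text.
    ngrams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(ngrams)
    # Every occurrence after the first one counts as a repeat.
    repeated = sum(count - 1 for count in counts.values())
    return repeated / len(ngrams)
```

Calling `repeated_ngram_ratio(generated_text)` inside the print loop of the script, once for each configuration under test, would give a rough figure for comparing the runs. No threshold is implied here; the score is only meant for side-by-side comparison.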
@zhaoxudong01
Collaborator

@IEI-mjx
