The inference script is as follows:
```python
import time

from transformers import LlamaTokenizer
from vllm import LLM, SamplingParams

tokenizer = LlamaTokenizer.from_pretrained(
    '/yuan-2b-hf/', add_eos_token=False, add_bos_token=False, eos_token='<eod>'
)
tokenizer.add_tokens(
    ['<sep>', '<pad>', '<mask>', '<predict>', '<FIM_SUFFIX>', '<FIM_PREFIX>',
     '<FIM_MIDDLE>', '<commit_before>', '<commit_msg>', '<commit_after>',
     '<jupyter_start>', '<jupyter_text>', '<jupyter_code>', '<jupyter_output>',
     '<empty_output>'],
    special_tokens=True,
)

prompts = ["青岛旅游推荐?", "长江有多长?"]
sampling_params = SamplingParams(
    max_tokens=300,
    temperature=1,
    top_p=0,
    top_k=1,
    min_p=0.0,
    length_penalty=1.0,
    repetition_penalty=1.0,
    stop="<eod>",
)

llm = LLM(
    model="/yuan-2b-hf/",
    trust_remote_code=True,
    enforce_eager=True,
    tensor_parallel_size=4,
    gpu_memory_utilization=0.8,
    disable_custom_all_reduce=True,
    max_num_seqs=2,
)

start_time = time.time()
outputs = llm.generate(prompts, sampling_params)
end_time = time.time()

total_tokens = 0
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    num_tokens = len(tokenizer.encode(generated_text, return_tensors="pt")[0])
    total_tokens += num_tokens
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

print("inference_time:", (end_time - start_time))
print("total_tokens:", total_tokens)
```
The output is as follows:
```
Prompt: '青岛旅游推荐?', Generated text: ' 青岛旅游推荐如下:\n1.\n- 青岛旅游推荐青岛旅游。青岛旅游。青岛旅游推荐?\n青岛旅游。青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2.青岛旅游推荐?\n2旅游推荐?\n2.青岛旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2.青岛旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2\n2.青岛旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2旅游推荐?\n2\n'
Prompt: '长江有多长?', Generated text: ' 长江的长度约为6,3,3,即6,即6,即长江有6,即6,即6,即长江的长度为6,即6,即长江的 6,即长江的 6,即6,即长江的 6,即长江的 6,即6,即6,即长江的 6,即6,即6,即长江的 6,即6,即长江的6,即6,即长江的 6,即6,即长江的 6,即6,即长江的,即6,即,即,即6,即长江的,即6,即6,即,即6,即6,即,即,即6,即6,即,即6,即,即6,即6,即,即6,即,即6,即6,即,即6,即6,即,即6,即,即6,即6,即,即6,即6,即6,即,即6,即,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6,即6'
```
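Both generations above collapse into near-verbatim loops. As a quick, model-independent way to quantify this looping when comparing runs (a minimal sketch, not part of the original script; the `repetition_ratio` helper and its threshold are assumptions for illustration), one can measure the fraction of duplicated character n-grams in the generated text:

```python
from collections import Counter

def repetition_ratio(text: str, n: int = 4) -> float:
    """Fraction of character n-grams that are duplicates of an earlier one.

    Values near 1.0 indicate heavily looping output; normal prose
    scores much lower.
    """
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return 0.0
    counts = Counter(grams)
    duplicates = sum(c - 1 for c in counts.values())
    return duplicates / len(grams)

# A degenerate, looping string (like the output above) scores close to 1.0.
looping = "2.旅游推荐?\n" * 30
print(repetition_ratio(looping))
```

A healthy generation should score well below a degenerate one, which makes the metric handy for spot-checking whether a config change (e.g. different sampling parameters) actually helps.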