Any performance comparison with vLLM? #12

MuYu-zhi · 2024-05-21T04:20:55Z

As the title says.

kentang-mit · 2024-05-21T20:02:19Z

Hi,

We did not explicitly compare against vLLM because we believe it is slower than TRT-LLM-FP16, which implements the same paged attention functionality with a faster attention kernel. Our throughput is much higher than that of TRT-LLM-FP16.

Best,
Haotian
