Super tiny little typo fix (vllm-project#10633)
fzyzcjy authored Nov 25, 2024
1 parent ed46f14 commit 2b0879b
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion docs/source/quantization/fp8_e5m2_kvcache.rst
@@ -4,7 +4,7 @@ FP8 E5M2 KV Cache
 ==================
 
 The int8/int4 quantization scheme requires additional scale GPU memory storage, which reduces the expected GPU memory benefits.
-The FP8 data format retains 2~3 mantissa bits and can convert float/fp16/bflaot16 and fp8 to each other.
+The FP8 data format retains 2~3 mantissa bits and can convert float/fp16/bfloat16 and fp8 to each other.

Here is an example of how to enable this feature:

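As an aside, the corrected sentence's "2~3 mantissa bits" refers to the two common FP8 variants: E5M2 keeps 2 mantissa bits (with 5 exponent bits), while E4M3 keeps 3. The precision loss this implies can be illustrated with a small, self-contained Python sketch; the `quantize_e5m2` helper below is hypothetical and not part of the vLLM docs or API, and it deliberately ignores NaN/inf, subnormals, and overflow saturation for brevity.

```python
import math

def quantize_e5m2(x: float) -> float:
    """Round x to the nearest FP8 E5M2-representable value.

    E5M2 layout: 1 sign bit, 5 exponent bits (bias 15), 2 mantissa bits.
    Sketch only: NaN/inf, subnormal, and overflow handling are omitted.
    """
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    x = abs(x)
    # Exponent of the binade containing x, clamped to the E5M2 normal range.
    e = math.floor(math.log2(x))
    e = max(min(e, 15), -14)
    # With 2 mantissa bits there are 4 evenly spaced values per binade.
    step = 2.0 ** (e - 2)
    return sign * round(x / step) * step
```

For example, `quantize_e5m2(1.2)` returns `1.25` and `quantize_e5m2(0.3)` returns `0.3125`, showing how coarse the 2-mantissa-bit grid is; this is the trade-off the docs sentence contrasts against int8/int4 schemes that need extra scale storage.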
