The weights for models such as Meta-Llama-3.1-70B-Instruct are distributed in bfloat16 format. When converting the weights, the saxml script first casts them to float16, which is lossy.
For example, for Meta-Llama-3.1-70B-Instruct:

```python
>>> example = torch.load('consolidated.01.pth', weights_only=True, map_location=torch.device('cpu'), mmap=True)['layers.79.feed_forward.w1.weight'][100][5685]
>>> example
tensor(-4.2617e-06, dtype=torch.bfloat16)
>>> example.type(torch.float16)
tensor(-4.2915e-06, dtype=torch.float16)
```
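For context (a quick check, not part of the original report): bfloat16 keeps float32's 8-bit exponent, while float16 has only 5 exponent bits, so a weight this small falls below float16's smallest normal value and lands in its subnormal range, where mantissa precision degrades. `torch.finfo` makes the range difference visible:

```python
>>> import torch
>>> torch.finfo(torch.float16).tiny   # smallest normal float16 value
6.103515625e-05
>>> torch.finfo(torch.bfloat16).tiny  # same exponent range as float32
1.1754943508222875e-38
```

The weight above (-4.2617e-06) is well below 6.1e-05, so the cast rounds it to the nearest float16 subnormal, -4.2915e-06.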
(This sounds similar to an issue HuggingFace had with weight conversion: huggingface/transformers#25446, which was acknowledged to degrade performance and was fixed.)
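A minimal sketch of a lossless alternative (the helper below is hypothetical, not saxml's actual conversion API): since bfloat16 is a truncated float32, casting directly to float32, or keeping bfloat16 where the serving path supports it, is exact and avoids the intermediate float16 rounding:

```python
import torch

def convert_weight(w: torch.Tensor) -> torch.Tensor:
    # Hypothetical replacement for the float16 cast: every bfloat16 value
    # is exactly representable in float32, so this conversion is lossless.
    return w.to(torch.float32)

example = torch.tensor(-4.2617e-06, dtype=torch.bfloat16)
print(example.to(torch.float16))  # tensor(-4.2915e-06, dtype=torch.float16) -- lossy
print(convert_weight(example))    # tensor(-4.2617e-06) -- exact
```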