The weights for models such as Meta-Llama-3.1-70B-Instruct are distributed in bfloat16 format. When converting the weights, the saxml script first casts them to float16, which is lossy.
For example, for Meta-Llama-3.1-70B-Instruct:

```python
>>> example = torch.load('consolidated.01.pth', weights_only=True, map_location=torch.device('cpu'), mmap=True)['layers.79.feed_forward.w1.weight'][100][5685]
>>> example
tensor(-4.2617e-06, dtype=torch.bfloat16)
>>> example.type(torch.float16)
tensor(-4.2915e-06, dtype=torch.float16)
```
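For context (a quick check, not part of the original report): bfloat16 keeps float32's 8-bit exponent, while float16 has only 5 exponent bits, so a weight this small falls below float16's smallest normal value and lands in its subnormal range, where mantissa precision degrades. `torch.finfo` makes the range difference visible:

```python
>>> import torch
>>> torch.finfo(torch.float16).tiny   # smallest normal float16 value
6.103515625e-05
>>> torch.finfo(torch.bfloat16).tiny  # same exponent range as float32
1.1754943508222875e-38
```

The weight above (-4.2617e-06) is well below 6.1e-05, so the cast rounds it to the nearest float16 subnormal, -4.2915e-06.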
(This sounds similar to an issue HuggingFace had with weight conversion: huggingface/transformers#25446, which was acknowledged to degrade performance and was fixed.)
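A minimal sketch of a lossless alternative (the helper below is hypothetical, not saxml's actual conversion API): since bfloat16 is a truncated float32, casting directly to float32, or keeping bfloat16 where the serving path supports it, is exact and avoids the intermediate float16 rounding:

```python
import torch

def convert_weight(w: torch.Tensor) -> torch.Tensor:
    # Hypothetical replacement for the float16 cast: every bfloat16 value
    # is exactly representable in float32, so this conversion is lossless.
    return w.to(torch.float32)

example = torch.tensor(-4.2617e-06, dtype=torch.bfloat16)
print(example.to(torch.float16))  # tensor(-4.2915e-06, dtype=torch.float16) -- lossy
print(convert_weight(example))    # tensor(-4.2617e-06) -- exact
```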