[bug] Conversion fails when using `layers_per_step` with some input formats #87

RaymondLi0 · 2024-12-04T23:33:17Z

🐞 Describe the Bug

Conversion fails when `layers_per_step` is used together with `input.format=fast_llm`.
Example job: 7ada4a96-4b5d-43de-a156-ebea5f359a33

Global counter mismatch for parameter "layers.8.norm_1.weight" and shard "weights": 0 != 2048
[...]
Global counter mismatch for parameter "layers.17.output_weights" and shard "weights": 0 != 268435456
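To make the elided part of the log easier to triage, here is a minimal sketch that extracts the parameter name, shard, and the two counter values from lines in the format quoted above. The helper name `parse_mismatches` and the field names `left`/`right` are hypothetical; the report does not say which side of `!=` is the expected value, so the sketch does not assume one. In the two lines shown, the first mismatch is in layer 8, which equals the `layers_per_step=8` used in the command; that may or may not be a coincidence.

```python
import re

# Matches the message format quoted in this report; other versions may word it differently.
MISMATCH_RE = re.compile(
    r'Global counter mismatch for parameter "(?P<parameter>[^"]+)" '
    r'and shard "(?P<shard>[^"]+)": (?P<left>\d+) != (?P<right>\d+)'
)
# Pulls the layer index out of names such as "layers.8.norm_1.weight".
LAYER_RE = re.compile(r"layers\.(\d+)\.")


def parse_mismatches(log_text):
    """Return one dict per mismatch line found in log_text, in order of appearance."""
    rows = []
    for match in MISMATCH_RE.finditer(log_text):
        layer = LAYER_RE.match(match["parameter"])
        rows.append(
            {
                "parameter": match["parameter"],
                "shard": match["shard"],
                "layer": int(layer.group(1)) if layer else None,
                "left": int(match["left"]),    # value printed before "!="
                "right": int(match["right"]),  # value printed after "!="
            }
        )
    return rows


sample_log = (
    'Global counter mismatch for parameter "layers.8.norm_1.weight" '
    'and shard "weights": 0 != 2048\n'
    'Global counter mismatch for parameter "layers.17.output_weights" '
    'and shard "weights": 0 != 268435456\n'
)

for row in parse_mismatches(sample_log):
    print(row["layer"], row["parameter"], row["left"], row["right"])
```

Running this over the full job log and sorting by `layer` would show whether every affected parameter lies at or beyond layer 8.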

🔄 Steps to Reproduce

Convert a model exported in `fast_llm` format, using the `layers_per_step` argument:

fast-llm convert gpt \
input.path=exp_dir/export/fast_llm/20000 \
input.format=fast_llm \
output.path=exp_dir/export/mixtral/20000 \
output.format=mixtral \
use_cpu=False \
exist_ok=True \
layers_per_step=8
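To check that `layers_per_step` is the trigger, the same conversion can be run with and without it. The sketch below only builds the two command lines from the arguments in the command above. The function name `build_convert_command` is hypothetical, and it assumes `layers_per_step` may simply be omitted. Both runs write to the same output path, which relies on `exist_ok=True`.

```python
import shlex

# Arguments copied from the command in this report.
BASE_ARGS = [
    "fast-llm",
    "convert",
    "gpt",
    "input.path=exp_dir/export/fast_llm/20000",
    "input.format=fast_llm",
    "output.path=exp_dir/export/mixtral/20000",
    "output.format=mixtral",
    "use_cpu=False",
    "exist_ok=True",
]


def build_convert_command(layers_per_step=None):
    """Return the argv list, adding layers_per_step only when a value is given."""
    args = list(BASE_ARGS)
    if layers_per_step is not None:
        args.append(f"layers_per_step={layers_per_step}")
    return args


# Print both variants so they can be pasted into a shell and compared.
for value in (None, 8):
    print(shlex.join(build_convert_command(value)))
```

If the run without `layers_per_step` succeeds and the run with `layers_per_step=8` reproduces the counter mismatch, the failure is isolated to the stepped conversion path.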

🎯 Expected Behavior

Conversion succeeds.

RaymondLi0 added the bug label on Dec 4, 2024
