Commit

fix: bug
duterscmy committed Dec 12, 2024
1 parent ddc81a8 commit fdf1c0c
Showing 1 changed file with 5 additions and 5 deletions.
10 changes: 5 additions & 5 deletions README.md
@@ -116,21 +116,21 @@ Evaluate the pruned model:
```bash
cp $expert_weight_file $greedy_search_expert_result_file $greedy_search_layer_result_file cd-moe/modeling_deepseek.py $model_path
lm_eval --model hf \
- --model_args $model_path \
- --tasks arc-challenge,boolq,piqa,rte,obqa,winogrande,mmlu,hellaswag \
+ --model_args pretrained=$model_path,dtype="bfloat16",trust_remote_code=True \
+ --tasks arc_challenge,boolq,piqa,rte,obqa,winogrande,mmlu,hellaswag \
--device cuda:0 \
--batch_size 8
```
Evaluate the fine-tuned model:
```bash
lm_eval --model hf \
- --model_args $sft_model_path \
- --tasks arc-challenge,boolq,piqa,rte,obqa,winogrande,mmlu,hellaswag \
+ --model_args pretrained=$model_path,dtype="bfloat16",trust_remote_code=True,ignore_mismatched_sizes=True \
+ --tasks arc_challenge,boolq,piqa,rte,obqa,winogrande,mmlu,hellaswag \
--device cuda:0 \
--batch_size 8 \
--ignore_mismatched_sizes
```
- `--ignore_mismatched_sizes` option is necessary because, during fine-tuning, to save GPU memory, the unnecessary expert parameters in the model are set to empty, causing a mismatch between the parameter sizes saved in the model file and the default parameter sizes in the model config.
+ `ignore_mismatched_sizes=True` option is necessary because, during fine-tuning, to save GPU memory, the unnecessary expert parameters in the model are set to empty, causing a mismatch between the parameter sizes saved in the model file and the default parameter sizes in the model config.
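
As a rough illustration of why the option moved from a standalone flag into `--model_args`: that argument is a single comma-separated string of `key=value` pairs, and the `hf` model type is understood to forward keys it does not consume itself to the model-loading call. The sketch below only mimics that string format; `build_model_args`, `parse_model_args`, and the path are hypothetical names for illustration and are not part of `lm_eval`, whose own parser may differ in details.

```python
def build_model_args(**kwargs):
    """Join keyword arguments into a comma-separated key=value string."""
    return ",".join(f"{key}={value}" for key, value in kwargs.items())


def parse_model_args(arg_string):
    """Split a comma-separated key=value string back into a dict (simplified)."""
    parsed = {}
    for pair in arg_string.split(","):
        key, _, raw = pair.partition("=")
        # Treat the literals True/False as booleans; keep everything else as text.
        parsed[key] = {"True": True, "False": False}.get(raw, raw)
    return parsed


model_args = build_model_args(
    pretrained="/path/to/sft_model",  # hypothetical path to the fine-tuned model
    dtype="bfloat16",
    trust_remote_code=True,
    ignore_mismatched_sizes=True,
)
print(model_args)
print(parse_model_args(model_args))
```

The printed string can be pasted after `--model_args`; keeping every loader option in that one string avoids relying on a separate command-line flag.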

## Acknowledgement
This repository is built upon the [Transformers](https://github.com/huggingface/transformers) repository.