fix: bug

duterscmy · Dec 12, 2024 · fdf1c0c
1 parent ddc81a8
Showing 1 changed file with 5 additions and 5 deletions.
diff --git a/README.md b/README.md
@@ -116,21 +116,21 @@ Evaluate the pruned model:
 ```bash
 cp $expert_weight_file $greedy_search_expert_result_file $greedy_search_layer_result_file cd-moe/modeling_deepseek.py $model_path
 lm_eval --model hf \
-    --model_args $model_path \
-    --tasks arc-challenge,boolq,piqa,rte,obqa,winogrande,mmlu,hellaswag \
+    --model_args pretrained=$model_path,dtype="bfloat16",trust_remote_code=True \
+    --tasks arc_challenge,boolq,piqa,rte,obqa,winogrande,mmlu,hellaswag \
     --device cuda:0 \
     --batch_size 8
 ```
 Evaluate the fine-tuned model:
 ```bash
 lm_eval --model hf \
-    --model_args $sft_model_path \
-    --tasks arc-challenge,boolq,piqa,rte,obqa,winogrande,mmlu,hellaswag \
+    --model_args pretrained=$model_path,dtype="bfloat16",trust_remote_code=True,ignore_mismatched_sizes=True \
+    --tasks arc_challenge,boolq,piqa,rte,obqa,winogrande,mmlu,hellaswag \
     --device cuda:0 \
     --batch_size 8 \
     --ignore_mismatched_sizes
 ```
-`--ignore_mismatched_sizes` option is necessary because, during fine-tuning, to save GPU memory, the unnecessary expert parameters in the model are set to empty, causing a mismatch between the parameter sizes saved in the model file and the default parameter sizes in the model config.
+`ignore_mismatched_sizes=True` option is necessary because, during fine-tuning, to save GPU memory, the unnecessary expert parameters in the model are set to empty, causing a mismatch between the parameter sizes saved in the model file and the default parameter sizes in the model config.
 
 ## Acknowledgement
 This repository is built upon the [Transformers](https://github.com/huggingface/transformers) repository.
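
Note on the hunk above: the fix hinges on `--model_args` being a comma-separated list of `key=value` pairs, so the checkpoint directory has to be passed under the `pretrained` key rather than as a bare path. The following is a minimal dry-run sketch of assembling that string in a script, not the repository's own tooling. `MODEL_PATH` and its default value are hypothetical placeholders, and the script only prints the command instead of running `lm_eval`.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical path (assumption): replace with the directory holding the pruned checkpoint.
MODEL_PATH="${MODEL_PATH:-/path/to/pruned-model}"

# --model_args takes comma-separated key=value pairs,
# so the checkpoint path goes under the `pretrained` key.
MODEL_ARGS="pretrained=${MODEL_PATH},dtype=bfloat16,trust_remote_code=True"

# Task names copied from the updated README (arc_challenge uses an underscore, not a hyphen).
TASKS="arc_challenge,boolq,piqa,rte,obqa,winogrande,mmlu,hellaswag"

# Dry run: build and print the command rather than executing it.
CMD="lm_eval --model hf --model_args ${MODEL_ARGS} --tasks ${TASKS} --device cuda:0 --batch_size 8"
echo "${CMD}"
```

For the fine-tuned checkpoint, the diff appends `ignore_mismatched_sizes=True` to the same comma-separated string; in this sketch that would mean extending `MODEL_ARGS` with one more `key=value` pair.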