
Llama3 reports shape error after pruning #69

Open
WentaoTan opened this issue Aug 22, 2024 · 7 comments

Comments

@WentaoTan

The command I run:
```
python llama3.py --pruning_ratio 0.25 \
    --device cuda --eval_device cuda \
    --base_model home/Meta-Llama-3-8B \
    --block_wise --block_mlp_layer_start 4 --block_mlp_layer_end 30 \
    --block_attention_layer_start 4 --block_attention_layer_end 30 \
    --save_ckpt_log_name llama3_prune \
    --pruner_type taylor --taylor param_first \
    --max_seq_len 2048 \
    --test_after_train --test_before_train --save_model
```
When execution reaches line 259 of the script, an error occurs:

[screenshot of the shape-error traceback]
How to solve this problem? Looking forward to your reply!

@nagbhat25

+1
I see the same issue. I tried multiple settings but had no luck. Any help would be appreciated.

@xwang365

+1, I get the same error.

@SunnyGJing

+1, I get the same error.

@WentaoTan
Author

Has anyone solved this problem?

@VincentZ-2020

The line `attn_output = attn_output.reshape(bsz, q_len, self.hidden_size)` can be modified to `attn_output = attn_output.reshape(bsz, q_len, -1)` to resolve the issue.
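The reason this fix helps: after block-wise pruning, the attention output's last dimension shrinks below the original `self.hidden_size`, so a reshape hard-coded to the pre-pruning width no longer matches the element count. A minimal numpy sketch (with hypothetical shapes; the real tensors are torch tensors inside `LlamaAttention.forward`) illustrates the mismatch:

```python
import numpy as np

bsz, q_len = 2, 16
orig_hidden_size = 4096           # Llama-3-8B hidden size before pruning
pruned_heads, head_dim = 24, 128  # hypothetical: some heads pruned away

# attention output after pruning: only 24 * 128 = 3072 features per token
attn_output = np.zeros((bsz, q_len, pruned_heads, head_dim))

try:
    # fails: each token now has 3072 elements, not 4096
    attn_output.reshape(bsz, q_len, orig_hidden_size)
except ValueError as e:
    print("reshape to original hidden_size fails:", e)

# works: -1 lets numpy infer the pruned width (3072)
fixed = attn_output.reshape(bsz, q_len, -1)
print(fixed.shape)  # (2, 16, 3072)
```

The same inference rule applies to `torch.Tensor.reshape`, which is why passing `-1` for the last dimension makes the forward pass robust to a pruned head count.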

@PureEidolon

Got the same error.

@PureEidolon

> The line `attn_output = attn_output.reshape(bsz, q_len, self.hidden_size)` can be modified to `attn_output = attn_output.reshape(bsz, q_len, -1)` to resolve the issue.

I tried this method, but it doesn't work.

In the modeling_llama.py file I downloaded, the source code already reads:

`attn_output = attn_output.reshape(bsz, q_len, -1)`


6 participants