The loss cannot converge when finetuning Llama2-7b-GPTQ on 4090 #16

Open
cyita opened this issue Nov 30, 2023 · 11 comments
cyita commented Nov 30, 2023

I fine-tuned https://huggingface.co/TheBloke/Llama-2-7B-GPTQ on a 4090 using the code from this repo and modified the group_size in peft_utils.py, but the loss does not seem to converge.

I only pass learning rate = 3e-05 to qalora.py.

[screenshot of the training loss]
@StiphyJay

Same problem here. Did you solve it?

@duany049

Same problem here. Did you solve it?

@duany049

Any update?

@StiphyJay

No, didn't fix it.

@duany049

I trained the model with W4G32, the same configuration as the paper, and the loss still does not converge.
Have you tried running evaluation with a model whose training loss was relatively high? If so, how did it perform?

@duany049

If this is hard to fix, we could switch to another approach. Have you come across any 2-bit quantization method that works well?

@StiphyJay

Please refer to PB-LLM.

@duany049

I have solved this problem; you need to replace the peft_utils.py file in auto-gptq!

Also, with the QAT method in PB-LLM, how do you avoid running out of GPU memory during training?

@duany049

I have solved this problem by replacing path/auto_gptq/utils/peft_utils.py with the peft_utils.py file in this project.
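The fix above amounts to overwriting the peft_utils.py bundled with the installed auto-gptq package with the patched copy from this repo. A minimal sketch of that swap, assuming you pass in the package directory yourself (the helper name and its signature are mine, not from the thread):

```python
import shutil
from pathlib import Path

def patch_peft_utils(repo_copy: Path, pkg_dir: Path) -> Path:
    """Replace <pkg_dir>/utils/peft_utils.py with the repo's patched copy,
    keeping a .bak backup of the original file."""
    target = pkg_dir / "utils" / "peft_utils.py"
    backup = target.parent / (target.name + ".bak")
    shutil.copy2(target, backup)      # keep the stock file around
    shutil.copy2(repo_copy, target)   # install the patched version
    return target
```

In practice the package directory can be located with `import auto_gptq; Path(auto_gptq.__file__).parent`; restart any running Python process afterwards so the patched module is re-imported.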

@duany049

GPU memory overflows; even my 80 GB of VRAM is not enough!

@wenjingk-xilinx

Hi @duany049 @cyita @StiphyJay @yuhuixu1993, could you help take a look at this issue? Have you met a similar problem? Many thanks!
