
Weird logits and model degeneration while training DPO #77

DungNasSa10 opened this issue Apr 9, 2024 · 2 comments

@DungNasSa10

Recently, I have been experimenting with DPO training for Vietnamese. I start with a strong SFT model, vinai/PhoGPT-4B-Chat, and follow the method described in Chen, Zixiang, et al., "Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models" (arXiv:2401.01335, 2024) to build a preference dataset from my own SFT dataset. I use trl for training with the following config (a rough code sketch of the setup is included below):

  • DeepSpeed ZeRO 3 offload
  • beta = 0.1
  • global_batch_size 128
  • learning_rate 1e-6
  • learning_rate_scheduler cosine
  • optim adam_torch
  • bf16
While training, the loss decreases very fast, but after the first epoch the logits of both the chosen and rejected responses drop to 0 and the model degenerates (it repeatedly generates the character `).
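For reference, a rough sketch of the setup in code (trl ~0.7/0.8-style DPOTrainer API; the dataset file, per-device batch split, sequence lengths, output directory, and DeepSpeed config path are placeholders, not my exact values):

```python
# Rough sketch of the DPO setup described above (values marked as placeholders are illustrative).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model_name = "vinai/PhoGPT-4B-Chat"
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
ref_model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)  # frozen reference copy
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# SPIN-style pairs with "prompt", "chosen" (ground-truth SFT response)
# and "rejected" (the SFT model's own generation for the same prompt).
train_dataset = load_dataset("json", data_files="spin_pairs.jsonl", split="train")

training_args = TrainingArguments(
    output_dir="phogpt-4b-chat-dpo",
    per_device_train_batch_size=4,      # 4 x 8 grad accum x 4 GPUs = global batch 128 (example split)
    gradient_accumulation_steps=8,
    learning_rate=1e-6,
    lr_scheduler_type="cosine",
    optim="adamw_torch",
    bf16=True,
    deepspeed="ds_zero3_offload.json",  # ZeRO 3 offload config file
    logging_steps=10,
)

trainer = DPOTrainer(
    model,
    ref_model,
    args=training_args,
    beta=0.1,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
    max_length=2048,
    max_prompt_length=1024,
)
trainer.train()
```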
Here are the full logs of the training process and a sample output of the model; you can read more in the column "PhoGPT-4B-Chat-SPIN-0-4K-one-turn-ep1" of the attached Google Sheet:
[screenshots: training logs and a sample degenerate model output]

Do you have any suggestions for this problem?

@AGTSAAA

AGTSAAA commented May 8, 2024

Hi, did you solve the problem?

@ggoggam

ggoggam commented May 22, 2024

This seems to be a problem with DeepSpeed ZeRO 3. If I use FSDP, everything works fine.

I tried using torch's AdamW instead of DeepSpeed's FusedAdam, but the problem persists.
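In case it is useful, the change I mean looks roughly like this (Trainer-managed FSDP instead of a DeepSpeed config; the exact FSDP options depend on your transformers/accelerate versions and the model):

```python
# Sketch of swapping ZeRO 3 for Trainer-managed FSDP plus plain torch AdamW
# (batch split and output directory are placeholders).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="phogpt-4b-chat-dpo-fsdp",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    learning_rate=1e-6,
    lr_scheduler_type="cosine",
    optim="adamw_torch",          # plain torch AdamW rather than DeepSpeed FusedAdam
    bf16=True,
    # no `deepspeed=...` entry; enable FSDP instead:
    fsdp="full_shard auto_wrap",
)
```

The same thing can also be configured through `accelerate config` and launched with `accelerate launch` instead of passing `fsdp` here.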
