wip
alexchen4ai committed Apr 30, 2024
1 parent 2190e3e commit ca31526
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions specialized_models/README.md
@@ -37,3 +37,7 @@ If you really want a high-quality model, DPO training is highly recommended. We

Note that DPO training should start from the model fine-tuned in the SFT step: use that finetuned model as the base model. For more detail about PPO or reinforcement learning for language models, refer to this [blog](https://alexchen4ai.github.io/blog/notes/Large%20Language%20Model/rl_llm.html).

To launch DPO training, run the following, with `NUM_GPUS` set to the number of processes to start (typically one per GPU):
```bash
accelerate launch --num_processes=$NUM_GPUS dpo.py
```
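
The command above expects `NUM_GPUS` to already be set in the shell. A minimal sketch of one way to handle that, assuming a default of a single process when the variable is unset; the `LAUNCH_CMD` variable and the default of 1 are illustrative assumptions, not part of the repository:

```bash
# Assumption: fall back to a single process when NUM_GPUS is not set.
NUM_GPUS="${NUM_GPUS:-1}"

# Build the launch command as a string so it can be inspected before running.
# LAUNCH_CMD is a hypothetical helper variable introduced for this sketch.
LAUNCH_CMD="accelerate launch --num_processes=${NUM_GPUS} dpo.py"
echo "$LAUNCH_CMD"

# Uncomment to actually start training:
# $LAUNCH_CMD
```

Printing the command first makes it easy to confirm the process count before a long training run starts.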
