typo (#26)
trigaten authored Jan 3, 2025
1 parent 5df6b68 commit b05e6a8
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion chapters/01-introduction.md
@@ -14,7 +14,7 @@ This book details key decisions and basic implementation examples for each step
RLHF has been applied to many domains successfully, with complexity increasing as the techniques have matured.
Early breakthrough experiments with RLHF were applied to deep reinforcement learning [@christiano2017deep], summarization [@stiennon2020learning], following instructions [@ouyang2022training], parsing web information for question answering [@nakano2021webgpt], and "alignment" [@bai2022training].

-In modern language model training, RLHF is one component on post-training.
+In modern language model training, RLHF is one component of post-training.
Post-training is a more complete set of techniques and best-practices to make language models more useful for downstream tasks [@lambert2024t].
Post-training can be summarized as using three optimization methods:
