From b05e6a885c4d099d8f70cb35eefc260bc541101c Mon Sep 17 00:00:00 2001
From: trigaten
Date: Thu, 2 Jan 2025 22:10:42 -0500
Subject: [PATCH] typo (#26)

---
 chapters/01-introduction.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/chapters/01-introduction.md b/chapters/01-introduction.md
index 815338b..53e3283 100644
--- a/chapters/01-introduction.md
+++ b/chapters/01-introduction.md
@@ -14,7 +14,7 @@ This book details key decisions and basic implementation examples for each step
 RLHF has been applied to many domains successfully, with complexity increasing as the techniques have matured. Early breakthrough experiments with RLHF were applied to deep reinforcement learning [@christiano2017deep], summarization [@stiennon2020learning], following instructions [@ouyang2022training], parsing web information for question answering [@nakano2021webgpt], and "alignment" [@bai2022training].
-In modern language model training, RLHF is one component on post-training.
+In modern language model training, RLHF is one component of post-training.
 Post-training is a more complete set of techniques and best-practices to make language models more useful for downstream tasks [@lambert2024t].
 Post-training can be summarized as using three optimization methods: