diff --git a/chapters/01-introduction.md b/chapters/01-introduction.md
index 0e9b9ac..34091c5 100644
--- a/chapters/01-introduction.md
+++ b/chapters/01-introduction.md
@@ -14,7 +14,7 @@ This book details key decisions and basic implementation examples for each step
 RLHF has been applied to many domains successfully, with complexity increasing as the techniques have matured.
 Early breakthrough experiments with RLHF were applied to deep reinforcement learning [@christiano2017deep], summarization [@stiennon2020learning], follow instructions [@ouyang2022training], parse web information for question answering [@nakano2021webgpt], and "alignment" [@bai2022training].
 
-In modern language model training, RLHF is one component on post-training.
+In modern language model training, RLHF is one component of post-training.
 Post-training is a more complete set of techniques and best-practices to make language models more useful for downstream tasks [@lambert2024t].
 Post-training can be summarized as using three optimization methods:
 
@@ -166,4 +166,4 @@ $$ y = mx + b $$ {#eq:equation}
 | 1 | BBB |
 | ... | ... |
 
-Table: This is an example table. {#tbl:table} -->
\ No newline at end of file
+Table: This is an example table. {#tbl:table} -->