- (2023-05) LIMA: Less Is More for Alignment paper
- (2023-05) RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs paper
- (2023-05) Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision paper
- (2023-05) Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback paper
- (2023-04) Fundamental Limitations of Alignment in Large Language Models paper