1 Sep 2023 • Harrison Lee, Samrat Phatale, Hassan Mansoor, Thomas Mesnard, Johan Ferret, Kellie Lu, Colton Bishop, Ethan Hall, Victor Carbune, Abhinav Rastogi, Sushant Prakash
Reinforcement learning from human feedback (RLHF) has proven effective in aligning large language models (LLMs) with human preferences.
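As background (not taken from the paper itself): RLHF typically trains a reward model on pairs of responses ranked by humans, using a Bradley–Terry-style preference loss, and then optimizes the policy against that reward model. Below is a minimal sketch of that pairwise preference loss, assuming PyTorch and hypothetical scalar rewards already produced by a reward model; it illustrates the standard formulation, not this paper's specific method.

```python
import torch
import torch.nn.functional as F

def preference_loss(chosen_rewards: torch.Tensor,
                    rejected_rewards: torch.Tensor) -> torch.Tensor:
    """Negative log-likelihood that the chosen response outranks the rejected one
    under a Bradley-Terry preference model (standard RLHF reward-model loss)."""
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Hypothetical scalar rewards for a batch of 3 preference pairs.
chosen = torch.tensor([1.2, 0.4, 2.0])
rejected = torch.tensor([0.3, 0.9, 1.1])
loss = preference_loss(chosen, rejected)  # lower when chosen reliably outscores rejected
```

The loss drives the reward gap between preferred and dispreferred responses upward; the trained reward model then supplies the scalar signal that a policy-gradient method (commonly PPO) maximizes during fine-tuning.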