no code implementations • 1 Jan 2023 • Ruibo Liu, Chenyan Jia, Ge Zhang, Ziyu Zhuang, Tony X Liu, Soroush Vosoughi
We present Second Thought, a new learning paradigm that enables language models (LMs) to re-align with human values.
Reinforcement Learning (RL)