1 code implementation • 17 Oct 2023 • Dongyoung Go, Tomasz Korbak, Germán Kruszewski, Jos Rozen, Marc Dymetman
As language models (LMs) become more capable, it is increasingly important to align them with human preferences.
1 code implementation • 16 Feb 2023 • Dongyoung Go, Tomasz Korbak, Germán Kruszewski, Jos Rozen, Nahyeon Ryu, Marc Dymetman
We show that Jensen-Shannon divergence strikes a good balance between these objectives, and frequently outperforms forward KL divergence by a wide margin, leading to significant improvements over prior work.