27 Mar 2024 • Oliver Klingefjord, Ryan Lowe, Joe Edelman
In this paper, we focus on the first two parts, and ask the question: what are "good" ways to synthesize diverse human inputs about values into a target for aligning language models?