Search Results for author: Daniel Ziegler

Found 1 paper, 1 paper with code

Learning to summarize with human feedback

1 code implementation • NeurIPS 2020 • Nisan Stiennon, Long Ouyang, Jeffrey Wu, Daniel Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, Paul F. Christiano

We collect a large, high-quality dataset of human comparisons between summaries, train a model to predict the human-preferred summary, and use that model as a reward function to fine-tune a summarization policy using reinforcement learning.
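
The middle step of the abstract, training a model to predict the human-preferred summary, can be illustrated with a pairwise logistic loss. The sketch below is a minimal illustration, not the paper's released code: it assumes the reward model outputs one scalar score per summary, and the names preference_loss and mean_preference_loss are hypothetical. The loss is small when the human-preferred summary scores higher than the rejected one.

```python
import math


def preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Pairwise logistic loss: -log(sigmoid(reward_preferred - reward_rejected)).

    Assumption: each reward is a scalar score that a (hypothetical) reward
    model assigned to one summary of the same post.
    """
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign of the
    # margin so that exp() never receives a large positive argument.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def mean_preference_loss(pairs):
    """Average the pairwise loss over (preferred, rejected) score pairs."""
    return sum(preference_loss(p, r) for p, r in pairs) / len(pairs)


if __name__ == "__main__":
    # Illustrative scores only, not data from the paper.
    example_pairs = [(1.5, 0.2), (0.1, 0.4), (2.0, -1.0)]
    print(mean_preference_loss(example_pairs))
```

Minimizing this loss over the comparison dataset pushes the score of each human-preferred summary above the score of its alternative; the trained scorer can then serve as the reward signal for the reinforcement learning step.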
