no code implementations • 26 Jan 2024 • Md Mushfiqur Rahman, Mohammad Sabik Irbaz, Kai North, Michelle S. Williams, Marcos Zampieri, Kevin Lybarger
Our novel RLHF reward function outperformed existing reward functions for RL-based text simplification.