no code implementations • 27 Jul 2023 • Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Wang, Samuel Marks, Charbel-Raphaël Segerie, Micah Carroll, Andi Peng, Phillip Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J. Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, Lauro Langosco, Peter Hase, Erdem Biyik, Anca Dragan, David Krueger, Dorsa Sadigh, Dylan Hadfield-Menell
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems to align with human goals.
1 code implementation • 3 Jul 2022 • Aditya Chattopadhyay, Stewart Slocum, Benjamin D. Haeffele, Rene Vidal, Donald Geman
There is a growing concern about typically opaque decision-making with high-performance machine learning algorithms.
1 code implementation • 5 Oct 2020 • Sam Sinai, Richard Wang, Alexander Whatley, Stewart Slocum, Elina Locane, Eric D. Kelsic
In this work, we implement an open-source Fitness Landscape EXploration Sandbox (FLEXS: github.com/samsinai/FLEXS) environment to test and evaluate these algorithms based on their optimality, consistency, and robustness.