Search Results for author: Micah Carroll

Found 9 papers, 2 papers with code

Who Needs to Know? Minimal Knowledge for Optimal Coordination

no code implementations 15 Jun 2023 Niklas Lauffer, Ameesh Shah, Micah Carroll, Michael Dennis, Stuart Russell

We apply this algorithm to analyze the strategically relevant information for tasks in both a standard and a partially observable version of the Overcooked environment.

Time-Efficient Reward Learning via Visually Assisted Cluster Ranking

no code implementations 30 Nov 2022 David Zhang, Micah Carroll, Andreea Bobu, Anca Dragan

One of the most successful paradigms for reward learning uses human feedback in the form of comparisons.

Dimensionality Reduction

UniMASK: Unified Inference in Sequential Decision Problems

1 code implementation 20 Nov 2022 Micah Carroll, Orr Paradise, Jessy Lin, Raluca Georgescu, Mingfei Sun, David Bignell, Stephanie Milani, Katja Hofmann, Matthew Hausknecht, Anca Dragan, Sam Devlin

Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks.

Decision Making

Optimal Behavior Prior: Data-Efficient Human Models for Improved Human-AI Collaboration

no code implementations 3 Nov 2022 Mesut Yang, Micah Carroll, Anca Dragan

We show that using optimal behavior as a prior for human models makes these models vastly more data-efficient and able to generalize to new environments.

Estimating and Penalizing Induced Preference Shifts in Recommender Systems

no code implementations 25 Apr 2022 Micah Carroll, Anca Dragan, Stuart Russell, Dylan Hadfield-Menell

These steps involve two challenging ingredients. Estimation requires anticipating how hypothetical algorithms would influence user preferences if deployed; we do this by using historical user interaction data to train a predictive user model that implicitly contains their preference dynamics. Evaluation and optimization additionally require metrics to assess whether such influences are manipulative or otherwise unwanted; we use the notion of "safe shifts", which define a trust region within which behavior is safe. For instance, the natural way in which users would shift without interference from the system could be deemed "safe".

Recommendation Systems

Evaluating the Robustness of Collaborative Agents

no code implementations 14 Jan 2021 Paul Knott, Micah Carroll, Sam Devlin, Kamil Ciosek, Katja Hofmann, A. D. Dragan, Rohin Shah

We apply this methodology to build a suite of unit tests for the Overcooked-AI environment, and use this test suite to evaluate three proposals for improving robustness.

On the Utility of Learning about Humans for Human-AI Coordination

2 code implementations NeurIPS 2019 Micah Carroll, Rohin Shah, Mark K. Ho, Thomas L. Griffiths, Sanjit A. Seshia, Pieter Abbeel, Anca Dragan

While we would like agents that can coordinate with humans, current algorithms such as self-play and population-based training create agents that can coordinate with themselves.
