Search Results for author: Anca D. Dragan

Found 62 papers, 15 papers with code

Quantifying Assistive Robustness Via the Natural-Adversarial Frontier

no code implementations • 16 Oct 2023 • Jerry Zhi-Yang He, Zackory Erickson, Daniel S. Brown, Anca D. Dragan

We propose that capturing robustness in these interactive settings requires constructing and analyzing the entire natural-adversarial frontier: the Pareto-frontier of human policies that are the best trade-offs between naturalness and low robot performance.

Confronting Reward Model Overoptimization with Constrained RLHF

1 code implementation • 6 Oct 2023 • Ted Moskovitz, Aaditya K. Singh, DJ Strouse, Tuomas Sandholm, Ruslan Salakhutdinov, Anca D. Dragan, Stephen McAleer

Large language models are typically aligned with human preferences by optimizing $\textit{reward models}$ (RMs) fitted to human feedback.
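The paper addresses overoptimization by imposing constraints tied to component reward models; a constrained RLHF objective of this general shape can be sketched in Lagrangian form (the notation and thresholds $\tau_i$ are illustrative, not the paper's exact formulation):

```latex
\max_{\pi}\ \min_{\lambda \ge 0}\;
\mathbb{E}_{x \sim \mathcal{D},\; y \sim \pi(\cdot \mid x)}
\left[ r_0(x, y) + \sum_i \lambda_i \left( r_i(x, y) - \tau_i \right) \right]
\;-\; \beta\, \mathrm{KL}\!\left( \pi \,\middle\|\, \pi_{\mathrm{ref}} \right)
```

where the $r_i$ are component reward models, the $\lambda_i$ their Lagrange multipliers, and the KL term regularizes the policy toward a reference policy, as in standard RLHF.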

Bootstrapping Adaptive Human-Machine Interfaces with Offline Reinforcement Learning

no code implementations • 7 Sep 2023 • Jensen Gao, Siddharth Reddy, Glen Berseth, Anca D. Dragan, Sergey Levine

We further evaluate on a simulated Sawyer pushing task with eye gaze control, and the Lunar Lander game with simulated user commands, and find that our method improves over baseline interfaces in these domains as well.

Brain Computer Interface, Decision Making, +1

Contextual Reliability: When Different Features Matter in Different Contexts

no code implementations • 19 Jul 2023 • Gaurav Ghosal, Amrith Setlur, Daniel S. Brown, Anca D. Dragan, Aditi Raghunathan

We formalize a new setting called contextual reliability which accounts for the fact that the "right" features to use may vary depending on the context.

Aligning Robot and Human Representations

no code implementations • 3 Feb 2023 • Andreea Bobu, Andi Peng, Pulkit Agrawal, Julie Shah, Anca D. Dragan

To act in the world, robots rely on a representation of salient task aspects: for example, to carry a coffee mug, a robot may consider movement efficiency or mug orientation in its behavior.

Imitation Learning, Representation Learning

Benchmarks and Algorithms for Offline Preference-Based Reward Learning

no code implementations • 3 Jan 2023 • Daniel Shin, Anca D. Dragan, Daniel S. Brown

Learning a reward function from human preferences is challenging as it typically requires having a high-fidelity simulator or using expensive and potentially unsafe actual physical rollouts in the environment.

Active Learning, Offline RL

SIRL: Similarity-based Implicit Representation Learning

no code implementations • 2 Jan 2023 • Andreea Bobu, Yi Liu, Rohin Shah, Daniel S. Brown, Anca D. Dragan

This, in turn, is what enables the robot to disambiguate between what needs to go into the representation versus what is spurious, as well as what aspects of behavior can be compressed together versus not.

Contrastive Learning, Data Augmentation, +1

Learning Representations that Enable Generalization in Assistive Tasks

no code implementations • 5 Dec 2022 • Jerry Zhi-Yang He, Aditi Raghunathan, Daniel S. Brown, Zackory Erickson, Anca D. Dragan

We advocate that generalization to such OOD policies benefits from (1) learning a good latent representation for human policies that test-time humans can accurately be mapped to, and (2) making that representation adaptable with test-time interaction data, instead of relying on it to perfectly capture the space of human policies based on the simulated population only.

The Effect of Modeling Human Rationality Level on Learning Rewards from Multiple Feedback Types

no code implementations • 23 Aug 2022 • Gaurav R. Ghosal, Matthew Zurek, Daniel S. Brown, Anca D. Dragan

In this work, we advocate that grounding the rationality coefficient in real data for each feedback type, rather than assuming a default value, has a significant positive effect on reward learning.

Informativeness, Vocal Bursts Type Prediction

First Contact: Unsupervised Human-Machine Co-Adaptation via Mutual Information Maximization

1 code implementation • 24 May 2022 • Siddharth Reddy, Sergey Levine, Anca D. Dragan

How can we train an assistive human-machine interface (e.g., an electromyography-based limb prosthesis) to translate a user's raw command signals into the actions of a robot or computer when there is no prior mapping, we cannot ask the user for supervision in the form of action labels or reward feedback, and we do not have prior knowledge of the tasks the user is trying to accomplish?
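At a high level, such an interface can be trained by maximizing the mutual information between the user's command signals and the interface's effect on the environment; a schematic objective (my notation, not necessarily the paper's) is:

```latex
\max_{\theta}\; I\!\left( \mathbf{x}_t ;\, \mathbf{s}_{t+1} \,\middle|\, \mathbf{s}_t \right),
\qquad \mathbf{a}_t = f_{\theta}\!\left( \mathbf{x}_t, \mathbf{s}_t \right)
```

i.e., the commands $\mathbf{x}_t$ should be predictive of the next state $\mathbf{s}_{t+1}$ once passed through the learned interface $f_\theta$, even though no action labels or rewards are available.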

Causal Confusion and Reward Misidentification in Preference-Based Reward Learning

no code implementations • 13 Apr 2022 • Jeremy Tien, Jerry Zhi-Yang He, Zackory Erickson, Anca D. Dragan, Daniel S. Brown

While much prior work focuses on causal confusion in reinforcement learning and behavioral cloning, we focus on a systematic study of causal confusion and reward misidentification when learning from preferences.

Imitation Learning

Teaching Robots to Span the Space of Functional Expressive Motion

no code implementations • 4 Mar 2022 • Arjun Sripathy, Andreea Bobu, Zhongyu Li, Koushil Sreenath, Daniel S. Brown, Anca D. Dragan

As a result 1) all user feedback can contribute to learning about every emotion; 2) the robot can generate trajectories for any emotion in the space instead of only a few predefined ones; and 3) the robot can respond emotively to user-generated natural language by mapping it to a target VAD.

ASHA: Assistive Teleoperation via Human-in-the-Loop Reinforcement Learning

no code implementations • 5 Feb 2022 • Sean Chen, Jensen Gao, Siddharth Reddy, Glen Berseth, Anca D. Dragan, Sergey Levine

Building assistive interfaces for controlling robots through arbitrary, high-dimensional, noisy inputs (e.g., webcam images of eye gaze) can be challenging, especially when it involves inferring the user's desired action in the absence of a natural 'default' interface.

Reinforcement Learning (RL)

Inducing Structure in Reward Learning by Learning Features

1 code implementation • 18 Jan 2022 • Andreea Bobu, Marius Wiggert, Claire Tomlin, Anca D. Dragan

To get around this issue, recent deep Inverse Reinforcement Learning (IRL) methods learn rewards directly from the raw state but this is challenging because the robot has to implicitly learn the features that are important and how to combine them, simultaneously.

Assisted Robust Reward Design

no code implementations • 18 Nov 2021 • Jerry Zhi-Yang He, Anca D. Dragan

We contribute an Assisted Reward Design method that speeds up the design process by anticipating and influencing this future evidence: rather than letting the designer eventually encounter failure cases and revise the reward then, the method actively exposes the designer to such environments during the development phase.

Autonomous Driving

Offline Preference-Based Apprenticeship Learning

no code implementations • 20 Jul 2021 • Daniel Shin, Daniel S. Brown, Anca D. Dragan

Learning a reward function from human preferences is challenging as it typically requires having a high-fidelity simulator or using expensive and potentially unsafe actual physical rollouts in the environment.

Active Learning, Offline RL

Pragmatic Image Compression for Human-in-the-Loop Decision-Making

1 code implementation • NeurIPS 2021 • Siddharth Reddy, Anca D. Dragan, Sergey Levine

Standard lossy image compression algorithms aim to preserve an image's appearance, while minimizing the number of bits needed to transmit it.

Car Racing, Decision Making, +1

Physical Interaction as Communication: Learning Robot Objectives Online from Human Corrections

no code implementations • 6 Jul 2021 • Dylan P. Losey, Andrea Bajcsy, Marcia K. O'Malley, Anca D. Dragan

We recognize that physical human-robot interaction (pHRI) is often intentional -- the human intervenes on purpose because the robot is not doing the task correctly.

Policy Gradient Bayesian Robust Optimization for Imitation Learning

no code implementations • 11 Jun 2021 • Zaynah Javed, Daniel S. Brown, Satvik Sharma, Jerry Zhu, Ashwin Balakrishna, Marek Petrik, Anca D. Dragan, Ken Goldberg

Results suggest that PG-BROIL can produce a family of behaviors ranging from risk-neutral to risk-averse and outperforms state-of-the-art imitation learning algorithms when learning from ambiguous demonstrations by hedging against uncertainty, rather than seeking to uniquely identify the demonstrator's reward function.

Imitation Learning

Preference learning along multiple criteria: A game-theoretic perspective

no code implementations • NeurIPS 2020 • Kush Bhatia, Ashwin Pananjady, Peter L. Bartlett, Anca D. Dragan, Martin J. Wainwright

Finally, we showcase the practical utility of our framework in a user study on autonomous driving, where we find that the Blackwell winner outperforms the von Neumann winner for the overall preferences.

Autonomous Driving

Optimal Cost Design for Model Predictive Control

1 code implementation • 23 Apr 2021 • Avik Jain, Lawrence Chan, Daniel S. Brown, Anca D. Dragan

We test our approach in an autonomous driving domain where we find costs different from the ground truth that implicitly compensate for replanning, short horizon, incorrect dynamics models, and local minima issues.

Autonomous Driving, Model Predictive Control

Agnostic learning with unknown utilities

no code implementations • 17 Apr 2021 • Kush Bhatia, Peter L. Bartlett, Anca D. Dragan, Jacob Steinhardt

This raises an interesting question whether learning is even possible in our setup, given that obtaining a generalizable estimate of utility $u^*$ might not be possible from finitely many samples.

Situational Confidence Assistance for Lifelong Shared Autonomy

no code implementations • 14 Apr 2021 • Matthew Zurek, Andreea Bobu, Daniel S. Brown, Anca D. Dragan

Shared autonomy enables robots to infer user intent and assist in accomplishing it.

Dynamically Switching Human Prediction Models for Efficient Planning

no code implementations • 13 Mar 2021 • Arjun Sripathy, Andreea Bobu, Daniel S. Brown, Anca D. Dragan

As environments involving both robots and humans become increasingly common, so does the need to account for people during planning.

Analyzing Human Models that Adapt Online

no code implementations • 9 Mar 2021 • Andrea Bajcsy, Anand Siththaranjan, Claire J. Tomlin, Anca D. Dragan

This enables us to leverage tools from reachability analysis and optimal control to compute the set of hypotheses the robot could learn in finite time, as well as the worst and best-case time it takes to learn them.

Autonomous Driving

On complementing end-to-end human behavior predictors with planning

no code implementations • 9 Mar 2021 • Liting Sun, Xiaogang Jia, Anca D. Dragan

High capacity end-to-end approaches for human motion (behavior) prediction have the ability to represent subtle nuances in human behavior, but struggle with robustness to out of distribution inputs and tail events.

Autonomous Driving, Human Motion Prediction, +2

Value Alignment Verification

1 code implementation • 2 Dec 2020 • Daniel S. Brown, Jordan Schneider, Anca D. Dragan, Scott Niekum

In this paper we formalize and theoretically analyze the problem of efficient value alignment verification: how to efficiently test whether the behavior of another agent is aligned with a human's values.

Autonomous Driving

Assisted Perception: Optimizing Observations to Communicate State

1 code implementation • 6 Aug 2020 • Siddharth Reddy, Sergey Levine, Anca D. Dragan

We evaluate ASE in a user study with 12 participants who each perform four tasks: two tasks with known user biases -- bandwidth-limited image classification and a driving video game with observation delay -- and two with unknown biases that our method has to learn -- guided 2D navigation and a lunar lander teleoperation video game.

Image Classification

Feature Expansive Reward Learning: Rethinking Human Input

1 code implementation • 23 Jun 2020 • Andreea Bobu, Marius Wiggert, Claire Tomlin, Anca D. Dragan

When the correction cannot be explained by these features, recent work in deep Inverse Reinforcement Learning (IRL) suggests that the robot could ask for task demonstrations and recover a reward defined over the raw state space.

Reward-rational (implicit) choice: A unifying formalism for reward learning

no code implementations • NeurIPS 2020 • Hong Jun Jeon, Smitha Milli, Anca D. Dragan

It is often difficult to hand-specify what the correct reward function is for a task, so researchers have instead aimed to learn reward functions from human behavior or feedback.

Quantifying Hypothesis Space Misspecification in Learning from Human-Robot Demonstrations and Physical Corrections

no code implementations • 3 Feb 2020 • Andreea Bobu, Andrea Bajcsy, Jaime F. Fisac, Sampada Deglurkar, Anca D. Dragan

Recent work focuses on how robots can use such input - like demonstrations or corrections - to learn intended objectives.

LESS is More: Rethinking Probabilistic Models of Human Behavior

no code implementations • 13 Jan 2020 • Andreea Bobu, Dexter R. R. Scobee, Jaime F. Fisac, S. Shankar Sastry, Anca D. Dragan

A common model is the Boltzmann noisily-rational decision model, which assumes people approximately optimize a reward function and choose trajectories in proportion to their exponentiated reward.

Econometrics
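For reference, the Boltzmann noisily-rational model referred to above is standardly written (up to normalization; notation mine) as:

```latex
P(\xi \mid \theta) \;\propto\; \exp\!\big( \beta\, R_{\theta}(\xi) \big)
```

where $R_\theta(\xi)$ is the reward of trajectory $\xi$ and $\beta$ is the rationality (inverse-temperature) coefficient; $\beta \to \infty$ recovers a perfectly rational chooser, while $\beta = 0$ yields uniformly random choices.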

Learning Human Objectives by Evaluating Hypothetical Behavior

1 code implementation • ICML 2020 • Siddharth Reddy, Anca D. Dragan, Sergey Levine, Shane Legg, Jan Leike

To address this challenge, we propose an algorithm that safely and interactively learns a model of the user's reward function.

Car Racing

Nonverbal Robot Feedback for Human Teachers

no code implementations • 6 Nov 2019 • Sandy H. Huang, Isabella Huang, Ravi Pandya, Anca D. Dragan

Robots can learn preferences from human demonstrations, but their success depends on how informative these demonstrations are.

A Hamilton-Jacobi Reachability-Based Framework for Predicting and Analyzing Human Motion for Safe Planning

no code implementations • 29 Oct 2019 • Somil Bansal, Andrea Bajcsy, Ellis Ratner, Anca D. Dragan, Claire J. Tomlin

We construct a new continuous-time dynamical system, where the inputs are the observations of human behavior, and the dynamics include how the belief over the model parameters change.

Bayesian Inference, Human Motion Prediction, +1

Scaled Autonomy: Enabling Human Operators to Control Robot Fleets

no code implementations • 22 Sep 2019 • Gokul Swamy, Siddharth Reddy, Sergey Levine, Anca D. Dragan

We learn a model of the user's preferences from observations of the user's choices in easy settings with a few robots, and use it in challenging settings with more robots to automatically identify which robot the user would most likely choose to control, if they were able to evaluate the states of all robots at all times.

Robot Navigation

Efficient Iterative Linear-Quadratic Approximations for Nonlinear Multi-Player General-Sum Differential Games

1 code implementation • 10 Sep 2019 • David Fridovich-Keil, Ellis Ratner, Anca D. Dragan, Claire J. Tomlin

We benchmark our method in a three-player general-sum simulated example, in which it takes <0.75 s to identify a solution and <50 ms to solve warm-started subproblems in a receding horizon.

Systems and Control, Robotics

Bayesian Robustness: A Nonasymptotic Viewpoint

no code implementations • 27 Jul 2019 • Kush Bhatia, Yi-An Ma, Anca D. Dragan, Peter L. Bartlett, Michael I. Jordan

We study the problem of robustly estimating the posterior distribution for the setting where observed data can be contaminated with potentially adversarial outliers.

Binary Classification, Regression

On the Feasibility of Learning, Rather than Assuming, Human Biases for Reward Inference

no code implementations • 23 Jun 2019 • Rohin Shah, Noah Gundotra, Pieter Abbeel, Anca D. Dragan

But in the era of deep learning, a natural suggestion researchers make is to avoid mathematical models of human behavior that are fraught with specific assumptions, and instead use a purely data-driven approach.

An Extensible Interactive Interface for Agent Design

no code implementations • 6 Jun 2019 • Matthew Rahtz, James Fang, Anca D. Dragan, Dylan Hadfield-Menell

In deep reinforcement learning, for example, directly specifying a reward as a function of a high-dimensional observation is challenging.

Reinforcement Learning (RL)

SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards

5 code implementations • ICLR 2020 • Siddharth Reddy, Anca D. Dragan, Sergey Levine

Theoretically, we show that SQIL can be interpreted as a regularized variant of BC that uses a sparsity prior to encourage long-horizon imitation.

Imitation Learning, Q-Learning, +2
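SQIL's core idea is off-policy RL with rewards fixed to +1 on demonstration transitions and 0 on the agent's own transitions, sampled in equal proportion. A minimal tabular sketch of that reward relabeling (the toy chain MDP, hyperparameters, and use of ordinary hard-max Q-learning in place of soft Q-learning are my simplifications, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 5, 2
Q = np.zeros((n_states, n_actions))
gamma, alpha = 0.9, 0.1

# Hypothetical demonstration transitions (s, a, s'): SQIL fixes their reward to +1.
demos = [(0, 1, 1), (1, 1, 2), (2, 1, 3), (3, 1, 4)]
# Hypothetical agent-collected transitions (here: self-loops): reward fixed to 0.
agent = [(0, 0, 0), (1, 0, 1), (2, 0, 2), (3, 0, 3)]

for _ in range(2000):
    # Sample demo and agent experience in equal proportion, as SQIL prescribes.
    for (s, a, s2), r in [(demos[rng.integers(len(demos))], 1.0),
                          (agent[rng.integers(len(agent))], 0.0)]:
        target = r + gamma * Q[s2].max()
        Q[s, a] += alpha * (target - Q[s, a])

# The greedy policy in states 0-3 recovers the demonstrated action.
print(np.argmax(Q, axis=1)[:4])  # prints [1 1 1 1]
```

The constant +1/0 rewards act as the sparsity prior mentioned in the snippet: the agent is pushed back toward states and actions that appear in the demonstrations, without any learned reward model.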

Literal or Pedagogic Human? Analyzing Human Model Misspecification in Objective Learning

no code implementations • 9 Mar 2019 • Smitha Milli, Anca D. Dragan

In this work, we focus on misspecification: we argue that robots might not know whether people are being pedagogic or literal and that it is important to ask which assumption is safer to make.

Human-AI Learning Performance in Multi-Armed Bandits

no code implementations • 21 Dec 2018 • Ravi Pandya, Sandy H. Huang, Dylan Hadfield-Menell, Anca D. Dragan

People frequently face challenging decision-making problems in which outcomes are uncertain or unknown.

Decision Making, Multi-Armed Bandits

Establishing Appropriate Trust via Critical States

no code implementations • 18 Oct 2018 • Sandy H. Huang, Kush Bhatia, Pieter Abbeel, Anca D. Dragan

In order to effectively interact with or supervise a robot, humans need to have an accurate mental model of its capabilities and how it acts.

Robotics

Hierarchical Game-Theoretic Planning for Autonomous Vehicles

no code implementations • 13 Oct 2018 • Jaime F. Fisac, Eli Bronstein, Elis Stefansson, Dorsa Sadigh, S. Shankar Sastry, Anca D. Dragan

This mutual dependence, best captured by dynamic game theory, creates a strong coupling between the vehicle's planning and its predictions of other drivers' behavior, and constitutes an open problem with direct implications on the safety and viability of autonomous driving technology.

Autonomous Driving, Decision Making, +1

Learning under Misspecified Objective Spaces

1 code implementation • 11 Oct 2018 • Andreea Bobu, Andrea Bajcsy, Jaime F. Fisac, Anca D. Dragan

Learning robot objective functions from human input has become increasingly important, but state-of-the-art techniques assume that the human's desired objective lies within the robot's hypothesis space.

What Would pi* Do?: Imitation Learning via Off-Policy Reinforcement Learning

no code implementations • 27 Sep 2018 • Siddharth Reddy, Anca D. Dragan, Sergey Levine

Learning to imitate expert actions given demonstrations containing image observations is a difficult problem in robotic control.

Imitation Learning, Q-Learning, +2

Cost Functions for Robot Motion Style

1 code implementation • 1 Sep 2018 • Allan Zhou, Anca D. Dragan

We focus on autonomously generating robot motion for day to day physical tasks that is expressive of a certain style or emotion.

Robotics

The Social Cost of Strategic Classification

no code implementations • 25 Aug 2018 • Smitha Milli, John Miller, Anca D. Dragan, Moritz Hardt

Consequential decision-making typically incentivizes individuals to behave strategically, tailoring their behavior to the specifics of the decision rule.

Classification, Decision Making, +2

Courteous Autonomous Cars

no code implementations • 8 Aug 2018 • Liting Sun, Wei Zhan, Masayoshi Tomizuka, Anca D. Dragan

Such a courtesy term enables the robot car to be aware of possible irrationality of the human behavior, and plan accordingly.

Model Reconstruction from Model Explanations

no code implementations • 13 Jul 2018 • Smitha Milli, Ludwig Schmidt, Anca D. Dragan, Moritz Hardt

We show through theory and experiment that gradient-based explanations of a model quickly reveal the model itself.

An Efficient, Generalized Bellman Update For Cooperative Inverse Reinforcement Learning

no code implementations • ICML 2018 • Dhruv Malik, Malayandi Palaniappan, Jaime F. Fisac, Dylan Hadfield-Menell, Stuart Russell, Anca D. Dragan

We apply this update to a variety of POMDP solvers and find that it enables us to scale CIRL to non-trivial problems, with larger reward parameter spaces, and larger action spaces for both robot and human.

Reinforcement Learning (RL)

Simplifying Reward Design through Divide-and-Conquer

no code implementations • 7 Jun 2018 • Ellis Ratner, Dylan Hadfield-Menell, Anca D. Dragan

Designing a good reward function is essential to robot planning and reinforcement learning, but it can also be challenging and frustrating.

Motion Planning

Where Do You Think You're Going?: Inferring Beliefs about Dynamics from Behavior

1 code implementation • NeurIPS 2018 • Siddharth Reddy, Anca D. Dragan, Sergey Levine

Inferring intent from observed behavior has been studied extensively within the frameworks of Bayesian inverse planning and inverse reinforcement learning.

Reinforcement Learning (RL)

Generating Plans that Predict Themselves

no code implementations • 14 Feb 2018 • Jaime F. Fisac, Chang Liu, Jessica B. Hamrick, S. Shankar Sastry, J. Karl Hedrick, Thomas L. Griffiths, Anca D. Dragan

We introduce $t$-predictability: a measure that quantifies the accuracy and confidence with which human observers can predict the remaining robot plan from the overall task goal and the observed initial $t$ actions in the plan.

Shared Autonomy via Deep Reinforcement Learning

1 code implementation • 6 Feb 2018 • Siddharth Reddy, Anca D. Dragan, Sergey Levine

In shared autonomy, user input is combined with semi-autonomous control to achieve a common goal.

Reinforcement Learning (RL)

Pragmatic-Pedagogic Value Alignment

no code implementations • 20 Jul 2017 • Jaime F. Fisac, Monica A. Gates, Jessica B. Hamrick, Chang Liu, Dylan Hadfield-Menell, Malayandi Palaniappan, Dhruv Malik, S. Shankar Sastry, Thomas L. Griffiths, Anca D. Dragan

In robotics, value alignment is key to the design of collaborative robots that can integrate into human workflows, successfully inferring and adapting to their users' objectives as they go.

Decision Making

Enabling Robots to Communicate their Objectives

no code implementations • 11 Feb 2017 • Sandy H. Huang, David Held, Pieter Abbeel, Anca D. Dragan

We show that certain approximate-inference models lead to the robot generating example behaviors that better enable users to anticipate what it will do in novel situations.

Autonomous Driving
