no code implementations • 10 Apr 2024 • Zohre Karimi, Shing-Hei Ho, Bao Thach, Alan Kuntz, Daniel S. Brown
This paper introduces a sample-efficient method that learns a robust reward function from a limited number of ranked suboptimal demonstrations consisting of partial-view point cloud observations.
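For intuition, this line of work typically trains the reward network with a pairwise ranking loss over trajectories. Below is a minimal sketch of that Bradley-Terry-style loss in PyTorch; the placeholder MLP stands in for the paper's partial-view point cloud encoder, and all names are illustrative.

```python
# Minimal sketch of a pairwise ranking loss for reward learning from
# ranked demonstrations. The reward network here is a placeholder MLP;
# a point cloud encoder would replace it in the paper's setting.
import torch
import torch.nn as nn

class RewardNet(nn.Module):
    def __init__(self, obs_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, traj: torch.Tensor) -> torch.Tensor:
        # traj: (T, obs_dim) -> scalar predicted return (sum of per-step rewards)
        return self.net(traj).sum()

def ranking_loss(reward_net, traj_worse, traj_better):
    # Bradley-Terry likelihood that the higher-ranked trajectory wins.
    returns = torch.stack([reward_net(traj_worse), reward_net(traj_better)])
    # Cross-entropy with the "better" trajectory (index 1) as the label.
    return nn.functional.cross_entropy(returns.unsqueeze(0), torch.tensor([1]))

# Usage with random stand-in trajectories:
net = RewardNet(obs_dim=8)
worse, better = torch.randn(50, 8), torch.randn(50, 8)
loss = ranking_loss(net, worse, better)
loss.backward()
```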
no code implementations • 25 Oct 2023 • Connor Mattson, Jeremy C. Clark, Daniel S. Brown
We study the problem of determining the emergent behaviors that are possible given a functionally heterogeneous swarm of robots with limited capabilities.
no code implementations • 16 Oct 2023 • Jerry Zhi-Yang He, Zackory Erickson, Daniel S. Brown, Anca D. Dragan
We propose that capturing robustness in these interactive settings requires constructing and analyzing the entire natural-adversarial frontier: the Pareto frontier of human policies that best trade off naturalness against low robot performance.
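As a rough illustration, the sketch below extracts such a frontier from a pool of candidate human policies, assuming each policy already carries a naturalness score and an induced robot-performance score; the scores are random stand-ins, not the paper's learned models.

```python
# Hedged sketch of extracting a natural-adversarial frontier from scored
# candidate human policies. Scores are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
naturalness = rng.random(100)   # higher is more human-like
robot_perf = rng.random(100)    # lower means more adversarial

def pareto_frontier(nat, perf):
    # Keep policies not dominated by any other: a policy is dominated if
    # another is at least as natural AND induces at-most-equal robot
    # performance, with at least one strict inequality.
    keep = []
    for i in range(len(nat)):
        dominated = np.any(
            (nat >= nat[i]) & (perf <= perf[i])
            & ((nat > nat[i]) | (perf < perf[i]))
        )
        if not dominated:
            keep.append(i)
    return np.array(keep)

frontier = pareto_frontier(naturalness, robot_perf)
print(f"{len(frontier)} of 100 candidate policies lie on the frontier")
```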
no code implementations • 19 Jul 2023 • Gaurav Ghosal, Amrith Setlur, Daniel S. Brown, Anca D. Dragan, Aditi Raghunathan
We formalize a new setting called contextual reliability which accounts for the fact that the "right" features to use may vary depending on the context.
no code implementations • 22 Jun 2023 • Akansha Kalra, Daniel S. Brown
There is an increasing interest in learning reward functions that model human preferences.
no code implementations • 25 Apr 2023 • Connor Mattson, Daniel S. Brown
We combine our learned similarity metric with novelty search and clustering to explore and categorize the space of possible swarm behaviors.
no code implementations • 11 Jan 2023 • Yi Liu, Gaurav Datta, Ellen Novoseller, Daniel S. Brown
In particular, we provide evidence that a learned dynamics model offers the following benefits when performing PbRL: (1) preference elicitation and policy optimization require significantly fewer environment interactions than model-free PbRL, (2) diverse preference queries can be synthesized safely and efficiently as a byproduct of standard model-based RL, and (3) reward pre-training based on suboptimal demonstrations can be performed without any environmental interaction.
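To make benefit (2) concrete, the toy sketch below synthesizes candidate trajectories entirely inside a stand-in learned dynamics model and selects the preference query a reward ensemble disagrees on most; every model and name here is illustrative, not the paper's implementation.

```python
# Hedged sketch: synthesizing preference queries from a learned dynamics
# model rather than the real environment. Models are random stand-ins.
import numpy as np

rng = np.random.default_rng(1)

def dynamics_model(state, action):
    # Stand-in for a learned model f(s, a) -> s'.
    return state + 0.1 * action

def rollout(state, horizon=10):
    traj = [state]
    for _ in range(horizon):
        action = rng.normal(size=state.shape)
        state = dynamics_model(state, action)
        traj.append(state)
    return np.stack(traj)

# Ensemble of linear reward stand-ins; disagreement drives query selection.
reward_weights = rng.normal(size=(5, 4))  # 5 ensemble members, 4-dim states

def ensemble_returns(traj):
    return traj.sum(axis=0) @ reward_weights.T  # one return per member

# Generate candidates entirely inside the model (no real environment
# interaction), then query the pair the ensemble disagrees on most.
candidates = [rollout(rng.normal(size=4)) for _ in range(20)]
returns = np.stack([ensemble_returns(t) for t in candidates])  # (20, 5)
disagreement = returns.std(axis=1)
i, j = np.argsort(disagreement)[-2:]
print(f"query the human about trajectories {i} and {j}")
```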
no code implementations • 3 Jan 2023 • Daniel Shin, Anca D. Dragan, Daniel S. Brown
Learning a reward function from human preferences is challenging as it typically requires having a high-fidelity simulator or using expensive and potentially unsafe actual physical rollouts in the environment.
no code implementations • 2 Jan 2023 • Andreea Bobu, Yi Liu, Rohin Shah, Daniel S. Brown, Anca D. Dragan
This, in turn, is what enables the robot to disambiguate between what needs to go into the representation versus what is spurious, as well as what aspects of behavior can be compressed together versus not.
no code implementations • 5 Dec 2022 • Jerry Zhi-Yang He, Aditi Raghunathan, Daniel S. Brown, Zackory Erickson, Anca D. Dragan
We advocate that generalization to such OOD policies benefits from (1) learning a good latent representation for human policies that test-time humans can accurately be mapped to, and (2) making that representation adaptable with test-time interaction data, instead of relying on it to perfectly capture the space of human policies based on the simulated population only.
no code implementations • 28 Nov 2022 • Tu Trinh, Haoyu Chen, Daniel S. Brown
We evaluate our approach in simulation for both discrete and continuous state-space domains and illustrate the feasibility of developing a robotic system that can accurately evaluate demonstration sufficiency.
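One way to pose such a sufficiency test, sketched below under loose assumptions: sample reward functions from the current posterior, estimate how much the learned policy could be losing under each sample, and declare the demonstrations sufficient once a high-confidence bound on that loss falls below a tolerance. The loss samples here are stand-ins, and the exact criterion is illustrative rather than the paper's.

```python
# Hedged sketch of a demonstration-sufficiency test: stop asking for
# demonstrations once a 95% value-at-risk bound on policy loss under the
# reward posterior drops below a tolerance. Loss samples are stand-ins.
import numpy as np

rng = np.random.default_rng(8)
# Policy-loss samples under the reward posterior (e.g., expected value
# difference versus the optimal policy for each sampled reward).
loss_samples = np.abs(rng.normal(scale=0.1, size=500))

EPSILON, CONFIDENCE = 0.2, 0.95
var_bound = np.quantile(loss_samples, CONFIDENCE)  # 95th-percentile loss
sufficient = var_bound < EPSILON
print(f"95% VaR on policy loss = {var_bound:.3f}; sufficient: {sufficient}")
```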
no code implementations • 14 Oct 2022 • Albert Wilcox, Ashwin Balakrishna, Jules Dedieu, Wyame Benslimane, Daniel S. Brown, Ken Goldberg
Providing densely shaped reward functions for RL algorithms is often exceedingly challenging, motivating the development of RL algorithms that can learn from easier-to-specify sparse reward functions.
no code implementations • 23 Aug 2022 • Gaurav R. Ghosal, Matthew Zurek, Daniel S. Brown, Anca D. Dragan
In this work, we advocate that grounding the rationality coefficient in real data for each feedback type, rather than assuming a default value, has a significant positive effect on reward learning.
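For intuition about the rationality coefficient, here is a minimal sketch of the Boltzmann-rational feedback model it parameterizes: a larger beta means the human picks the truly better option more reliably. The values below are illustrative; fitting beta per feedback type by maximum likelihood, as advocated here, replaces a hard-coded default.

```python
# Hedged sketch of a Boltzmann-rational choice model with rationality
# coefficient beta. Returns and choices are illustrative stand-ins.
import numpy as np

def choice_likelihood(returns, chosen, beta):
    # P(choice) under a Boltzmann model over candidate returns.
    logits = beta * np.asarray(returns)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return probs[chosen]

returns = [1.0, 2.0]  # candidate trajectory returns
for beta in (0.1, 1.0, 10.0):
    p = choice_likelihood(returns, chosen=1, beta=beta)
    print(f"beta={beta:4}: P(picks better option) = {p:.3f}")
# Fitting beta: maximize the product of these likelihoods over a dataset
# of observed choices for each feedback type (demos, comparisons, etc.).
```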
no code implementations • 13 Apr 2022 • Jeremy Tien, Jerry Zhi-Yang He, Zackory Erickson, Anca D. Dragan, Daniel S. Brown
While much prior work focuses on causal confusion in reinforcement learning and behavioral cloning, we focus on a systematic study of causal confusion and reward misidentification when learning from preferences.
no code implementations • 4 Mar 2022 • Arjun Sripathy, Andreea Bobu, Zhongyu Li, Koushil Sreenath, Daniel S. Brown, Anca D. Dragan
As a result, 1) all user feedback can contribute to learning about every emotion; 2) the robot can generate trajectories for any emotion in the space instead of only a few predefined ones; and 3) the robot can respond emotively to user-generated natural language by mapping it to a target VAD.
no code implementations • 17 Sep 2021 • Ryan Hoque, Ashwin Balakrishna, Ellen Novoseller, Albert Wilcox, Daniel S. Brown, Ken Goldberg
Effective robot learning often requires online human feedback and interventions that can cost significant human time, giving rise to the central challenge in interactive imitation learning: is it possible to control the timing and length of interventions to both facilitate learning and limit the burden on the human supervisor?
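One common gating criterion in this vein, sketched below with illustrative thresholds and a stand-in ensemble (not necessarily the paper's exact rule): request human control only when an ensemble of policies disagrees strongly, and release it once disagreement falls again, with hysteresis to avoid thrashing.

```python
# Hedged sketch of intervention timing via ensemble disagreement with
# hysteresis thresholds. Ensemble, thresholds, and states are stand-ins.
import numpy as np

rng = np.random.default_rng(2)
ensemble = rng.normal(size=(5, 2, 4))  # 5 linear policies: state(4) -> action(2)

def disagreement(state):
    actions = ensemble @ state           # (5, 2) proposed actions
    return actions.std(axis=0).mean()    # spread across ensemble members

ASK_THRESHOLD, RELEASE_THRESHOLD = 0.8, 0.4  # hysteresis limits burden
human_in_control = False
for step in range(100):
    state = rng.normal(size=4)
    d = disagreement(state)
    if not human_in_control and d > ASK_THRESHOLD:
        human_in_control = True          # start an intervention
    elif human_in_control and d < RELEASE_THRESHOLD:
        human_in_control = False         # end it, limiting human time
```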
no code implementations • 20 Jul 2021 • Daniel Shin, Daniel S. Brown, Anca D. Dragan
Learning a reward function from human preferences is challenging as it typically requires having a high-fidelity simulator or using expensive and potentially unsafe actual physical rollouts in the environment.
1 code implementation • 13 Jul 2021 • Shivin Devgon, Jeffrey Ichnowski, Michael Danielczuk, Daniel S. Brown, Ashwin Balakrishna, Shirin Joshi, Eduardo M. C. Rocha, Eugen Solowjow, Ken Goldberg
In industrial part kitting, 3D objects are inserted into cavities for transportation or subsequent assembly.
no code implementations • 11 Jun 2021 • Zaynah Javed, Daniel S. Brown, Satvik Sharma, Jerry Zhu, Ashwin Balakrishna, Marek Petrik, Anca D. Dragan, Ken Goldberg
Results suggest that PG-BROIL can produce a family of behaviors ranging from risk-neutral to risk-averse and outperforms state-of-the-art imitation learning algorithms when learning from ambiguous demonstrations by hedging against uncertainty, rather than seeking to uniquely identify the demonstrator's reward function.
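The risk-neutral-to-risk-averse family comes from a soft-robust objective over the reward posterior. Below is a hedged numerical sketch of that blending, assuming posterior return samples (random stand-ins here): lambda = 1 recovers the risk-neutral mean, lambda = 0 the fully risk-averse CVaR.

```python
# Hedged sketch of a soft-robust objective: blend the expected return
# under the reward posterior with CVaR (mean of the worst alpha-fraction
# of posterior returns). Posterior samples are random stand-ins.
import numpy as np

rng = np.random.default_rng(3)
posterior_returns = rng.normal(loc=1.0, scale=2.0, size=1000)

def soft_robust_objective(returns, lam=0.5, alpha=0.05):
    worst = np.sort(returns)[: max(1, int(alpha * len(returns)))]
    cvar = worst.mean()  # expected return in the worst tail
    return lam * returns.mean() + (1 - lam) * cvar

for lam in (1.0, 0.5, 0.0):
    obj = soft_robust_objective(posterior_returns, lam)
    print(f"lambda={lam}: objective = {obj:.2f}")
```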
1 code implementation • 23 Apr 2021 • Avik Jain, Lawrence Chan, Daniel S. Brown, Anca D. Dragan
We test our approach in an autonomous driving domain where we find costs different from the ground truth that implicitly compensate for replanning, short horizon, incorrect dynamics models, and local minima issues.
no code implementations • 14 Apr 2021 • Matthew Zurek, Andreea Bobu, Daniel S. Brown, Anca D. Dragan
Shared autonomy enables robots to infer user intent and assist in accomplishing it.
no code implementations • 31 Mar 2021 • Ryan Hoque, Ashwin Balakrishna, Carl Putterman, Michael Luo, Daniel S. Brown, Daniel Seita, Brijen Thananjeyan, Ellen Novoseller, Ken Goldberg
Corrective interventions while a robot is learning to automate a task provide an intuitive method for a human supervisor to assist the robot and convey information about desired behavior.
no code implementations • 13 Mar 2021 • Arjun Sripathy, Andreea Bobu, Daniel S. Brown, Anca D. Dragan
As environments involving both robots and humans become increasingly common, so does the need to account for people during planning.
1 code implementation • 2 Dec 2020 • Daniel S. Brown, Jordan Schneider, Anca D. Dragan, Scott Niekum
In this paper we formalize and theoretically analyze the problem of efficient value alignment verification: how to efficiently test whether the behavior of another agent is aligned with a human's values.
no code implementations • 11 Nov 2020 • Michael Danielczuk, Ashwin Balakrishna, Daniel S. Brown, Shivin Devgon, Ken Goldberg
However, these policies can consistently fail to grasp challenging objects that are significantly out of the distribution of the training data or that have very few high-quality grasps.
1 code implementation • NeurIPS 2020 • Daniel S. Brown, Scott Niekum, Marek Petrik
Existing safe imitation learning approaches based on IRL deal with this uncertainty using a maxmin framework that optimizes a policy under the assumption of an adversarial reward function, whereas risk-neutral IRL approaches optimize a policy for either the mean or the MAP reward function.
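A quick numerical contrast of these objectives, evaluated on stand-in samples from a reward posterior: maxmin scores a policy by its single worst-case reward function, risk-neutral approaches by the mean (or the one MAP sample), and CVaR sits in between by averaging only the worst tail.

```python
# Hedged comparison of maxmin, CVaR, and risk-neutral objectives on a
# policy's return distribution under posterior samples (stand-ins).
import numpy as np

rng = np.random.default_rng(4)
returns = rng.normal(loc=1.0, scale=2.0, size=1000)  # return per posterior sample

sorted_r = np.sort(returns)
alpha = 0.05
print(f"maxmin (worst case):  {sorted_r[0]:.2f}")
print(f"CVaR   (worst 5%):    {sorted_r[: int(alpha * len(returns))].mean():.2f}")
print(f"mean   (risk-neutral): {returns.mean():.2f}")
```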
1 code implementation • ICML 2020 • Daniel S. Brown, Russell Coleman, Ravi Srinivasan, Scott Niekum
Bayesian REX can learn to play Atari games from demonstrations, without access to the game score, and can generate 100,000 samples from the posterior over reward functions in only 5 minutes on a personal laptop.
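A hedged sketch of why this sampling is fast: with a pre-trained embedding, each trajectory reduces to a fixed feature vector, so MCMC over linear reward weights needs no environment access at all. The features and preference labels below are random stand-ins.

```python
# Hedged sketch of MCMC over linear reward weights on precomputed
# trajectory features, with a Bradley-Terry preference likelihood.
import numpy as np

rng = np.random.default_rng(5)
feats = rng.normal(size=(50, 16))  # precomputed trajectory features
prefs = [(i, j) for i, j in rng.integers(0, 50, size=(200, 2)) if i != j]

def log_likelihood(w):
    # P(j preferred over i) = exp(r_j) / (exp(r_i) + exp(r_j)),
    # with linear returns r = feats @ w; second index is preferred.
    r = feats @ w
    return sum(r[j] - np.logaddexp(r[i], r[j]) for i, j in prefs)

# Random-walk Metropolis-Hastings over unit-norm weight vectors.
w = rng.normal(size=16)
w /= np.linalg.norm(w)
ll, samples = log_likelihood(w), []
for _ in range(2000):
    w_new = w + 0.05 * rng.normal(size=16)
    w_new /= np.linalg.norm(w_new)
    ll_new = log_likelihood(w_new)
    if np.log(rng.random()) < ll_new - ll:  # accept/reject step
        w, ll = w_new, ll_new
    samples.append(w)
print(f"collected {len(samples)} posterior samples over reward weights")
```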
no code implementations • 10 Dec 2019 • Daniel S. Brown, Scott Niekum
Bayesian inverse reinforcement learning (IRL) methods are ideal for safe imitation learning, as they allow a learning agent to reason about reward uncertainty and the safety of a learned policy.
2 code implementations • 9 Jul 2019 • Daniel S. Brown, Wonjoon Goo, Scott Niekum
The performance of imitation learning is typically upper-bounded by the performance of the demonstrator.
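A sketch of one way past that bound, in the spirit of this paper: generate rankings automatically by injecting increasing action noise into a cloned policy and ranking lower-noise rollouts above noisier ones, so the ranking loss from earlier needs no extra human labels. Everything below is a stand-in for real policies and dynamics.

```python
# Hedged sketch: automatically-ranked demonstrations via noise injection.
import numpy as np

rng = np.random.default_rng(6)

def bc_policy(state):
    return -0.5 * state  # stand-in for a behavior-cloned policy

def rollout(noise_level, horizon=20):
    state, traj = rng.normal(size=4), []
    for _ in range(horizon):
        action = bc_policy(state) + noise_level * rng.normal(size=4)
        state = state + 0.1 * action
        traj.append(state.copy())
    return np.stack(traj)

# Lower-noise rollouts are assumed better: this yields ranked trajectory
# pairs for a reward-ranking loss, with no additional human labels.
noise_schedule = [0.0, 0.25, 0.5, 1.0]
ranked = [rollout(eps) for eps in noise_schedule]  # best to worst
pairs = [(ranked[j], ranked[i]) for i in range(4) for j in range(i + 1, 4)]
print(f"{len(pairs)} automatically-ranked (worse, better) training pairs")
```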
3 code implementations • 12 Apr 2019 • Daniel S. Brown, Wonjoon Goo, Prabhat Nagarajan, Scott Niekum
A critical flaw of existing inverse reinforcement learning (IRL) methods is their inability to significantly outperform the demonstrator.
2 code implementations • 8 Jan 2019 • Daniel S. Brown, Yuchen Cui, Scott Niekum
Active learning from demonstration allows a robot to query a human for specific types of input to achieve efficient learning.
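One query-selection heuristic in this spirit, sketched below with random stand-ins (not necessarily the paper's criterion): ask the human about the state where samples from the current reward posterior disagree most about which action is best.

```python
# Hedged sketch of active query selection by posterior action disagreement.
import numpy as np

rng = np.random.default_rng(7)
# Q-values under 30 posterior reward samples, for 10 states x 4 actions.
q_values = rng.normal(size=(30, 10, 4))

best_actions = q_values.argmax(axis=2)  # (30, 10) preferred action per sample

def action_disagreement(choices):
    # How far the posterior samples are from unanimous agreement.
    counts = np.bincount(choices, minlength=4)
    return 1.0 - counts.max() / counts.sum()

scores = np.array([action_disagreement(best_actions[:, s]) for s in range(10)])
print(f"query the human at state {scores.argmax()}")
```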
1 code implementation • 20 May 2018 • Daniel S. Brown, Scott Niekum
Inverse reinforcement learning (IRL) infers a reward function from demonstrations, allowing for policy improvement and generalization.
3 code implementations • 3 Jul 2017 • Daniel S. Brown, Scott Niekum
In the field of reinforcement learning, there has been recent progress towards safety and high-confidence bounds on policy performance.