Search Results for author: Erdem Biyik

Found 28 papers, 10 papers with code

ViSaRL: Visual Reinforcement Learning Guided by Human Saliency

no code implementations • 16 Mar 2024 • Anthony Liang, Jesse Thomason, Erdem Biyik

Using ViSaRL to learn visual representations significantly improves the success rate, sample efficiency, and generalization of an RL agent on diverse tasks including DeepMind Control benchmark, robot manipulation in simulation and on a real robot.

Reinforcement Learning (RL) +1

A Generalized Acquisition Function for Preference-based Reward Learning

no code implementations • 9 Mar 2024 • Evan Ellis, Gaurav R. Ghosal, Stuart J. Russell, Anca Dragan, Erdem Biyik

Preference-based reward learning is a popular technique for teaching robots and autonomous systems how a human user wants them to perform a task.

DynaMITE-RL: A Dynamic Model for Improved Temporal Meta-Reinforcement Learning

no code implementations • 25 Feb 2024 • Anthony Liang, Guy Tennenholtz, Chih-Wei Hsu, Yinlam Chow, Erdem Biyik, Craig Boutilier

We introduce DynaMITE-RL, a meta-reinforcement learning (meta-RL) approach to approximate inference in environments where the latent state evolves at varying rates.

Continuous Control, Meta Reinforcement Learning

Batch Active Learning of Reward Functions from Human Preferences

no code implementations • 24 Feb 2024 • Erdem Biyik, Nima Anari, Dorsa Sadigh

Our results suggest that our batch active learning algorithm requires only a few queries that are computed in a short amount of time.

Active Learning, Point Processes

RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback

no code implementations • 6 Feb 2024 • YuFei Wang, Zhanyi Sun, Jesse Zhang, Zhou Xian, Erdem Biyik, David Held, Zackory Erickson

Reward engineering has long been a challenge in Reinforcement Learning (RL) research, as it often requires extensive human effort and iterative processes of trial-and-error to design effective reward functions.

Reinforcement Learning (RL)

Preference Elicitation with Soft Attributes in Interactive Recommendation

no code implementations • 22 Oct 2023 • Erdem Biyik, Fan Yao, Yinlam Chow, Alex Haig, Chih-Wei Hsu, Mohammad Ghavamzadeh, Craig Boutilier

Leveraging concept activation vectors for soft attribute semantics, we develop novel preference elicitation methods that can accommodate soft attributes and bring together both item and attribute-based preference elicitation.

Attribute, Recommendation Systems

Active Reward Learning from Online Preferences

no code implementations • 27 Feb 2023 • Vivek Myers, Erdem Biyik, Dorsa Sadigh

Robot policies need to adapt to human preferences and/or new environments.

Assistive Teaching of Motor Control Tasks to Humans

1 code implementation • 25 Nov 2022 • Megha Srivastava, Erdem Biyik, Suvir Mirchandani, Noah Goodman, Dorsa Sadigh

In this paper, we focus on the problem of assistive teaching of motor control tasks such as parking a car or landing an aircraft.

Reinforcement Learning (RL)

Learning Preferences for Interactive Autonomy

1 code implementation • 19 Oct 2022 • Erdem Biyik

To this end, we first propose various forms of comparative feedback, e.g., pairwise comparisons, best-of-many choices, rankings, and scaled comparisons; we then describe how a robot can use these various forms of human feedback to infer a reward function, which may be parametric or non-parametric.

Active Learning, Autonomous Driving +2
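The reward-inference setup described above can be illustrated with a minimal sketch of learning from pairwise comparisons: assuming a reward that is linear in hand-designed trajectory features, preferences follow a Bradley-Terry model and the weights can be fit by gradient ascent on the log-likelihood. The feature dimension, synthetic data, and learning rate here are illustrative assumptions, not the thesis's implementation.

```python
import numpy as np

def fit_reward_from_pairs(features_a, features_b, prefs, lr=0.1, steps=500):
    """Fit linear reward weights w from pairwise trajectory comparisons.

    prefs[i] = 1.0 if trajectory A was preferred in query i, else 0.0.
    Bradley-Terry model: P(A preferred) = sigmoid(w @ (phi_A - phi_B)).
    """
    diff = features_a - features_b           # (n_queries, d) feature gaps
    w = np.zeros(diff.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-diff @ w))  # predicted P(A preferred)
        grad = diff.T @ (prefs - p)          # log-likelihood gradient
        w += lr * grad / len(prefs)
    return w

# Synthetic demo: a hidden "true" reward generates noiseless preferences.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
phi_a = rng.normal(size=(200, 3))
phi_b = rng.normal(size=(200, 3))
prefs = ((phi_a - phi_b) @ w_true > 0).astype(float)

w_hat = fit_reward_from_pairs(phi_a, phi_b, prefs)
# The learned weights should align with w_true up to scale.
cosine = w_hat @ w_true / (np.linalg.norm(w_hat) * np.linalg.norm(w_true))
```

Since preferences only constrain the reward's direction, the recovered weights match the true ones up to a positive scale factor.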

Leveraging Smooth Attention Prior for Multi-Agent Trajectory Prediction

no code implementations • 8 Mar 2022 • Zhangjie Cao, Erdem Biyik, Guy Rosman, Dorsa Sadigh

At a certain time, to forecast a reasonable future trajectory, each agent needs to pay attention to the interactions with only a small group of most relevant agents instead of unnecessarily paying attention to all the other agents.

Trajectory Prediction

Partner-Aware Algorithms in Decentralized Cooperative Bandit Teams

no code implementations • 2 Oct 2021 • Erdem Biyik, Anusha Lalitha, Rajarshi Saha, Andrea Goldsmith, Dorsa Sadigh

Our results show that the proposed partner-aware strategy outperforms other known methods, and our human subject studies suggest humans prefer to collaborate with AI agents implementing our partner-aware strategy.

Decision Making

Learning Reward Functions from Scale Feedback

1 code implementation • 1 Oct 2021 • Nils Wilde, Erdem Biyik, Dorsa Sadigh, Stephen L. Smith

Today's robots are increasingly interacting with people and need to efficiently learn inexperienced users' preferences.

Learning Multimodal Rewards from Rankings

no code implementations • 27 Sep 2021 • Vivek Myers, Erdem Biyik, Nima Anari, Dorsa Sadigh

However, expert feedback is often assumed to be drawn from an underlying unimodal reward function.

APReL: A Library for Active Preference-based Reward Learning Algorithms

1 code implementation • 16 Aug 2021 • Erdem Biyik, Aditi Talati, Dorsa Sadigh

Reward learning is a fundamental problem in human-robot interaction: it is what allows robots to operate in alignment with what their human users want.

Emergent Prosociality in Multi-Agent Games Through Gifting

no code implementations • 13 May 2021 • Woodrow Z. Wang, Mark Beliaev, Erdem Biyik, Daniel A. Lazar, Ramtin Pedarsani, Dorsa Sadigh

Coordination is often critical to forming prosocial behaviors -- behaviors that increase the overall sum of rewards received by all agents in a multi-agent game.

Incentivizing Routing Choices for Safe and Efficient Transportation in the Face of the COVID-19 Pandemic

no code implementations • 28 Dec 2020 • Mark Beliaev, Erdem Biyik, Daniel A. Lazar, Woodrow Z. Wang, Dorsa Sadigh, Ramtin Pedarsani

In turn, significant increases in traffic congestion are expected, since people are likely to prefer using their own vehicles or taxis as opposed to riskier and more crowded options such as the railway.

Multi-Agent Safe Planning with Gaussian Processes

no code implementations • 10 Aug 2020 • Zheqing Zhu, Erdem Biyik, Dorsa Sadigh

Multi-agent safe systems have become an increasingly important area of study, as multiple AI-powered systems now commonly operate together.

Gaussian Processes

Reinforcement Learning based Control of Imitative Policies for Near-Accident Driving

1 code implementation • 1 Jul 2020 • Zhangjie Cao, Erdem Biyik, Woodrow Z. Wang, Allan Raventos, Adrien Gaidon, Guy Rosman, Dorsa Sadigh

To address driving in near-accident scenarios, we propose a hierarchical reinforcement and imitation learning (H-ReIL) approach that consists of low-level policies learned by IL for discrete driving modes, and a high-level policy learned by RL that switches between different driving modes.

Autonomous Driving, Imitation Learning +2
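The hierarchical structure described in the abstract — low-level per-mode policies selected by a high-level switching policy — can be sketched in a few lines. The mode policies and the switching rule below are stand-in stubs for illustration (a distance threshold in place of a learned RL policy), not the paper's trained models.

```python
import numpy as np

class HierarchicalAgent:
    """Sketch of H-ReIL-style control: a high-level policy picks a
    discrete driving mode; the chosen low-level policy emits the action.
    """
    def __init__(self, mode_policies, high_level_policy):
        self.mode_policies = mode_policies   # e.g., learned by IL per mode
        self.high_level = high_level_policy  # e.g., learned by RL

    def act(self, obs):
        mode = self.high_level(obs)            # discrete mode choice
        return self.mode_policies[mode](obs)   # low-level action

# Stand-in mode policies: "timid" brakes, "aggressive" keeps speed.
timid = lambda obs: -1.0
aggressive = lambda obs: 1.0
# Stand-in high-level rule: switch to timid when the gap ahead is small
# (obs[0] = distance to the nearest obstacle, a hypothetical encoding).
switcher = lambda obs: 0 if obs[0] < 5.0 else 1

agent = HierarchicalAgent([timid, aggressive], switcher)
a_near = agent.act(np.array([2.0]))   # near obstacle -> timid mode
a_far = agent.act(np.array([20.0]))   # clear road -> aggressive mode
```

The design point is the separation of concerns: each low-level policy only has to be competent within its mode, while the high-level policy only has to decide when to switch.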

Learning Reward Functions from Diverse Sources of Human Feedback: Optimally Integrating Demonstrations and Preferences

no code implementations • 24 Jun 2020 • Erdem Biyik, Dylan P. Losey, Malayandi Palan, Nicholas C. Landolfi, Gleb Shevchuk, Dorsa Sadigh

As designing reward functions can be extremely challenging, a more promising approach is to directly learn reward functions from human teachers.

Active Preference-Based Gaussian Process Regression for Reward Learning

1 code implementation • 6 May 2020 • Erdem Biyik, Nicolas Huynh, Mykel J. Kochenderfer, Dorsa Sadigh

Our results in simulations and a user study suggest that our approach can efficiently learn expressive reward functions for robotics tasks.

regression

When Humans Aren't Optimal: Robots that Collaborate with Risk-Aware Humans

no code implementations • 13 Jan 2020 • Minae Kwon, Erdem Biyik, Aditi Talati, Karan Bhasin, Dylan P. Losey, Dorsa Sadigh

Overall, we extend existing rational human models so that collaborative robots can anticipate and plan around suboptimal human behavior during HRI.

Batch Active Learning Using Determinantal Point Processes

1 code implementation • 19 Jun 2019 • Erdem Biyik, Kenneth Wang, Nima Anari, Dorsa Sadigh

While active learning methods attempt to tackle this issue by labeling only the data samples that give high information, they generally suffer from large computational costs and are impractical in settings where data can be collected in parallel.

Active Learning, Point Processes
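The role DPPs play in this setting — selecting a batch that is informative and mutually diverse — can be illustrated with a minimal greedy sketch (an illustration of the general technique, not the paper's implementation): grow the batch by always adding the candidate that most increases the determinant of the kernel submatrix, under the standard quality-diversity kernel decomposition. The toy points and quality scores are assumptions for the demo.

```python
import numpy as np

def greedy_dpp_batch(features, quality, batch_size, bandwidth=1.0):
    """Greedily select a diverse, high-quality batch of indices.

    Kernel: L[i, j] = q_i * q_j * exp(-||x_i - x_j||^2 / bandwidth).
    det(L_S) is large when selected items are high-quality AND dissimilar,
    so maximizing it trades off informativeness against redundancy.
    """
    sq = np.sum(features**2, axis=1)
    dists = sq[:, None] + sq[None, :] - 2 * features @ features.T
    L = np.outer(quality, quality) * np.exp(-dists / bandwidth)

    selected = []
    for _ in range(batch_size):
        best, best_det = None, -np.inf
        for i in range(len(features)):
            if i in selected:
                continue
            idx = selected + [i]
            det = np.linalg.det(L[np.ix_(idx, idx)])
            if det > best_det:
                best, best_det = i, det
        selected.append(best)
    return selected

# Demo: two tight clusters; a batch of 2 should span both clusters,
# since two near-duplicate points make det(L_S) collapse toward zero.
pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
batch = greedy_dpp_batch(pts, quality=np.ones(4), batch_size=2)
```

The greedy loop is the simple O(k·n) version of the idea; it avoids enumerating all batches while still penalizing near-duplicate queries.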

Batch Active Preference-Based Learning of Reward Functions

1 code implementation • 10 Oct 2018 • Erdem Biyik, Dorsa Sadigh

Data generation and labeling are usually an expensive part of learning for robotics.

Active Learning
