Search Results for author: Peter Stone

Found 110 papers, 34 papers with code

Behavioral Cloning from Observation

5 code implementations • 4 May 2018 • Faraz Torabi, Garrett Warnell, Peter Stone

In this work, we propose a two-phase, autonomous imitation learning technique called behavioral cloning from observation (BCO), that aims to provide improved performance with respect to both of these aspects.

Imitation Learning

2,534

Deep Reinforcement Learning in Parameterized Action Space

7 code implementations • 13 Nov 2015 • Matthew Hausknecht, Peter Stone

Recent work has shown that deep neural networks are capable of approximating both value functions and policies in reinforcement learning domains featuring continuous state and action spaces.

reinforcement-learning Reinforcement Learning (RL)

2,534

Deep Recurrent Q-Learning for Partially Observable MDPs

5 code implementations • 23 Jul 2015 • Matthew Hausknecht, Peter Stone

Deep Reinforcement Learning has yielded proficient controllers for complex tasks.

Atari Games OpenAI Gym +1

579

LLM+P: Empowering Large Language Models with Optimal Planning Proficiency

1 code implementation • 22 Apr 2023 • Bo Liu, Yuqian Jiang, Xiaohan Zhang, Qiang Liu, Shiqi Zhang, Joydeep Biswas, Peter Stone

LLM+P takes in a natural language description of a planning problem, then returns a correct (or optimal) plan for solving that problem in natural language.

Zero-shot Generalization

324

Scalable Training of Artificial Neural Networks with Adaptive Sparse Connectivity inspired by Network Science

2 code implementations • 15 Jul 2017 • Decebal Constantin Mocanu, Elena Mocanu, Peter Stone, Phuong H. Nguyen, Madeleine Gibescu, Antonio Liotta

Through the success of deep learning in various domains, artificial neural networks are currently among the most used artificial intelligence methods.

236

Conflict-Averse Gradient Descent for Multi-task Learning

3 code implementations • NeurIPS 2021 • Bo Liu, Xingchao Liu, Xiaojie Jin, Peter Stone, Qiang Liu

The goal of multi-task learning is to enable more efficient learning than single task learning by sharing model structures for a diverse set of tasks.

Multi-Task Learning

184

COOPERNAUT: End-to-End Driving with Cooperative Perception for Networked Vehicles

1 code implementation • CVPR 2022 • Jiaxun Cui, Hang Qiu, Dian Chen, Peter Stone, Yuke Zhu

To evaluate our model, we develop AutoCastSim, a network-augmented driving simulation framework with example accident-prone scenarios.

Autonomous Driving

FAMO: Fast Adaptive Multitask Optimization

1 code implementation • NeurIPS 2023 • Bo Liu, Yihao Feng, Peter Stone, Qiang Liu

One of the grand enduring goals of AI is to create generalist agents that can learn multiple different tasks from diverse data via multitask learning (MTL).

Computational Efficiency

Reinforcement Learning for Optimization of COVID-19 Mitigation policies

1 code implementation • 20 Oct 2020 • Varun Kompella, Roberto Capobianco, Stacy Jong, Jonathan Browne, Spencer Fox, Lauren Meyers, Peter Wurman, Peter Stone

The year 2020 has seen the COVID-19 virus lead to one of the worst global pandemics in history.

reinforcement-learning Reinforcement Learning (RL)

Metric Residual Networks for Sample Efficient Goal-Conditioned Reinforcement Learning

2 code implementations • 17 Aug 2022 • Bo Liu, Yihao Feng, Qiang Liu, Peter Stone

Furthermore, we introduce the metric residual network (MRN) that deliberately decomposes the action-value function Q(s, a, g) into the negated summation of a metric plus a residual asymmetric component.
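
As a rough sketch of the decomposition described above (the symbol names $d_{\mathrm{sym}}$, $d_{\mathrm{asym}}$, $z_{sa}$, and $z_g$ are illustrative assumptions, not taken from this listing), the action-value function could be written as:

```latex
Q(s, a, g) = -\bigl( d_{\mathrm{sym}}(z_{sa}, z_g) + d_{\mathrm{asym}}(z_{sa}, z_g) \bigr)
```

Here $d_{\mathrm{sym}}$ stands for the metric term and $d_{\mathrm{asym}}$ for the residual asymmetric term, each evaluated on assumed embeddings $z_{sa}$ of the state-action pair and $z_g$ of the goal.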

reinforcement-learning Reinforcement Learning (RL)

Generative Adversarial Imitation from Observation

1 code implementation • 17 Jul 2018 • Faraz Torabi, Garrett Warnell, Peter Stone

Imitation from observation (IfO) is the problem of learning directly from state-only demonstrations without having access to the demonstrator's actions.

Imitation Learning

Firefly Neural Architecture Descent: a General Approach for Growing Neural Networks

1 code implementation • NeurIPS 2020 • Lemeng Wu, Bo Liu, Peter Stone, Qiang Liu

We propose firefly neural architecture descent, a general framework for progressively and dynamically growing neural networks to jointly optimize the networks' parameters and architectures.

Continual Learning Image Classification +1

Benchmarking Reinforcement Learning Techniques for Autonomous Navigation

1 code implementation • 10 Oct 2022 • Zifan Xu, Bo Liu, Xuesu Xiao, Anirudh Nair, Peter Stone

Deep reinforcement learning (RL) has brought many successes for autonomous robot navigation.

Autonomous Navigation Benchmarking +3

Causal Dynamics Learning for Task-Independent State Abstraction

1 code implementation • 27 Jun 2022 • Zizhao Wang, Xuesu Xiao, Zifan Xu, Yuke Zhu, Peter Stone

Learning dynamics models accurately is an important goal for Model-Based Reinforcement Learning (MBRL), but most MBRL methods learn a dense dynamics model which is vulnerable to spurious correlations and therefore generalizes poorly to unseen states.

Model-based Reinforcement Learning

Improving Grounded Natural Language Understanding through Human-Robot Dialog

1 code implementation • 1 Mar 2019 • Jesse Thomason, Aishwarya Padmakumar, Jivko Sinapov, Nick Walker, Yuqian Jiang, Harel Yedidsion, Justin Hart, Peter Stone, Raymond J. Mooney

Natural language understanding for robotics can require substantial domain- and platform-specific engineering.

Natural Language Understanding

Coach-Player Multi-Agent Reinforcement Learning for Dynamic Team Composition

1 code implementation • 18 May 2021 • Bo Liu, Qiang Liu, Peter Stone, Animesh Garg, Yuke Zhu, Animashree Anandkumar

Specifically, we 1) adopt the attention mechanism for both the coach and the players; 2) propose a variational objective to regularize learning; and 3) design an adaptive communication method to let the coach decide when to communicate with the players.

Multi-agent Reinforcement Learning reinforcement-learning +3

Dynamic Sparse Training for Deep Reinforcement Learning

1 code implementation • 8 Jun 2021 • Ghada Sokar, Elena Mocanu, Decebal Constantin Mocanu, Mykola Pechenizkiy, Peter Stone

In this paper, we introduce for the first time a dynamic sparse training approach for deep reinforcement learning to accelerate the training process.

Continuous Control Decision Making +3

Continual Learning and Private Unlearning

1 code implementation • 24 Mar 2022 • Bo Liu, Qiang Liu, Peter Stone

As intelligent agents become autonomous over longer periods of time, they may eventually become lifelong counterparts to specific people.

Continual Learning

Deep TAMER: Interactive Agent Shaping in High-Dimensional State Spaces

2 code implementations • 28 Sep 2017 • Garrett Warnell, Nicholas Waytowich, Vernon Lawhern, Peter Stone

While recent advances in deep reinforcement learning have allowed autonomous learning agents to succeed at a variety of complex tasks, existing algorithms generally require a lot of training data.

reinforcement-learning Reinforcement Learning (RL) +1

The EMPATHIC Framework for Task Learning from Implicit Human Feedback

1 code implementation • 28 Sep 2020 • Yuchen Cui, Qiping Zhang, Alessandro Allievi, Peter Stone, Scott Niekum, W. Bradley Knox

We train a deep neural network on this data and demonstrate its ability to (1) infer relative reward ranking of events in the training task from prerecorded human facial reactions; (2) improve the policy of an agent in the training task using live human facial reactions; and (3) transfer to a novel domain in which it evaluates robot manipulation trajectories.

Human-Computer Interaction Robotics

Importance Sampling Policy Evaluation with an Estimated Behavior Policy

1 code implementation • 4 Jun 2018 • Josiah P. Hanna, Scott Niekum, Peter Stone

We find that this estimator often lowers the mean squared error of off-policy evaluation compared to importance sampling with the true behavior policy or using a behavior policy that is estimated from a separate data set.
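
A minimal sketch of the idea, assuming a one-step (bandit-style) setting and a count-based estimate of the behavior policy; the data, policies, and function names below are hypothetical and are not taken from the paper:

```python
from collections import Counter


def is_estimate(data, target_policy, behavior_policy):
    """Ordinary importance-sampling estimate of the target policy's expected reward."""
    weighted = [target_policy[a] / behavior_policy[a] * r for a, r in data]
    return sum(weighted) / len(weighted)


def estimate_behavior_policy(data):
    """Count-based maximum-likelihood estimate computed from the same data set."""
    counts = Counter(a for a, _ in data)
    n = len(data)
    return {a: c / n for a, c in counts.items()}


# Hypothetical (action, reward) samples drawn under a uniform behavior policy.
data = [("left", 1.0), ("left", 1.0), ("left", 1.0), ("right", 0.0)]
true_behavior = {"left": 0.5, "right": 0.5}
target = {"left": 0.9, "right": 0.1}

with_true = is_estimate(data, target, true_behavior)
with_estimated = is_estimate(data, target, estimate_behavior_policy(data))
print(with_true, with_estimated)
```

In this toy batch "left" happens to be over-sampled relative to the true behavior policy, so the weights from the estimated policy correct for the sampling noise and land closer to the target policy's true expected reward of 0.9.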

Off-policy evaluation

Deterministic Implementations for Reproducibility in Deep Reinforcement Learning

1 code implementation • 15 Sep 2018 • Prabhat Nagarajan, Garrett Warnell, Peter Stone

One by one, we then allow individual sources of nondeterminism to affect our otherwise deterministic implementation, and measure the impact of each source on the variance in performance.

Q-Learning reinforcement-learning +1

Scalable Multiagent Driving Policies For Reducing Traffic Congestion

1 code implementation • 26 Feb 2021 • Jiaxun Cui, William Macke, Harel Yedidsion, Daniel Urieli, Peter Stone

Next, we propose a modular transfer reinforcement learning approach, and use it to scale up a multiagent driving policy to outperform human-like traffic and existing approaches in a simulated realistic scenario, which is an order of magnitude larger than past scenarios (hundreds instead of tens of vehicles).

Transfer Reinforcement Learning

DM$^2$: Decentralized Multi-Agent Reinforcement Learning for Distribution Matching

1 code implementation • 1 Jun 2022 • Caroline Wang, Ishan Durugkar, Elad Liebman, Peter Stone

The theoretical analysis shows that under certain conditions, each agent minimizing its individual distribution mismatch allows the convergence to the joint policy that generated the target distribution.

Multi-agent Reinforcement Learning reinforcement-learning +2

Lucid Dreaming for Experience Replay: Refreshing Past States with the Current Policy

1 code implementation • 29 Sep 2020 • Yunshu Du, Garrett Warnell, Assefaw Gebremedhin, Peter Stone, Matthew E. Taylor

In this work, we introduce Lucid Dreaming for Experience Replay (LiDER), a conceptually new framework that allows replay experiences to be refreshed by leveraging the agent's current policy.

Atari Games Reinforcement Learning (RL)

t-DGR: A Trajectory-Based Deep Generative Replay Method for Continual Learning in Decision Making

1 code implementation • 4 Jan 2024 • William Yue, Bo Liu, Peter Stone

Deep generative replay has emerged as a promising approach for continual learning in decision-making tasks.

Continual Learning Decision Making

Adversarial Intrinsic Motivation for Reinforcement Learning

1 code implementation • NeurIPS 2021 • Ishan Durugkar, Mauricio Tec, Scott Niekum, Peter Stone

In this paper, we investigate whether one such objective, the Wasserstein-1 distance between a policy's state visitation distribution and a target distribution, can be utilized effectively for reinforcement learning (RL) tasks.
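
For intuition about the distance itself, here is a small sketch that computes the Wasserstein-1 distance between two distributions over states laid out on a one-dimensional chain with unit spacing. The chain layout and the example distributions are assumptions made for illustration; this is not the paper's training objective or estimator.

```python
from itertools import accumulate


def wasserstein_1(p, q):
    """W1 between two distributions on the integer points 0..n-1 (unit spacing).

    In one dimension, W1 equals the summed absolute difference of the two CDFs.
    """
    if len(p) != len(q):
        raise ValueError("distributions must be defined over the same states")
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


# Hypothetical state-visitation distribution and goal-centered target distribution.
visitation = [0.5, 0.5, 0.0, 0.0]
target = [0.0, 0.0, 0.0, 1.0]

distance = wasserstein_1(visitation, target)
print(distance)
```

The value shrinks as the visitation mass moves toward the target state, which is what makes the distance usable as a progress signal.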

Multi-Goal Reinforcement Learning reinforcement-learning +1

Learning a Robust Multiagent Driving Policy for Traffic Congestion Reduction

1 code implementation • 3 Dec 2021 • Yulin Zhang, William Macke, Jiaxun Cui, Daniel Urieli, Peter Stone

This article establishes for the first time that a multiagent driving policy can be trained in such a way that it generalizes to different traffic flows, AV penetration, and road geometries, including on multi-lane roads.

Autonomous Vehicles

Multistep Inverse Is Not All You Need

1 code implementation • 18 Mar 2024 • Alexander Levine, Peter Stone, Amy Zhang

In this work, we consider the Ex-BMDP model, first proposed by Efroni et al. (2022), which formalizes control problems where observations can be factorized into an action-dependent latent state which evolves deterministically, and action-independent time-correlated noise.

A Scavenger Hunt for Service Robots

1 code implementation • 9 Mar 2021 • Harel Yedidsion, Jennifer Suriadinata, Zifan Xu, Stefan Debruyn, Peter Stone

In this problem, the goal is to find a set of objects as quickly as possible, given probability distributions of where they may be found.

Reinforcement Learning (RL)

Data-Efficient Policy Evaluation Through Behavior Policy Search

1 code implementation • ICML 2017 • Josiah P. Hanna, Philip S. Thomas, Peter Stone, Scott Niekum

The standard unbiased technique for evaluating a policy is to deploy the policy and observe its performance.

Learning Curriculum Policies for Reinforcement Learning

1 code implementation • 1 Dec 2018 • Sanmit Narvekar, Peter Stone

Curriculum learning in reinforcement learning is a training methodology that seeks to speed up learning of a difficult target task, by first training on a series of simpler tasks and transferring the knowledge acquired to the target task.

reinforcement-learning Reinforcement Learning (RL) +1

Learning Optimal Advantage from Preferences and Mistaking it for Reward

1 code implementation • 3 Oct 2023 • W. Bradley Knox, Stephane Hatgis-Kessell, Sigurdur Orn Adalgeirsson, Serena Booth, Anca Dragan, Peter Stone, Scott Niekum

Most recent work assumes that human preferences are generated based only upon the reward accrued within those segments, or their partial return.

Task Planning in Robotics: an Empirical Comparison of PDDL-based and ASP-based Systems

no code implementations • 23 Apr 2018 • Yuqian Jiang, Shiqi Zhang, Piyush Khandelwal, Peter Stone

PDDL is designed for task planning, and PDDL-based planners are widely used for a variety of planning problems.

Robot Task Planning

Autonomous Agents Modelling Other Agents: A Comprehensive Survey and Open Problems

no code implementations • 23 Sep 2017 • Stefano V. Albrecht, Peter Stone

Much research in artificial intelligence is concerned with the development of autonomous agents that can interact effectively with other agents.

Traffic Optimization For a Mixture of Self-interested and Compliant Agents

no code implementations • 27 Sep 2017 • Guni Sharon, Michael Albert, Tarun Rambha, Stephen Boyles, Peter Stone

This paper focuses on two commonly used path assignment policies for agents traversing a congested network: self-interested routing, and system-optimum routing.

Bootstrapping with Models: Confidence Intervals for Off-Policy Evaluation

no code implementations • 20 Jun 2016 • Josiah P. Hanna, Peter Stone, Scott Niekum

In this context, we propose two bootstrapping off-policy evaluation methods which use learned MDP transition models in order to estimate lower confidence bounds on policy performance with limited data in both continuous and discrete state spaces.

Off-policy evaluation

Online Contrastive Divergence with Generative Replay: Experience Replay without Storing Data

no code implementations • 18 Oct 2016 • Decebal Constantin Mocanu, Maria Torres Vega, Eric Eaton, Peter Stone, Antonio Liotta

Conceived in the early 1990s, Experience Replay (ER) has been shown to be a successful mechanism to allow online learning algorithms to reuse past experiences.

reinforcement-learning Reinforcement Learning (RL)

Representative Selection in Non Metric Datasets

no code implementations • 26 Feb 2015 • Elad Liebman, Benny Chor, Peter Stone

This paper considers the problem of representative selection: choosing a subset of data points from a dataset that best represents its overall set of elements.

Clustering

DJ-MC: A Reinforcement-Learning Agent for Music Playlist Recommendation

no code implementations • 9 Jan 2014 • Elad Liebman, Maytal Saar-Tsechansky, Peter Stone

In this work we present DJ-MC, a novel reinforcement-learning framework for music recommendation that does not recommend songs individually but rather song sequences, or playlists, based on a model of preferences for both songs and song transitions.

Music Recommendation Recommendation Systems +2

A Century Long Commitment to Assessing Artificial Intelligence and its Impact on Society

no code implementations • 23 Aug 2018 • Barbara J. Grosz, Peter Stone

In September 2016, Stanford's "One Hundred Year Study on Artificial Intelligence" project (AI100) issued the first report of its planned long-term periodic assessment of artificial intelligence (AI) and its impact on society.

Learning a Policy for Opportunistic Active Learning

no code implementations • EMNLP 2018 • Aishwarya Padmakumar, Peter Stone, Raymond J. Mooney

Active learning identifies data points to label that are expected to be the most useful in improving a supervised model.

Active Learning Object +3

Robot Representation and Reasoning with Knowledge from Reinforcement Learning

no code implementations • 28 Sep 2018 • Keting Lu, Shiqi Zhang, Peter Stone, Xiaoping Chen

In this work, we integrate logical-probabilistic KRR with model-based RL, enabling agents to simultaneously reason with declarative knowledge and learn from interaction experiences.

reinforcement-learning Reinforcement Learning (RL)

Integrating Task-Motion Planning with Reinforcement Learning for Robust Decision Making in Mobile Robots

no code implementations • 21 Nov 2018 • Yuqian Jiang, Fangkai Yang, Shiqi Zhang, Peter Stone

In the outer loop, the plan is executed, and the robot learns from the execution experience via model-free RL, to further improve its task-motion plans.

Decision Making Motion Planning +2

HR-TD: A Regularized TD Method to Avoid Over-Generalization

no code implementations • ICLR 2019 • Ishan Durugkar, Bo Liu, Peter Stone

Temporal Difference learning with function approximation has been widely used recently and has led to several successful results.

TD Learning with Constrained Gradients

no code implementations • ICLR 2018 • Ishan Durugkar, Peter Stone

In this work we propose a constraint on the TD update that minimizes change to the target values.

Q-Learning

Escape Room: A Configurable Testbed for Hierarchical Reinforcement Learning

no code implementations • 22 Dec 2018 • Jacob Menashe, Peter Stone

We show that the ERD presents a suite of challenges with scalable difficulty to provide a smooth learning gradient from Taxi to the Arcade Learning Environment.

Hierarchical Reinforcement Learning Montezuma's Revenge +2

Imitation Learning from Video by Leveraging Proprioception

no code implementations • 22 May 2019 • Faraz Torabi, Garrett Warnell, Peter Stone

Classically, imitation learning algorithms have been developed for idealized situations, e.g., the demonstrations are often required to be collected in the exact same environment and usually include the demonstrator's actions.

Imitation Learning

Recent Advances in Imitation Learning from Observation

no code implementations • 30 May 2019 • Faraz Torabi, Garrett Warnell, Peter Stone

Imitation learning is the process by which one agent tries to learn how to perform a certain task using information generated by another, often more-expert agent performing that same task.

Imitation Learning

Sample-efficient Adversarial Imitation Learning from Observation

no code implementations • 18 Jun 2019 • Faraz Torabi, Sean Geiger, Garrett Warnell, Peter Stone

We test our algorithm on an imitation task using a physical robot arm and its simulated version in Gazebo, and show the improvement in learning rate and efficiency.

Imitation Learning Reinforcement Learning (RL)

RIDM: Reinforced Inverse Dynamics Modeling for Learning from a Single Observed Demonstration

no code implementations • 18 Jun 2019 • Brahma S. Pavse, Faraz Torabi, Josiah P. Hanna, Garrett Warnell, Peter Stone

Augmenting reinforcement learning with imitation learning is often hailed as a method by which to improve upon learning from scratch.

Imitation Learning reinforcement-learning +1

Solving Service Robot Tasks: UT Austin Villa@Home 2019 Team Report

no code implementations • 14 Sep 2019 • Rishi Shah, Yuqian Jiang, Haresh Karnan, Gilberto Briscoe-Martinez, Dominick Mulder, Ryan Gupta, Rachel Schlossman, Marika Murphy, Justin W. Hart, Luis Sentis, Peter Stone

RoboCup@Home is an international robotics competition based on domestic tasks requiring autonomous capabilities pertaining to a large variety of AI technologies.

Leveraging Human Guidance for Deep Reinforcement Learning Tasks

no code implementations • 21 Sep 2019 • Ruohan Zhang, Faraz Torabi, Lin Guan, Dana H. Ballard, Peter Stone

Reinforcement learning agents can learn to solve sequential decision tasks by interacting with the environment.

Imitation Learning reinforcement-learning +1

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

no code implementations • 10 Mar 2020 • Sanmit Narvekar, Bei Peng, Matteo Leonetti, Jivko Sinapov, Matthew E. Taylor, Peter Stone

Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks in which the agent has only limited environmental feedback.

reinforcement-learning Reinforcement Learning (RL) +1

APPLD: Adaptive Planner Parameter Learning from Demonstration

no code implementations • 31 Mar 2020 • Xuesu Xiao, Bo Liu, Garrett Warnell, Jonathan Fink, Peter Stone

Existing autonomous robot navigation systems allow robots to move from one point to another in a collision-free manner.

Robot Navigation

iCORPP: Interleaved Commonsense Reasoning and Probabilistic Planning on Robots

no code implementations • 18 Apr 2020 • Shiqi Zhang, Piyush Khandelwal, Peter Stone

Robot sequential decision-making in the real world is a challenge because it requires the robots to simultaneously reason about the current world state and dynamics, while planning actions to accomplish complex tasks.

Decision Making Management

Learning and Reasoning for Robot Dialog and Navigation Tasks

no code implementations • SIGDIAL (ACL) 2020 • Keting Lu, Shiqi Zhang, Peter Stone, Xiaoping Chen

More interestingly, the robot was able to learn from navigation tasks to improve its dialog strategies.

reinforcement-learning Reinforcement Learning (RL)

Deep R-Learning for Continual Area Sweeping

no code implementations • 31 May 2020 • Rishi Shah, Yuqian Jiang, Justin Hart, Peter Stone

Coverage path planning is a well-studied problem in robotics in which a robot must plan a path that passes through every point in a given area repeatedly, usually with a uniform frequency.

Artificial Musical Intelligence: A Survey

no code implementations • 17 Jun 2020 • Elad Liebman, Peter Stone

Computers have been used to analyze and create music since they were first introduced in the 1950s and 1960s.

BIG-bench Machine Learning Music Recommendation +1

Temporal-Logic-Based Reward Shaping for Continuing Reinforcement Learning Tasks

no code implementations • 3 Jul 2020 • Yuqian Jiang, Sudarshanan Bharadwaj, Bo Wu, Rishi Shah, Ufuk Topcu, Peter Stone

Reward shaping is a common approach for incorporating domain knowledge into reinforcement learning in order to speed up convergence to an optimal policy.

reinforcement-learning Reinforcement Learning (RL)

An Imitation from Observation Approach to Transfer Learning with Dynamics Mismatch

no code implementations • NeurIPS 2020 • Siddharth Desai, Ishan Durugkar, Haresh Karnan, Garrett Warnell, Josiah Hanna, Peter Stone

We examine the problem of transferring a policy learned in a source environment to a target environment with different dynamics, particularly in the case where it is critical to reduce the amount of interaction with the target environment during learning.

Transfer Learning

Reducing Sampling Error in Batch Temporal Difference Learning

no code implementations • ICML 2020 • Brahma Pavse, Ishan Durugkar, Josiah Hanna, Peter Stone

In this batch setting, we show that TD(0) may converge to an inaccurate value function because the update following an action is weighted according to the number of times that action occurred in the batch -- not the true probability of the action under the given policy.
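
The weighting effect described above can be illustrated with a toy sketch. It assumes a single non-terminal state whose episodes end after one step, and it contrasts batch TD(0) with a value that weights each action's mean reward by the policy's true action probabilities. The batch, the policy, and the function names are hypothetical, and the second function is only a point of comparison, not the method proposed in the paper.

```python
def batch_td0_value(batch, alpha=0.5, sweeps=100):
    """Batch TD(0) for one state whose successor is always terminal."""
    v = 0.0
    for _ in range(sweeps):
        # The successor is terminal, so each TD target is just the observed reward.
        td_errors = [r - v for _, r in batch]
        # Apply the averaged update once per sweep over the whole batch.
        v += alpha * sum(td_errors) / len(td_errors)
    return v


def policy_weighted_value(batch, policy):
    """Weight each action's mean observed reward by its true policy probability.

    Assumes every action with nonzero probability appears in the batch.
    """
    value = 0.0
    for action, prob in policy.items():
        rewards = [r for a, r in batch if a == action]
        value += prob * sum(rewards) / len(rewards)
    return value


# Hypothetical batch: the policy is uniform, but "a0" was sampled three times out of four.
policy = {"a0": 0.5, "a1": 0.5}
batch = [("a0", 1.0), ("a0", 1.0), ("a0", 1.0), ("a1", 0.0)]

td_value = batch_td0_value(batch)
weighted_value = policy_weighted_value(batch, policy)
print(td_value, weighted_value)
```

Batch TD(0) settles at the batch average of the rewards, which follows the empirical action frequencies, whereas the true value under the uniform policy in this toy example is 0.5.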

A Coach-Player Framework for Dynamic Team Composition

no code implementations • 1 Jan 2021 • Bo Liu, Qiang Liu, Peter Stone, Animesh Garg, Yuke Zhu, Anima Anandkumar

The performance of our method is comparable to or even better than the setting where all players have a full view of the environment, but no coach.

Zero-shot Generalization

Machine versus Human Attention in Deep Reinforcement Learning Tasks

no code implementations • NeurIPS 2021 • Sihang Guo, Ruohan Zhang, Bo Liu, Yifeng Zhu, Mary Hayhoe, Dana Ballard, Peter Stone

1) How similar are the visual representations learned by RL agents and humans when performing the same task?

Atari Games reinforcement-learning +1

Expected Value of Communication for Planning in Ad Hoc Teamwork

no code implementations • 1 Mar 2021 • William Macke, Reuth Mirsky, Peter Stone

We then present a novel planning algorithm for ad hoc teamwork, determining which query to ask and planning accordingly.

DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation

no code implementations • 31 Mar 2021 • Faraz Torabi, Garrett Warnell, Peter Stone

In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior without access to the control signals generated by the demonstrator.

Imitation Learning Model-based Reinforcement Learning +2

Sequential Online Chore Division for Autonomous Vehicle Convoy Formation

no code implementations • 9 Apr 2021 • Harel Yedidsion, Shani Alkoby, Peter Stone

Chore division is a class of fair division problems in which some undesirable "resource" must be shared among a set of participants, with each participant wanting to get as little as possible.

Skeletal Feature Compensation for Imitation Learning with Embodiment Mismatch

no code implementations • 15 Apr 2021 • Eddy Hudson, Garrett Warnell, Faraz Torabi, Peter Stone

Learning from demonstrations in the wild (e.g., YouTube videos) is a tantalizing goal in imitation learning.

Imitation Learning

Reward (Mis)design for Autonomous Driving

no code implementations • 28 Apr 2021 • W. Bradley Knox, Alessandro Allievi, Holger Banzhaf, Felix Schmitt, Peter Stone

This article considers the problem of diagnosing certain common errors in reward design.

Autonomous Driving reinforcement-learning +1

RAIL: A modular framework for Reinforcement-learning-based Adversarial Imitation Learning

no code implementations • 8 May 2021 • Eddy Hudson, Garrett Warnell, Peter Stone

While Adversarial Imitation Learning (AIL) algorithms have recently led to state-of-the-art results on various imitation learning benchmarks, it is unclear as to what impact various design decisions have on performance.

Imitation Learning OpenAI Gym +2

VOILA: Visual-Observation-Only Imitation Learning for Autonomous Navigation

no code implementations • 19 May 2021 • Haresh Karnan, Garrett Warnell, Xuesu Xiao, Peter Stone

Is imitation learning for vision based autonomous navigation even possible in such scenarios?

Autonomous Navigation Imitation Learning +2

Conflict Avoidance in Social Navigation -- a Survey

no code implementations • 23 Jun 2021 • Reuth Mirsky, Xuesu Xiao, Justin Hart, Peter Stone

This survey aims to bridge this gap by introducing such a common language, using it to survey existing work, and highlighting open problems.

Social Navigation

Recent Advances in Leveraging Human Guidance for Sequential Decision-Making Tasks

no code implementations • 13 Jul 2021 • Ruohan Zhang, Faraz Torabi, Garrett Warnell, Peter Stone

A longstanding goal of artificial intelligence is to create artificial agents capable of learning to perform tasks that require sequential decision making.

Decision Making

Bottom-Up Skill Discovery from Unsegmented Demonstrations for Long-Horizon Robot Manipulation

no code implementations • 28 Sep 2021 • Yifeng Zhu, Peter Stone, Yuke Zhu

From the task structures of multi-task demonstrations, we identify skills based on the recurring patterns and train goal-conditioned sensorimotor policies with hierarchical imitation learning.

Imitation Learning Robot Manipulation

Generalizing Curricula for Reinforcement Learning

no code implementations • ICML Workshop LifelongML 2020 • Sanmit Narvekar, Peter Stone

However, there is structure that can be exploited between tasks and agents, such that knowledge gained developing a curriculum for one task can be reused to speed up creating a curriculum for a new task.

reinforcement-learning Reinforcement Learning (RL)

Real-world challenges for multi-agent reinforcement learning in grid-interactive buildings

no code implementations • 25 Nov 2021 • Kingsley Nweye, Bo Liu, Peter Stone, Zoltan Nagy

Building upon prior research that highlighted the need for standardizing environments for building control research, and inspired by recently introduced challenges for real life reinforcement learning control, here we propose a non-exhaustive set of nine real world challenges for reinforcement learning control in grid-interactive buildings.

Model Predictive Control Multi-agent Reinforcement Learning +2

Adversarial Imitation Learning from Video using a State Observer

no code implementations • 1 Feb 2022 • Haresh Karnan, Garrett Warnell, Faraz Torabi, Peter Stone

The imitation learning research community has recently made significant progress towards the goal of enabling artificial agents to imitate behaviors from video demonstrations alone.

Continuous Control Imitation Learning

Learning a Shield from Catastrophic Action Effects: Never Repeat the Same Mistake

no code implementations • 19 Feb 2022 • Shahaf S. Shperberg, Bo Liu, Peter Stone

When humans make catastrophic mistakes, they are expected to learn never to repeat them, such as a toddler who touches a hot stove and immediately learns never to do so again.

Continual Learning Safe Reinforcement Learning

A Survey of Ad Hoc Teamwork Research

no code implementations • 16 Feb 2022 • Reuth Mirsky, Ignacio Carlucho, Arrasy Rahman, Elliot Fosong, William Macke, Mohan Sridharan, Peter Stone, Stefano V. Albrecht

Ad hoc teamwork is the research problem of designing agents that can collaborate with new teammates without prior coordination.

Socially Compliant Navigation Dataset (SCAND): A Large-Scale Dataset of Demonstrations for Social Navigation

no code implementations • 28 Mar 2022 • Haresh Karnan, Anirudh Nair, Xuesu Xiao, Garrett Warnell, Soeren Pirk, Alexander Toshev, Justin Hart, Joydeep Biswas, Peter Stone

Social navigation is the capability of an autonomous agent, such as a robot, to navigate in a 'socially compliant' manner in the presence of other intelligent agents such as humans.

Imitation Learning Navigate +1

VI-IKD: High-Speed Accurate Off-Road Navigation using Learned Visual-Inertial Inverse Kinodynamics

no code implementations • 30 Mar 2022 • Haresh Karnan, Kavan Singh Sikand, Pranav Atreya, Sadegh Rabiee, Xuesu Xiao, Garrett Warnell, Peter Stone, Joydeep Biswas

In this paper, we hypothesize that to enable accurate high-speed off-road navigation using a learned IKD model, in addition to inertial information from the past, one must also anticipate the kinodynamic interactions of the vehicle with the terrain in the future.

Effective Mutation Rate Adaptation through Group Elite Selection

no code implementations • 11 Apr 2022 • Akarsh Kumar, Bo Liu, Risto Miikkulainen, Peter Stone

GESMR co-evolves a population of solutions and a population of MRs, such that each MR is assigned to a group of solutions.

Evolutionary Algorithms Image Classification

Models of human preference for learning reward functions

no code implementations • 5 Jun 2022 • W. Bradley Knox, Stephane Hatgis-Kessell, Serena Booth, Scott Niekum, Peter Stone, Alessandro Allievi

We empirically show that our proposed regret preference model outperforms the partial return preference model with finite training data in otherwise the same setting.

Decision Making reinforcement-learning

Value Function Decomposition for Iterative Design of Reinforcement Learning Agents

no code implementations • 24 Jun 2022 • James Macglashan, Evan Archer, Alisa Devlic, Takuma Seno, Craig Sherstan, Peter R. Wurman, Peter Stone

These value estimates provide insight into an agent's learning and decision-making process and enable new training methods to mitigate common problems.

Decision Making reinforcement-learning +1

BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach

no code implementations • 19 Sep 2022 • Mao Ye, Bo Liu, Stephen Wright, Peter Stone, Qiang Liu

Bilevel optimization (BO) is useful for solving a variety of important machine learning problems including but not limited to hyperparameter optimization, meta-learning, continual learning, and reinforcement learning.

Bilevel Optimization Continual Learning +3

Task Phasing: Automated Curriculum Learning from Demonstrations

1 code implementation • 20 Oct 2022 • Vaibhav Bajaj, Guni Sharon, Peter Stone

Applying reinforcement learning (RL) to sparse reward domains is notoriously challenging due to insufficient guiding signals.

Reinforcement Learning (RL)

D-Shape: Demonstration-Shaped Reinforcement Learning via Goal Conditioning

no code implementations • 26 Oct 2022 • Caroline Wang, Garrett Warnell, Peter Stone

While combining imitation learning (IL) and reinforcement learning (RL) is a promising way to address poor sample efficiency in autonomous behavior acquisition, methods that do so typically assume that the requisite behavior demonstrations are provided by an expert that behaves optimally with respect to a task reward.

Imitation Learning reinforcement-learning +1

Event Tables for Efficient Experience Replay

no code implementations • 1 Nov 2022 • Varun Kompella, Thomas J. Walsh, Samuel Barrett, Peter Wurman, Peter Stone

Experience replay (ER) is a crucial component of many deep reinforcement learning (RL) systems.

Car Racing reinforcement-learning +1

ABC: Adversarial Behavioral Cloning for Offline Mode-Seeking Imitation Learning

no code implementations • 8 Nov 2022 • Eddy Hudson, Ishan Durugkar, Garrett Warnell, Peter Stone

Given a dataset of expert agent interactions with an environment of interest, a viable method to extract an effective agent policy is to estimate the maximum likelihood policy indicated by this data.

Generative Adversarial Network Imitation Learning

Artificial Intelligence and Life in 2030: The One Hundred Year Study on Artificial Intelligence

no code implementations • 31 Oct 2022 • Peter Stone, Rodney Brooks, Erik Brynjolfsson, Ryan Calo, Oren Etzioni, Greg Hager, Julia Hirschberg, Shivaram Kalyanakrishnan, Ece Kamar, Sarit Kraus, Kevin Leyton-Brown, David Parkes, William Press, AnnaLee Saxenian, Julie Shah, Milind Tambe, Astro Teller

Safe Evaluation For Offline Learning: Are We Ready To Deploy?

no code implementations • 16 Dec 2022 • Hager Radi, Josiah P. Hanna, Peter Stone, Matthew E. Taylor

In our setting, we assume a source of data, which we split into a train-set, to learn an offline policy, and a test-set, to estimate a lower-bound on the offline policy using off-policy evaluation with bootstrapping.

Off-policy evaluation

A Domain-Agnostic Approach for Characterization of Lifelong Learning Systems

no code implementations • 18 Jan 2023 • Megan M. Baker, Alexander New, Mario Aguilar-Simon, Ziad Al-Halah, Sébastien M. R. Arnold, Ese Ben-Iwhiwhu, Andrew P. Brna, Ethan Brooks, Ryan C. Brown, Zachary Daniels, Anurag Daram, Fabien Delattre, Ryan Dellana, Eric Eaton, Haotian Fu, Kristen Grauman, Jesse Hostetler, Shariq Iqbal, Cassandra Kent, Nicholas Ketz, Soheil Kolouri, George Konidaris, Dhireesha Kudithipudi, Erik Learned-Miller, Seungwon Lee, Michael L. Littman, Sandeep Madireddy, Jorge A. Mendez, Eric Q. Nguyen, Christine D. Piatko, Praveen K. Pilly, Aswin Raghavan, Abrar Rahman, Santhosh Kumar Ramakrishnan, Neale Ratzlaff, Andrea Soltoggio, Peter Stone, Indranil Sur, Zhipeng Tang, Saket Tiwari, Kyle Vedder, Felix Wang, Zifan Xu, Angel Yanguas-Gil, Harel Yedidsion, Shangqun Yu, Gautam K. Vallabha

Despite the advancement of machine learning techniques in recent years, state-of-the-art systems lack robustness to "real world" events, where the input distributions and tasks encountered by the deployed systems will not be limited to the original training context, and systems will instead need to adapt to novel distributions and tasks while deployed.

Causal Policy Gradient for Whole-Body Mobile Manipulation

no code implementations • 4 May 2023 • Jiaheng Hu, Peter Stone, Roberto Martín-Martín

Current approaches often segregate tasks into navigation without manipulation and stationary manipulation without locomotion by manually matching parts of the action space to MoMa sub-objectives (e.g., learning base actions for locomotion objectives and learning arm actions for manipulation).

Composing Efficient, Robust Tests for Policy Selection

no code implementations • 12 Jun 2023 • Dustin Morrill, Thomas J. Walsh, Daniel Hernandez, Peter R. Wurman, Peter Stone

Empirical results demonstrate that RPOSST finds a small set of test cases that identify high quality policies in a toy one-shot game, poker datasets, and a high-fidelity racing simulator.

Principles and Guidelines for Evaluating Social Robot Navigation Algorithms

no code implementations • 29 Jun 2023 • Anthony Francis, Claudia Pérez-D'Arpino, Chengshu Li, Fei Xia, Alexandre Alahi, Rachid Alami, Aniket Bera, Abhijat Biswas, Joydeep Biswas, Rohan Chandra, Hao-Tien Lewis Chiang, Michael Everett, Sehoon Ha, Justin Hart, Jonathan P. How, Haresh Karnan, Tsang-Wei Edward Lee, Luis J. Manso, Reuth Mirsky, Sören Pirk, Phani Teja Singamaneni, Peter Stone, Ada V. Taylor, Peter Trautman, Nathan Tsoi, Marynel Vázquez, Xuesu Xiao, Peng Xu, Naoki Yokoyama, Alexander Toshev, Roberto Martín-Martín

A major challenge to deploying robots widely is navigation in human-populated environments, commonly referred to as social robot navigation.

Benchmarking Social Navigation

Minimum Coverage Sets for Training Robust Ad Hoc Teamwork Agents

no code implementations • 18 Aug 2023 • Arrasy Rahman, Jiaxun Cui, Peter Stone

In this work, we first propose that maximizing an AHT agent's robustness requires it to emulate policies in the minimum coverage set (MCS), the set of best-response policies to any partner policies in the environment.

Utilizing Mood-Inducing Background Music in Human-Robot Interaction

no code implementations • 28 Aug 2023 • Elad Liebman, Peter Stone

This research fills this gap by reporting the results of an experiment in which human participants were required to complete a task in the presence of an autonomous agent while listening to background music.

Decision Making

Wait, That Feels Familiar: Learning to Extrapolate Human Preferences for Preference Aligned Path Planning

no code implementations • 18 Sep 2023 • Haresh Karnan, Elvin Yang, Garrett Warnell, Joydeep Biswas, Peter Stone

In this work, we posit that operator preferences for visually novel terrains, which the robot should adhere to, can often be extrapolated from established terrain references within the inertial, proprioceptive, and tactile domain.

Navigate Robot Navigation +1

STERLING: Self-Supervised Terrain Representation Learning from Unconstrained Robot Experience

no code implementations • 26 Sep 2023 • Haresh Karnan, Elvin Yang, Daniel Farkash, Garrett Warnell, Joydeep Biswas, Peter Stone

Terrain awareness, i.e., the ability to identify and distinguish different types of terrain, is a critical ability that robots must have to succeed at autonomous off-road navigation.

Representation Learning Visual Navigation

$f$-Policy Gradients: A General Framework for Goal Conditioned RL using $f$-Divergences

no code implementations • 10 Oct 2023 • Siddhant Agarwal, Ishan Durugkar, Peter Stone, Amy Zhang

We further introduce an entropy-regularized policy optimization objective, that we call $state$-MaxEnt RL (or $s$-MaxEnt RL) as a special case of our objective.

Efficient Exploration Policy Gradient Methods +1

Dobby: A Conversational Service Robot Driven by GPT-4

no code implementations • 10 Oct 2023 • Carson Stark, Bohkyung Chun, Casey Charleston, Varsha Ravi, Luis Pabon, Surya Sunkari, Tarun Mohan, Peter Stone, Justin Hart

This work introduces a robotics platform that embeds a conversational AI agent in an embodied system for natural language understanding and intelligent decision-making for service tasks, integrating task planning and human-like conversation.

Decision Making General Knowledge +3

Learning Generalizable Manipulation Policies with Object-Centric 3D Representations

no code implementations • 22 Oct 2023 • Yifeng Zhu, Zhenyu Jiang, Peter Stone, Yuke Zhu

We introduce GROOT, an imitation learning method for learning robust policies with object-centric and 3D priors.

Imitation Learning Object

ICRA Roboethics Challenge 2023: Intelligent Disobedience in an Elderly Care Home

no code implementations • 15 Nov 2023 • Sveta Paster, Kantwon Rogers, Gordon Briggs, Peter Stone, Reuth Mirsky

With the projected surge in the elderly population, service robots offer a promising avenue to enhance their well-being in elderly care homes.

Latent Skill Discovery for Chain-of-Thought Reasoning

no code implementations • 7 Dec 2023 • Zifan Xu, Haozhu Wang, Dmitriy Bespalov, Peter Stone, Yanjun Qi

Simultaneously, RSD learns a reasoning policy to determine the required reasoning skill for a given question.

Math

Building Minimal and Reusable Causal State Abstractions for Reinforcement Learning

no code implementations • 23 Jan 2024 • Zizhao Wang, Caroline Wang, Xuesu Xiao, Yuke Zhu, Peter Stone

Two desiderata of reinforcement learning (RL) algorithms are the ability to learn from relatively little experience and the ability to learn policies that generalize to a range of problem specifications.

reinforcement-learning Reinforcement Learning (RL)

Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks

no code implementations • 3 Mar 2024 • Ziping Xu, Zifan Xu, Runxuan Jiang, Peter Stone, Ambuj Tewari

Multitask Reinforcement Learning (MTRL) approaches have gained increasing attention for their wide applications in many important Reinforcement Learning (RL) tasks.

reinforcement-learning Reinforcement Learning (RL)

Dexterous Legged Locomotion in Confined 3D Spaces with Reinforcement Learning

no code implementations • 6 Mar 2024 • Zifan Xu, Amir Hossain Raj, Xuesu Xiao, Peter Stone

To address the inefficiency of tracking distant navigation goals, we introduce a hierarchical locomotion controller that combines a classical planner tasked with planning waypoints to reach a faraway global goal location, and an RL-based policy trained to follow these waypoints by generating low-level motion commands.

Navigate reinforcement-learning +1

TeleMoMa: A Modular and Versatile Teleoperation System for Mobile Manipulation

no code implementations • 12 Mar 2024 • Shivin Dass, Wensi Ai, Yuqian Jiang, Samik Singh, Jiaheng Hu, Ruohan Zhang, Peter Stone, Ben Abbatematteo, Roberto Martín-Martín

This problem is more severe in mobile manipulation, where collecting demonstrations is harder than in stationary manipulation due to the lack of available and easy-to-use teleoperation interfaces.

Imitation Learning

Dyna-LfLH: Learning Agile Navigation in Dynamic Environments from Learned Hallucination

no code implementations • 25 Mar 2024 • Saad Abdul Ghani, Zizhao Wang, Peter Stone, Xuesu Xiao

In our new Dynamic Learning from Learned Hallucination (Dyna-LfLH), we design and learn a novel latent distribution and sample dynamic obstacles from it, so the generated training data can be used to learn a motion planner to navigate in dynamic environments.

Hallucination Imitation Learning +2

N-Agent Ad Hoc Teamwork

no code implementations • 16 Apr 2024 • Caroline Wang, Arrasy Rahman, Ishan Durugkar, Elad Liebman, Peter Stone

POAM is a policy gradient, multi-agent reinforcement learning approach to the NAHT problem that enables adaptation to diverse teammate behaviors by learning representations of teammate behaviors.

Autonomous Driving Multi-agent Reinforcement Learning +4
