Search Results for author: Dorsa Sadigh

Found 96 papers, 36 papers with code

Explore until Confident: Efficient Exploration for Embodied Question Answering

no code implementations • 23 Mar 2024 • Allen Z. Ren, Jaden Clark, Anushri Dixit, Masha Itkina, Anirudha Majumdar, Dorsa Sadigh

We consider the problem of Embodied Question Answering (EQA), which refers to settings where an embodied agent such as a robot needs to actively explore an environment to gather information until it is confident about the answer to a question.

Conformal Prediction Efficient Exploration +3

Efficient Data Collection for Robotic Manipulation via Compositional Generalization

no code implementations • 8 Mar 2024 • Jensen Gao, Annie Xie, Ted Xiao, Chelsea Finn, Dorsa Sadigh

Recent works on large-scale robotic data collection typically vary a wide range of environmental factors during data collection, such as object types and table textures.

Imitation Learning

RT-H: Action Hierarchies Using Language

no code implementations • 4 Mar 2024 • Suneel Belkhale, Tianli Ding, Ted Xiao, Pierre Sermanet, Quan Vuong, Jonathan Tompson, Yevgen Chebotar, Debidatta Dwibedi, Dorsa Sadigh

Predicting these language motions as an intermediate step between tasks and actions forces the policy to learn the shared structure of low-level motions across seemingly disparate tasks.

Imitation Learning

Batch Active Learning of Reward Functions from Human Preferences

no code implementations • 24 Feb 2024 • Erdem Biyik, Nima Anari, Dorsa Sadigh

Our results suggest that our batch active learning algorithm requires only a few queries that are computed in a short amount of time.

Active Learning Point Processes

Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models

2 code implementations • 12 Feb 2024 • Siddharth Karamcheti, Suraj Nair, Ashwin Balakrishna, Percy Liang, Thomas Kollar, Dorsa Sadigh

Visually-conditioned language models (VLMs) have seen growing adoption in applications such as visual dialogue, scene understanding, and robotic task planning; adoption that has fueled a wealth of new models such as LLaVa, InstructBLIP, and PaLI-3.

Hallucination Object Localization +3

213

AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents

no code implementations • 23 Jan 2024 • Michael Ahn, Debidatta Dwibedi, Chelsea Finn, Montse Gonzalez Arenas, Keerthana Gopalakrishnan, Karol Hausman, Brian Ichter, Alex Irpan, Nikhil Joshi, Ryan Julian, Sean Kirmani, Edward Lee, Sergey Levine, Yao Lu, Isabel Leal, Sharath Maddineni, Kanishka Rao, Dorsa Sadigh, Pannag Sanketi, Pierre Sermanet, Quan Vuong, Stefan Welker, Fei Xia, Ted Xiao, Peng Xu, Steve Xu, Zhuo Xu

We experimentally show that such "in-the-wild" data collected by AutoRT is significantly more diverse, and that AutoRT's use of LLMs allows for instruction-following data collection robots that can align to human preferences.

Instruction Following Scene Understanding

SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities

no code implementations • 22 Jan 2024 • Boyuan Chen, Zhuo Xu, Sean Kirmani, Brian Ichter, Danny Driess, Pete Florence, Dorsa Sadigh, Leonidas Guibas, Fei Xia

By training a VLM on such data, we significantly enhance its ability on both qualitative and quantitative spatial VQA.

Question Answering Visual Question Answering

Chain of Code: Reasoning with a Language Model-Augmented Code Emulator

no code implementations • 7 Dec 2023 • Chengshu Li, Jacky Liang, Andy Zeng, Xinyun Chen, Karol Hausman, Dorsa Sadigh, Sergey Levine, Li Fei-Fei, Fei Xia, Brian Ichter

For example, consider prompting an LM to write code that counts the number of times it detects sarcasm in an essay: the LM may struggle to write an implementation for "detect_sarcasm(string)" that can be executed by the interpreter (handling the edge cases would be insurmountable).

Language Modelling

Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections

1 code implementation • 17 Nov 2023 • Lihan Zha, Yuchen Cui, Li-Heng Lin, Minae Kwon, Montserrat Gonzalez Arenas, Andy Zeng, Fei Xia, Dorsa Sadigh

DROC is able to respond to a sequence of online language corrections that address failures in both high-level task plans and low-level skill primitives.

Language Modelling Large Language Model +1

Imitation Bootstrapped Reinforcement Learning

no code implementations • 3 Nov 2023 • Hengyuan Hu, Suvir Mirchandani, Dorsa Sadigh

Despite the considerable potential of reinforcement learning (RL), robotic control tasks predominantly rely on imitation learning (IL) due to its better sample efficiency.

Continuous Control Imitation Learning +2

Diverse Conventions for Human-AI Collaboration

no code implementations • NeurIPS 2023 • Bidipta Sarkar, Andy Shih, Dorsa Sadigh

Conventions are crucial for strong performance in cooperative multi-agent games, because they allow players to coordinate on a shared strategy without explicit communication.

Multi-agent Reinforcement Learning

Contrastive Preference Learning: Learning from Human Feedback without RL

1 code implementation • 20 Oct 2023 • Joey Hejna, Rafael Rafailov, Harshit Sikchi, Chelsea Finn, Scott Niekum, W. Bradley Knox, Dorsa Sadigh

Thus, learning a reward function from feedback is not only based on a flawed assumption of human preference, but also leads to unwieldy optimization challenges that stem from policy gradients or bootstrapping in the RL phase.

reinforcement-learning Reinforcement Learning (RL)

132

Robots That Can See: Leveraging Human Pose for Trajectory Prediction

1 code implementation • 29 Sep 2023 • Tim Salzmann, Lewis Chiang, Markus Ryll, Dorsa Sadigh, Carolina Parada, Alex Bewley

Anticipating the motion of all humans in dynamic environments such as homes and offices is critical to enable safe and effective robot navigation.

Robot Navigation Trajectory Prediction

Learning Sequential Acquisition Policies for Robot-Assisted Feeding

no code implementations • 11 Sep 2023 • Priya Sundaresan, Jiajun Wu, Dorsa Sadigh

A robot providing mealtime assistance must perform specialized maneuvers with various utensils in order to pick up and feed a range of food items.

Physically Grounded Vision-Language Models for Robotic Manipulation

no code implementations • 5 Sep 2023 • Jensen Gao, Bidipta Sarkar, Fei Xia, Ted Xiao, Jiajun Wu, Brian Ichter, Anirudha Majumdar, Dorsa Sadigh

We incorporate this physically grounded VLM in an interactive framework with a large language model-based robotic planner, and show improved planning performance on tasks that require reasoning about physical object concepts, compared to baselines that do not leverage physically grounded VLMs.

Image Captioning Language Modelling +4

Stabilize to Act: Learning to Coordinate for Bimanual Manipulation

no code implementations • 3 Sep 2023 • Jennifer Grannen, Yilin Wu, Brandon Vu, Dorsa Sadigh

We counteract this challenge by drawing inspiration from humans to propose a novel role assignment framework: a stabilizing arm holds an object in place to simplify the environment while an acting arm executes the task.

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

no code implementations • 27 Jul 2023 • Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Wang, Samuel Marks, Charbel-Raphaël Segerie, Micah Carroll, Andi Peng, Phillip Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J. Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, Lauro Langosco, Peter Hase, Erdem Biyik, Anca Dragan, David Krueger, Dorsa Sadigh, Dylan Hadfield-Menell

Reinforcement learning from human feedback (RLHF) is a technique for training AI systems to align with human goals.

reinforcement-learning

Large Language Models as General Pattern Machines

no code implementations • 10 Jul 2023 • Suvir Mirchandani, Fei Xia, Pete Florence, Brian Ichter, Danny Driess, Montserrat Gonzalez Arenas, Kanishka Rao, Dorsa Sadigh, Andy Zeng

We observe that pre-trained large language models (LLMs) are capable of autoregressively completing complex token sequences -- from arbitrary ones procedurally generated by probabilistic context-free grammars (PCFG), to more rich spatial patterns found in the Abstraction and Reasoning Corpus (ARC), a general AI benchmark, prompted in the style of ASCII art.

In-Context Learning

Polybot: Training One Policy Across Robots While Embracing Variability

no code implementations • 7 Jul 2023 • Jonathan Yang, Dorsa Sadigh, Chelsea Finn

Reusing large datasets is crucial to scale vision-based robotic manipulators to everyday scenarios due to the high cost of collecting robotic datasets.

Contrastive Learning

Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners

no code implementations • 4 Jul 2023 • Allen Z. Ren, Anushri Dixit, Alexandra Bodrova, Sumeet Singh, Stephen Tu, Noah Brown, Peng Xu, Leila Takayama, Fei Xia, Jake Varley, Zhenjia Xu, Dorsa Sadigh, Andy Zeng, Anirudha Majumdar

Large language models (LLMs) exhibit a wide range of promising capabilities -- from step-by-step planning to commonsense reasoning -- that may provide utility for robots, but remain prone to confidently hallucinated predictions.

Conformal Prediction Language Modelling +1

KITE: Keypoint-Conditioned Policies for Semantic Manipulation

no code implementations • 29 Jun 2023 • Priya Sundaresan, Suneel Belkhale, Dorsa Sadigh, Jeannette Bohg

While natural language offers a convenient shared interface for humans and robots, enabling robots to interpret and follow language commands remains a longstanding challenge in manipulation.

Instruction Following Object

Language to Rewards for Robotic Skill Synthesis

no code implementations • 14 Jun 2023 • Wenhao Yu, Nimrod Gileadi, Chuyuan Fu, Sean Kirmani, Kuang-Huei Lee, Montse Gonzalez Arenas, Hao-Tien Lewis Chiang, Tom Erez, Leonard Hasenclever, Jan Humplik, Brian Ichter, Ted Xiao, Peng Xu, Andy Zeng, Tingnan Zhang, Nicolas Heess, Dorsa Sadigh, Jie Tan, Yuval Tassa, Fei Xia

However, since low-level robot actions are hardware-dependent and underrepresented in LLM training corpora, existing efforts in applying LLMs to robotics have largely treated LLMs as semantic planners or relied on human-engineered control primitives to interface with the robot.

In-Context Learning Logical Reasoning

Toward Grounded Commonsense Reasoning

no code implementations • 14 Jun 2023 • Minae Kwon, Hengyuan Hu, Vivek Myers, Siddharth Karamcheti, Anca Dragan, Dorsa Sadigh

We additionally illustrate our approach with a robot on 2 carefully designed surfaces.

Language Modelling

Generating Language Corrections for Teaching Physical Control Tasks

1 code implementation • 12 Jun 2023 • Megha Srivastava, Noah Goodman, Dorsa Sadigh

AI assistance continues to help advance applications in education, from language learning to intelligent tutoring systems, yet current methods for providing students feedback are still quite limited.

Strategic Reasoning with Language Models

no code implementations • 30 May 2023 • Kanishk Gandhi, Dorsa Sadigh, Noah D. Goodman

Existing approaches to solving strategic games rely on extensive training, yielding strategies that do not generalize to new scenarios or games without retraining.

Distance Weighted Supervised Learning for Offline Interaction Data

1 code implementation • 26 Apr 2023 • Joey Hejna, Jensen Gao, Dorsa Sadigh

To bridge the gap between IL and RL, we introduce Distance Weighted Supervised Learning or DWSL, a supervised method for learning goal-conditioned policies from offline data.

Imitation Learning Reinforcement Learning (RL)

Behavior Retrieval: Few-Shot Imitation Learning by Querying Unlabeled Datasets

no code implementations • 18 Apr 2023 • Maximilian Du, Suraj Nair, Dorsa Sadigh, Chelsea Finn

Concretely, we propose a simple approach that uses a small amount of downstream expert data to selectively query relevant behaviors from an offline, unlabeled dataset (including many sub-optimal behaviors).

Few-Shot Imitation Learning Imitation Learning +2

Language Instructed Reinforcement Learning for Human-AI Coordination

no code implementations • 13 Apr 2023 • Hengyuan Hu, Dorsa Sadigh

One of the fundamental quests of AI is to produce agents that coordinate well with humans.

Multi-agent Reinforcement Learning reinforcement-learning +1

Active Reward Learning from Online Preferences

no code implementations • 27 Feb 2023 • Vivek Myers, Erdem Biyik, Dorsa Sadigh

Robot policies need to adapt to human preferences and/or new environments.

Reward Design with Language Models

1 code implementation • 27 Feb 2023 • Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh

During training, the LLM evaluates an RL agent's behavior against the desired behavior described by the prompt and outputs a corresponding reward signal.

Language Modelling Large Language Model +1

188

Language-Driven Representation Learning for Robotics

2 code implementations • 24 Feb 2023 • Siddharth Karamcheti, Suraj Nair, Annie S. Chen, Thomas Kollar, Chelsea Finn, Dorsa Sadigh, Percy Liang

First, we demonstrate that existing representations yield inconsistent results across these tasks: masked autoencoding approaches pick up on low-level spatial features at the cost of high-level semantics, while contrastive learning approaches capture the opposite.

Contrastive Learning Imitation Learning +2

173

Long Horizon Temperature Scaling

1 code implementation • 7 Feb 2023 • Andy Shih, Dorsa Sadigh, Stefano Ermon

LHTS is compatible with all likelihood-based models, and optimizes for the long horizon likelihood of samples.

Multiple-choice

"No, to the Right" -- Online Language Corrections for Robotic Manipulation via Shared Autonomy

1 code implementation • 6 Jan 2023 • Yuchen Cui, Siddharth Karamcheti, Raj Palleti, Nidhya Shivakumar, Percy Liang, Dorsa Sadigh

Instead of discrete turn-taking between a human and robot, LILAC splits agency between the human and robot: language is an input to a learned model that produces a meaningful, low-dimensional control space that the human can use to guide the robot.

Instruction Following

Few-Shot Preference Learning for Human-in-the-Loop RL

no code implementations • 6 Dec 2022 • Joey Hejna, Dorsa Sadigh

Contrary to most works that focus on query selection to minimize the amount of data required for learning reward functions, we take an opposite approach: expanding the pool of available data by viewing human-in-the-loop RL through the more flexible lens of multi-task learning.

Meta-Learning Multi-Task Learning +1

Learning Visuo-Haptic Skewering Strategies for Robot-Assisted Feeding

no code implementations • 26 Nov 2022 • Priya Sundaresan, Suneel Belkhale, Dorsa Sadigh

Acquiring food items with a fork poses an immense challenge to a robot-assisted feeding system, due to the wide range of material properties and visual appearances present across food groups.

Learning Bimanual Scooping Policies for Food Acquisition

no code implementations • 26 Nov 2022 • Jennifer Grannen, Yilin Wu, Suneel Belkhale, Dorsa Sadigh

In order to acquire foods with such diverse properties, we propose stabilizing food items during scooping using a second arm, for example, by pushing peas against the spoon with a flat surface to prevent dispersion.

Assistive Teaching of Motor Control Tasks to Humans

1 code implementation • 25 Nov 2022 • Megha Srivastava, Erdem Biyik, Suvir Mirchandani, Noah Goodman, Dorsa Sadigh

In this paper, we focus on the problem of assistive teaching of motor control tasks such as parking a car or landing an aircraft.

Reinforcement Learning (RL)

Learning Tool Morphology for Contact-Rich Manipulation Tasks with Differentiable Simulation

no code implementations • 4 Nov 2022 • Mengxi Li, Rika Antonova, Dorsa Sadigh, Jeannette Bohg

We demonstrate the effectiveness of our method for designing new tools in several scenarios, such as winding ropes, flipping a box and pushing peas onto a scoop in simulation.

Continual Learning

Eliciting Compatible Demonstrations for Multi-Human Imitation Learning

no code implementations • 14 Oct 2022 • Kanishk Gandhi, Siddharth Karamcheti, Madeline Liao, Dorsa Sadigh

Imitation learning from human-provided demonstrations is a strong approach for learning policies for robot manipulation.

Imitation Learning Robot Manipulation

Masked Imitation Learning: Discovering Environment-Invariant Modalities in Multimodal Demonstrations

no code implementations • 16 Sep 2022 • Yilun Hao, Ruinan Wang, Zhangjie Cao, Zihan Wang, Yuchen Cui, Dorsa Sadigh

Specifically, we design a masked policy network with a binary mask to block certain modalities.

Imitation Learning

Training and Inference on Any-Order Autoregressive Models the Right Way

1 code implementation • 26 May 2022 • Andy Shih, Dorsa Sadigh, Stefano Ermon

Conditional inference on arbitrary subsets of variables is a core problem in probabilistic inference with important applications such as masked language modeling and image inpainting.

Image Inpainting Language Modelling +1

Leveraging Smooth Attention Prior for Multi-Agent Trajectory Prediction

no code implementations • 8 Mar 2022 • Zhangjie Cao, Erdem Biyik, Guy Rosman, Dorsa Sadigh

At a certain time, to forecast a reasonable future trajectory, each agent needs to pay attention to the interactions with only a small group of most relevant agents instead of unnecessarily paying attention to all the other agents.

Trajectory Prediction

Weakly Supervised Correspondence Learning

no code implementations • 2 Mar 2022 • Zihan Wang, Zhangjie Cao, Yilun Hao, Dorsa Sadigh

Correspondence learning is a fundamental problem in robotics, which aims to learn a mapping between state-action pairs of agents with different dynamics or embodiments.

Learning from Imperfect Demonstrations via Adversarial Confidence Transfer

no code implementations • 7 Feb 2022 • Zhangjie Cao, Zihan Wang, Dorsa Sadigh

Existing learning from demonstration algorithms usually assume access to expert demonstrations.

Imitation Learning by Estimating Expertise of Demonstrators

1 code implementation • 2 Feb 2022 • Mark Beliaev, Andy Shih, Stefano Ermon, Dorsa Sadigh, Ramtin Pedarsani

In this work, we show that unsupervised learning over demonstrator expertise can lead to a consistent boost in the performance of imitation learning algorithms.

Continuous Control Imitation Learning

Conditional Imitation Learning for Multi-Agent Games

no code implementations • 5 Jan 2022 • Andy Shih, Stefano Ermon, Dorsa Sadigh

In this work, we study the problem of conditional multi-agent imitation learning, where we have access to joint trajectory demonstrations at training time, and we must interact with and adapt to new partners at test time.

Imitation Learning Tensor Decomposition

PantheonRL: A MARL Library for Dynamic Training Interactions

1 code implementation • 13 Dec 2021 • Bidipta Sarkar, Aditi Talati, Andy Shih, Dorsa Sadigh

We present PantheonRL, a multiagent reinforcement learning software package for dynamic training interactions such as round-robin, adaptive, and ad-hoc training.

reinforcement-learning Reinforcement Learning (RL)

117

HyperSPNs: Compact and Expressive Probabilistic Circuits

1 code implementation • NeurIPS 2021 • Andy Shih, Dorsa Sadigh, Stefano Ermon

Probabilistic circuits (PCs) are a family of generative models which allows for the computation of exact likelihoods and marginals of its probability distributions.

Density Estimation

LILA: Language-Informed Latent Actions

1 code implementation • 5 Nov 2021 • Siddharth Karamcheti, Megha Srivastava, Percy Liang, Dorsa Sadigh

We introduce Language-Informed Latent Actions (LILA), a framework for learning natural language interfaces in the context of human-robot collaboration.

Imitation Learning

From Machine Learning to Robotics: Challenges and Opportunities for Embodied Intelligence

no code implementations • 28 Oct 2021 • Nicholas Roy, Ingmar Posner, Tim Barfoot, Philippe Beaudoin, Yoshua Bengio, Jeannette Bohg, Oliver Brock, Isabelle Depatie, Dieter Fox, Dan Koditschek, Tomas Lozano-Perez, Vikash Mansinghka, Christopher Pal, Blake Richards, Dorsa Sadigh, Stefan Schaal, Gaurav Sukhatme, Denis Therien, Marc Toussaint, Michiel Van de Panne

Machine learning has long since become a keystone technology, accelerating science and applications in a broad range of domains.

BIG-bench Machine Learning

Learning Feasibility to Imitate Demonstrators with Different Dynamics

2 code implementations • 28 Oct 2021 • Zhangjie Cao, Yilun Hao, Mengxi Li, Dorsa Sadigh

The goal of learning from demonstrations is to learn a policy for an agent (imitator) by mimicking the behavior in the demonstrations.

Confidence-Aware Imitation Learning from Demonstrations with Varying Optimality

2 code implementations • NeurIPS 2021 • Songyuan Zhang, Zhangjie Cao, Dorsa Sadigh, Yanan Sui

Our results show that CAIL significantly outperforms other imitation learning methods from demonstrations with varying optimality.

Imitation Learning

Open-domain clarification question generation without question examples

no code implementations • EMNLP 2021 • Julia White, Gabriel Poesia, Robert Hawkins, Dorsa Sadigh, Noah Goodman

An overarching goal of natural language processing is to enable machines to communicate seamlessly with humans.

Question Generation Question-Generation

Influencing Towards Stable Multi-Agent Interactions

no code implementations • 5 Oct 2021 • Woodrow Z. Wang, Andy Shih, Annie Xie, Dorsa Sadigh

Instead of reactively adapting to the other agent's (opponent or partner) behavior, we propose an algorithm to proactively influence the other agent's strategy to stabilize -- which can restrain the non-stationarity caused by the other agent.

Autonomous Driving

Partner-Aware Algorithms in Decentralized Cooperative Bandit Teams

no code implementations • 2 Oct 2021 • Erdem Biyik, Anusha Lalitha, Rajarshi Saha, Andrea Goldsmith, Dorsa Sadigh

Our results show that the proposed partner-aware strategy outperforms other known methods, and our human subject studies suggest humans prefer to collaborate with AI agents implementing our partner-aware strategy.

Decision Making

Learning Reward Functions from Scale Feedback

1 code implementation • 1 Oct 2021 • Nils Wilde, Erdem Biyik, Dorsa Sadigh, Stephen L. Smith

Today's robots are increasingly interacting with people and need to efficiently learn inexperienced users' preferences.

Learning Multimodal Rewards from Rankings

no code implementations • 27 Sep 2021 • Vivek Myers, Erdem Biyik, Nima Anari, Dorsa Sadigh

However, expert feedback is often assumed to be drawn from an underlying unimodal reward function.

On the Opportunities and Risks of Foundation Models

2 code implementations • 16 Aug 2021 • Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.

Transfer Learning

847

APReL: A Library for Active Preference-based Reward Learning Algorithms

1 code implementation • 16 Aug 2021 • Erdem Biyik, Aditi Talati, Dorsa Sadigh

Reward learning is a fundamental problem in human-robot interaction to have robots that operate in alignment with what their human user wants.

Targeted Data Acquisition for Evolving Negotiation Agents

no code implementations • 14 Jun 2021 • Minae Kwon, Siddharth Karamcheti, Mariano-Florentino Cuellar, Dorsa Sadigh

This trend additionally holds when comparing agents using our targeted data acquisition framework to variants of agents trained with a mix of supervised learning and reinforcement learning, or to agents using tailored reward functions that explicitly optimize for utility and Pareto-optimality.

reinforcement-learning Reinforcement Learning (RL)

Emergent Prosociality in Multi-Agent Games Through Gifting

no code implementations • 13 May 2021 • Woodrow Z. Wang, Mark Beliaev, Erdem Biyik, Daniel A. Lazar, Ramtin Pedarsani, Dorsa Sadigh

Coordination is often critical to forming prosocial behaviors -- behaviors that increase the overall sum of rewards received by all agents in a multi-agent game.

Incentivizing Efficient Equilibria in Traffic Networks with Mixed Autonomy

no code implementations • 6 May 2021 • Erdem Biyik, Daniel A. Lazar, Ramtin Pedarsani, Dorsa Sadigh

Traffic congestion has large economic and social costs.

Autonomous Vehicles

Learning Visually Guided Latent Actions for Assistive Teleoperation

1 code implementation • 2 May 2021 • Siddharth Karamcheti, Albert J. Zhai, Dylan P. Losey, Dorsa Sadigh

In this work, we develop assistive robots that condition their latent embeddings on visual inputs.

On the Critical Role of Conventions in Adaptive Human-AI Collaboration

1 code implementation • ICLR 2021 • Andy Shih, Arjun Sawhney, Jovana Kondic, Stefano Ermon, Dorsa Sadigh

Humans can quickly adapt to new partners in collaborative tasks (e.g., playing basketball), because they understand which fundamental skills of the task (e.g., how to dribble, how to shoot) carry over across new partners.

ELLA: Exploration through Learned Language Abstraction

1 code implementation • NeurIPS 2021 • Suvir Mirchandani, Siddharth Karamcheti, Dorsa Sadigh

Building agents capable of understanding language instructions is critical to effective and robust human-AI collaboration.

Learning from Imperfect Demonstrations from Agents with Varying Dynamics

1 code implementation • 10 Mar 2021 • Zhangjie Cao, Dorsa Sadigh

The proposed score enables learning from more informative demonstrations, and disregarding the less relevant demonstrations.

Imitation Learning

Transfer Reinforcement Learning across Homotopy Classes

no code implementations • 10 Feb 2021 • Zhangjie Cao, Minae Kwon, Dorsa Sadigh

The ability of robots to transfer their learned knowledge to new tasks -- where data is scarce -- is a fundamental challenge for successful robot learning.

Transfer Reinforcement Learning Robotics

Incentivizing Routing Choices for Safe and Efficient Transportation in the Face of the COVID-19 Pandemic

no code implementations • 28 Dec 2020 • Mark Beliaev, Erdem Biyik, Daniel A. Lazar, Woodrow Z. Wang, Dorsa Sadigh, Ramtin Pedarsani

In turn, significant increases in traffic congestion are expected, since people are likely to prefer using their own vehicles or taxis as opposed to riskier and more crowded options such as the railway.

Learning Latent Representations to Influence Multi-Agent Interaction

no code implementations • 12 Nov 2020 • Annie Xie, Dylan P. Losey, Ryan Tolsma, Chelsea Finn, Dorsa Sadigh

We propose a reinforcement learning-based framework for learning latent representations of an agent's policy, where the ego agent identifies the relationship between its behavior and the other agent's future strategy.

ROIAL: Region of Interest Active Learning for Characterizing Exoskeleton Gait Preference Landscapes

1 code implementation • 9 Nov 2020 • Kejun Li, Maegan Tucker, Erdem Biyik, Ellen Novoseller, Joel W. Burdick, Yanan Sui, Dorsa Sadigh, Yisong Yue, Aaron D. Ames

ROIAL learns Bayesian posteriors that predict each exoskeleton user's utility landscape across four exoskeleton gait parameters.

Active Learning

Learning Adaptive Language Interfaces through Decomposition

no code implementations • EMNLP (intexsempar) 2020 • Siddharth Karamcheti, Dorsa Sadigh, Percy Liang

Our goal is to create an interactive natural language interface that efficiently and reliably learns from users to complete tasks in simulated robotics settings.

Semantic Parsing

Multi-Agent Safe Planning with Gaussian Processes

no code implementations • 10 Aug 2020 • Zheqing Zhu, Erdem Biyik, Dorsa Sadigh

Multi-agent safe systems have become an increasingly important area of study as we can now easily have multiple AI-powered systems operating together.

Gaussian Processes

Learning User-Preferred Mappings for Intuitive Robot Control

no code implementations • 22 Jul 2020 • Mengxi Li, Dylan P. Losey, Jeannette Bohg, Dorsa Sadigh

Existing approaches to teleoperation typically assume a one-size-fits-all approach, where the designers pre-define a mapping between human inputs and robot actions, and every user must adapt to this mapping over repeated interactions.

Robot Manipulation

Reinforcement Learning based Control of Imitative Policies for Near-Accident Driving

1 code implementation • 1 Jul 2020 • Zhangjie Cao, Erdem Biyik, Woodrow Z. Wang, Allan Raventos, Adrien Gaidon, Guy Rosman, Dorsa Sadigh

To address driving in near-accident scenarios, we propose a hierarchical reinforcement and imitation learning (H-ReIL) approach that consists of low-level policies learned by IL for discrete driving modes, and a high-level policy learned by RL that switches between different driving modes.

Autonomous Driving Imitation Learning +2

Learning Reward Functions from Diverse Sources of Human Feedback: Optimally Integrating Demonstrations and Preferences

no code implementations • 24 Jun 2020 • Erdem Biyik, Dylan P. Losey, Malayandi Palan, Nicholas C. Landolfi, Gleb Shevchuk, Dorsa Sadigh

As designing reward functions can be extremely challenging, a more promising approach is to directly learn reward functions from human teachers.

Paper
Add Code

Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal Constraints

1 code implementation • 27 May 2020 • Shushman Choudhury, Jayesh K. Gupta, Mykel J. Kochenderfer, Dorsa Sadigh, Jeannette Bohg

We consider the problem of dynamically allocating tasks to multiple agents under time window constraints and task completion uncertainty.

Decision Making Decision Making Under Uncertainty +1

Paper
Code

Active Preference-Based Gaussian Process Regression for Reward Learning

1 code implementation • 6 May 2020 • Erdem Biyik, Nicolas Huynh, Mykel J. Kochenderfer, Dorsa Sadigh

Our results in simulations and a user study suggest that our approach can efficiently learn expressive reward functions for robotics tasks.

regression

Paper
Code

BLEU Neighbors: A Reference-less Approach to Automatic Evaluation

no code implementations • EMNLP (Eval4NLP) 2020 • Kawin Ethayarajh, Dorsa Sadigh

To this end, we propose BLEU Neighbors, a nearest neighbors model for estimating language quality by using the BLEU score as a kernel function.

Machine Translation Sentence +2

Paper
Add Code

Exchangeable Input Representations for Reinforcement Learning

no code implementations • 19 Mar 2020 • John Mern, Dorsa Sadigh, Mykel J. Kochenderfer

We show that our proposed representation results in an input space that is a factor of $m!$ smaller for inputs of $m$ objects.

Policy Gradient Methods reinforcement-learning +1

Paper
Add Code

When Humans Aren't Optimal: Robots that Collaborate with Risk-Aware Humans

no code implementations • 13 Jan 2020 • Minae Kwon, Erdem Biyik, Aditi Talati, Karan Bhasin, Dylan P. Losey, Dorsa Sadigh

Overall, we extend existing rational human models so that collaborative robots can anticipate and plan around suboptimal human behavior during HRI.

Paper
Add Code

Continual adaptation for efficient machine communication

1 code implementation • CONLL 2020 • Robert D. Hawkins, Minae Kwon, Dorsa Sadigh, Noah D. Goodman

To communicate with new partners in new contexts, humans rapidly form new linguistic conventions.

Continual Learning Language Modelling

Paper
Code

Learning from My Partner's Actions: Roles in Decentralized Robot Teams

no code implementations • 16 Oct 2019 • Dylan P. Losey, Mengxi Li, Jeannette Bohg, Dorsa Sadigh

When teams of robots collaborate to complete a task, communication is often necessary.

Paper
Add Code

Asking Easy Questions: A User-Friendly Approach to Active Reward Learning

2 code implementations • 10 Oct 2019 • Erdem Biyik, Malayandi Palan, Nicholas C. Landolfi, Dylan P. Losey, Dorsa Sadigh

Robots can learn the right reward function by querying a human expert.

Paper
Code

Controlling Assistive Robots with Learned Latent Actions

no code implementations • 20 Sep 2019 • Dylan P. Losey, Krishnan Srinivasan, Ajay Mandlekar, Animesh Garg, Dorsa Sadigh

Our insight is that we can make assistive robots easier for humans to control by leveraging latent actions.

Robotics

Paper
Add Code

Learning Reward Functions by Integrating Human Demonstrations and Preferences

1 code implementation • 21 Jun 2019 • Malayandi Palan, Nicholas C. Landolfi, Gleb Shevchuk, Dorsa Sadigh

In a user study, we compare our method to a standard IRL method; we find that users rated the robot trained with DemPref as being more successful at learning their desired behavior, and preferred to use the DemPref system (over IRL) to train the robot.

Paper
Code

Batch Active Learning Using Determinantal Point Processes

1 code implementation • 19 Jun 2019 • Erdem Biyik, Kenneth Wang, Nima Anari, Dorsa Sadigh

While active learning methods attempt to tackle this issue by labeling only the data samples that give high information, they generally suffer from large computational costs and are impractical in settings where data can be collected in parallel.

Active Learning Point Processes

Paper
Code

Deep Local Trajectory Replanning and Control for Robot Navigation

no code implementations • 13 May 2019 • Ashwini Pokle, Roberto Martín-Martín, Patrick Goebel, Vincent Chow, Hans M. Ewald, Junwei Yang, Zhenkai Wang, Amir Sadeghian, Dorsa Sadigh, Silvio Savarese, Marynel Vázquez

We present a navigation system that combines ideas from hierarchical planning and machine learning.

Robot Navigation

Paper
Add Code

Object Exchangeability in Reinforcement Learning: Extended Abstract

no code implementations • 7 May 2019 • John Mern, Dorsa Sadigh, Mykel Kochenderfer

Although deep reinforcement learning has advanced significantly over the past several years, sample efficiency remains a major challenge.

Object Policy Gradient Methods +2

Paper
Add Code

Efficient and Safe Exploration in Deterministic Markov Decision Processes with Unknown Transition Models

no code implementations • 1 Apr 2019 • Erdem Biyik, Jonathan Margoliash, Shahrouz Ryan Alimo, Dorsa Sadigh

We propose a safe exploration algorithm for deterministic Markov Decision Processes with unknown transition models.

Safe Exploration

Paper
Add Code

Unsupervised Visuomotor Control through Distributional Planning Networks

1 code implementation • 14 Feb 2019 • Tianhe Yu, Gleb Shevchuk, Dorsa Sadigh, Chelsea Finn

While reinforcement learning (RL) has the potential to enable robots to autonomously acquire a wide range of skills, in practice, RL usually requires manual, per-task engineering of reward functions, especially in real world settings where aspects of the environment needed to compute progress are not directly accessible.

reinforcement-learning Reinforcement Learning (RL)

Paper
Code

Hierarchical Game-Theoretic Planning for Autonomous Vehicles

no code implementations • 13 Oct 2018 • Jaime F. Fisac, Eli Bronstein, Elis Stefansson, Dorsa Sadigh, S. Shankar Sastry, Anca D. Dragan

This mutual dependence, best captured by dynamic game theory, creates a strong coupling between the vehicle's planning and its predictions of other drivers' behavior, and constitutes an open problem with direct implications for the safety and viability of autonomous driving technology.

Autonomous Driving Decision Making +1

Paper
Add Code

Batch Active Preference-Based Learning of Reward Functions

1 code implementation • 10 Oct 2018 • Erdem Biyik, Dorsa Sadigh

Data generation and labeling are usually an expensive part of learning for robotics.

Active Learning

Paper
Code

Multi-Agent Generative Adversarial Imitation Learning

1 code implementation • NeurIPS 2018 • Jiaming Song, Hongyu Ren, Dorsa Sadigh, Stefano Ermon

Imitation learning algorithms can be used to learn a policy from expert demonstrations without access to a reward signal.

Imitation Learning reinforcement-learning +1

Paper
Code

Towards Verified Artificial Intelligence

no code implementations • 27 Jun 2016 • Sanjit A. Seshia, Dorsa Sadigh, S. Shankar Sastry

Verified artificial intelligence (AI) is the goal of designing AI-based systems that have strong, ideally provable, assurances of correctness with respect to mathematically-specified requirements.

Paper
Add Code

Safe Control under Uncertainty

no code implementations • 25 Oct 2015 • Dorsa Sadigh, Ashish Kapoor

In this paper, we propose a new logic, Probabilistic Signal Temporal Logic (PrSTL), as an expressive language to define the stochastic properties, and enforce probabilistic guarantees on them.

Autonomous Vehicles

Paper
Add Code

Robust Subspace System Identification via Weighted Nuclear Norm Optimization

no code implementations • 7 Dec 2013 • Dorsa Sadigh, Henrik Ohlsson, S. Shankar Sastry, Sanjit A. Seshia

As in robust PCA, it can be problematic to find a suitable regularization parameter.

Paper
Add Code
