Search Results for author: Dorsa Sadigh

Found 109 papers, 41 papers with code

Vision Language Models are In-Context Value Learners

no code implementations 7 Nov 2024 Yecheng Jason Ma, Joey Hejna, Ayzaan Wahid, Chuyuan Fu, Dhruv Shah, Jacky Liang, Zhuo Xu, Sean Kirmani, Peng Xu, Danny Driess, Ted Xiao, Jonathan Tompson, Osbert Bastani, Dinesh Jayaraman, Wenhao Yu, Tingnan Zhang, Dorsa Sadigh, Fei Xia

Instead, GVL poses value estimation as a temporal ordering problem over shuffled video frames; this seemingly more challenging task encourages VLMs to more fully exploit their underlying semantic and temporal grounding capabilities to differentiate frames based on their perceived task progress, consequently producing significantly better value predictions.

In-Context Learning World Knowledge

RT-Affordance: Affordances are Versatile Intermediate Representations for Robot Manipulation

no code implementations 5 Nov 2024 Soroush Nasiriany, Sean Kirmani, Tianli Ding, Laura Smith, Yuke Zhu, Danny Driess, Dorsa Sadigh, Ted Xiao

Our method, RT-Affordance, is a hierarchical model that first proposes an affordance plan given the task language, and then conditions the policy on this affordance plan to perform manipulation.

Robot Manipulation

Vocal Sandbox: Continual Learning and Adaptation for Situated Human-Robot Collaboration

no code implementations 4 Nov 2024 Jennifer Grannen, Siddharth Karamcheti, Suvir Mirchandani, Percy Liang, Dorsa Sadigh

Similarly, users teach high-level planning behaviors through spoken dialogue, using pretrained language models to synthesize behaviors such as "packing an object away" as compositions of low-level skills -- concepts that can be reused and built upon.

Continual Learning

So You Think You Can Scale Up Autonomous Robot Data Collection?

no code implementations 4 Nov 2024 Suvir Mirchandani, Suneel Belkhale, Joey Hejna, Evelyn Choi, Md Sazzad Islam, Dorsa Sadigh

Our work suggests a negative result: that scaling up autonomous data collection for learning robot policies for real-world tasks is more challenging and impractical than what is suggested in prior work.

Imitation Learning Reinforcement Learning (RL)

MotIF: Motion Instruction Fine-tuning

no code implementations 16 Sep 2024 Minyoung Hwang, Joey Hejna, Dorsa Sadigh, Yonatan Bisk

MotIF assesses the success of robot motion given the image observation of the trajectory, task instruction, and motion description.

FlowRetrieval: Flow-Guided Data Retrieval for Few-Shot Imitation Learning

no code implementations 29 Aug 2024 Li-Heng Lin, Yuchen Cui, Amber Xie, Tianyu Hua, Dorsa Sadigh

We propose FlowRetrieval, an approach that leverages optical flow representations for both extracting similar motions to target tasks from prior data, and for guiding learning of a policy that can maximally benefit from such data.

Few-Shot Imitation Learning Imitation Learning +4

Re-Mix: Optimizing Data Mixtures for Large Scale Imitation Learning

1 code implementation 26 Aug 2024 Joey Hejna, Chethan Bhateja, Yichen Jian, Karl Pertsch, Dorsa Sadigh

Increasingly large imitation learning datasets are being collected with the goal of training foundation models for robotics.

Imitation Learning Robot Manipulation

FLAIR: Feeding via Long-horizon AcquIsition of Realistic dishes

no code implementations 10 Jul 2024 Rajat Kumar Jenamani, Priya Sundaresan, Maram Sakr, Tapomayukh Bhattacharjee, Dorsa Sadigh

We address this with FLAIR, a system for long-horizon feeding which leverages the commonsense and few-shot reasoning capabilities of foundation models, along with a library of parameterized skills, to plan and execute user-preferred and efficient bite sequences.

OpenVLA: An Open-Source Vision-Language-Action Model

1 code implementation 13 Jun 2024 Moo Jin Kim, Karl Pertsch, Siddharth Karamcheti, Ted Xiao, Ashwin Balakrishna, Suraj Nair, Rafael Rafailov, Ethan Foster, Grace Lam, Pannag Sanketi, Quan Vuong, Thomas Kollar, Benjamin Burchfiel, Russ Tedrake, Dorsa Sadigh, Sergey Levine, Percy Liang, Chelsea Finn

Large policies pretrained on a combination of Internet-scale vision-language data and diverse robot demonstrations have the potential to change how we teach robots new skills: rather than training new behaviors from scratch, we can fine-tune such vision-language-action (VLA) models to obtain robust, generalizable policies for visuomotor control.

Imitation Learning Language Modelling +1

Octo: An Open-Source Generalist Robot Policy

no code implementations 20 May 2024 Octo Model Team, Dibya Ghosh, Homer Walke, Karl Pertsch, Kevin Black, Oier Mees, Sudeep Dasari, Joey Hejna, Tobias Kreiman, Charles Xu, Jianlan Luo, You Liang Tan, Lawrence Yunliang Chen, Pannag Sanketi, Quan Vuong, Ted Xiao, Dorsa Sadigh, Chelsea Finn, Sergey Levine

In experiments across 9 robotic platforms, we demonstrate that Octo serves as a versatile policy initialization that can be effectively finetuned to new observation and action spaces.

Robot Manipulation

Policy Learning with a Language Bottleneck

1 code implementation 7 May 2024 Megha Srivastava, Cedric Colas, Dorsa Sadigh, Jacob Andreas

Modern AI systems such as self-driving cars and game-playing agents achieve superhuman performance, but often lack human-like features such as generalization, interpretability and human inter-operability.

Decision Making Image Reconstruction +1

Explore until Confident: Efficient Exploration for Embodied Question Answering

no code implementations 23 Mar 2024 Allen Z. Ren, Jaden Clark, Anushri Dixit, Masha Itkina, Anirudha Majumdar, Dorsa Sadigh

We consider the problem of Embodied Question Answering (EQA), which refers to settings where an embodied agent such as a robot needs to actively explore an environment to gather information until it is confident about the answer to a question.

Conformal Prediction Efficient Exploration +3

Efficient Data Collection for Robotic Manipulation via Compositional Generalization

no code implementations 8 Mar 2024 Jensen Gao, Annie Xie, Ted Xiao, Chelsea Finn, Dorsa Sadigh

If robot policies can compose environmental factors from their data to succeed when encountering unseen factor combinations, we can exploit this to avoid collecting data for situations that composition would address.

Imitation Learning

RT-H: Action Hierarchies Using Language

no code implementations 4 Mar 2024 Suneel Belkhale, Tianli Ding, Ted Xiao, Pierre Sermanet, Quon Vuong, Jonathan Tompson, Yevgen Chebotar, Debidatta Dwibedi, Dorsa Sadigh

Predicting these language motions as an intermediate step between tasks and actions forces the policy to learn the shared structure of low-level motions across seemingly disparate tasks.

Imitation Learning

Batch Active Learning of Reward Functions from Human Preferences

no code implementations 24 Feb 2024 Erdem Biyik, Nima Anari, Dorsa Sadigh

Our results suggest that our batch active learning algorithm requires only a few queries that are computed in a short amount of time.

Active Learning Point Processes

Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models

3 code implementations 12 Feb 2024 Siddharth Karamcheti, Suraj Nair, Ashwin Balakrishna, Percy Liang, Thomas Kollar, Dorsa Sadigh

Visually-conditioned language models (VLMs) have seen growing adoption in applications such as visual dialogue, scene understanding, and robotic task planning; adoption that has fueled a wealth of new models such as LLaVa, InstructBLIP, and PaLI-3.

Hallucination Object Localization +3

Chain of Code: Reasoning with a Language Model-Augmented Code Emulator

1 code implementation 7 Dec 2023 Chengshu Li, Jacky Liang, Andy Zeng, Xinyun Chen, Karol Hausman, Dorsa Sadigh, Sergey Levine, Li Fei-Fei, Fei Xia, Brian Ichter

For example, consider prompting an LM to write code that counts the number of times it detects sarcasm in an essay: the LM may struggle to write an implementation for "detect_sarcasm(string)" that can be executed by the interpreter (handling the edge cases would be insurmountable).

Language Modelling

Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections

1 code implementation 17 Nov 2023 Lihan Zha, Yuchen Cui, Li-Heng Lin, Minae Kwon, Montserrat Gonzalez Arenas, Andy Zeng, Fei Xia, Dorsa Sadigh

DROC is able to respond to a sequence of online language corrections that address failures in both high-level task plans and low-level skill primitives.

Language Modelling Large Language Model +1

Imitation Bootstrapped Reinforcement Learning

no code implementations 3 Nov 2023 Hengyuan Hu, Suvir Mirchandani, Dorsa Sadigh

Despite the considerable potential of reinforcement learning (RL), robotic control tasks predominantly rely on imitation learning (IL) due to its better sample efficiency.

Continuous Control Imitation Learning +3

Diverse Conventions for Human-AI Collaboration

no code implementations NeurIPS 2023 Bidipta Sarkar, Andy Shih, Dorsa Sadigh

Conventions are crucial for strong performance in cooperative multi-agent games, because they allow players to coordinate on a shared strategy without explicit communication.

Multi-agent Reinforcement Learning

Contrastive Preference Learning: Learning from Human Feedback without RL

1 code implementation 20 Oct 2023 Joey Hejna, Rafael Rafailov, Harshit Sikchi, Chelsea Finn, Scott Niekum, W. Bradley Knox, Dorsa Sadigh

Thus, learning a reward function from feedback is not only based on a flawed assumption of human preference, but also leads to unwieldy optimization challenges that stem from policy gradients or bootstrapping in the RL phase.

reinforcement-learning Reinforcement Learning (RL)

Robots That Can See: Leveraging Human Pose for Trajectory Prediction

1 code implementation 29 Sep 2023 Tim Salzmann, Lewis Chiang, Markus Ryll, Dorsa Sadigh, Carolina Parada, Alex Bewley

Anticipating the motion of all humans in dynamic environments such as homes and offices is critical to enable safe and effective robot navigation.

Robot Navigation Trajectory Prediction

Learning Sequential Acquisition Policies for Robot-Assisted Feeding

no code implementations 11 Sep 2023 Priya Sundaresan, Jiajun Wu, Dorsa Sadigh

A robot providing mealtime assistance must perform specialized maneuvers with various utensils in order to pick up and feed a range of food items.

Physically Grounded Vision-Language Models for Robotic Manipulation

no code implementations 5 Sep 2023 Jensen Gao, Bidipta Sarkar, Fei Xia, Ted Xiao, Jiajun Wu, Brian Ichter, Anirudha Majumdar, Dorsa Sadigh

We incorporate this physically grounded VLM in an interactive framework with a large language model-based robotic planner, and show improved planning performance on tasks that require reasoning about physical object concepts, compared to baselines that do not leverage physically grounded VLMs.

Image Captioning Language Modelling +4

Stabilize to Act: Learning to Coordinate for Bimanual Manipulation

no code implementations 3 Sep 2023 Jennifer Grannen, Yilin Wu, Brandon Vu, Dorsa Sadigh

We counteract this challenge by drawing inspiration from humans to propose a novel role assignment framework: a stabilizing arm holds an object in place to simplify the environment while an acting arm executes the task.

Large Language Models as General Pattern Machines

no code implementations 10 Jul 2023 Suvir Mirchandani, Fei Xia, Pete Florence, Brian Ichter, Danny Driess, Montserrat Gonzalez Arenas, Kanishka Rao, Dorsa Sadigh, Andy Zeng

We observe that pre-trained large language models (LLMs) are capable of autoregressively completing complex token sequences -- from arbitrary ones procedurally generated by probabilistic context-free grammars (PCFG), to more rich spatial patterns found in the Abstraction and Reasoning Corpus (ARC), a general AI benchmark, prompted in the style of ASCII art.

ARC In-Context Learning

Polybot: Training One Policy Across Robots While Embracing Variability

no code implementations 7 Jul 2023 Jonathan Yang, Dorsa Sadigh, Chelsea Finn

Reusing large datasets is crucial to scale vision-based robotic manipulators to everyday scenarios due to the high cost of collecting robotic datasets.

Contrastive Learning

Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners

no code implementations 4 Jul 2023 Allen Z. Ren, Anushri Dixit, Alexandra Bodrova, Sumeet Singh, Stephen Tu, Noah Brown, Peng Xu, Leila Takayama, Fei Xia, Jake Varley, Zhenjia Xu, Dorsa Sadigh, Andy Zeng, Anirudha Majumdar

Large language models (LLMs) exhibit a wide range of promising capabilities -- from step-by-step planning to commonsense reasoning -- that may provide utility for robots, but remain prone to confidently hallucinated predictions.

Conformal Prediction Language Modelling +1

KITE: Keypoint-Conditioned Policies for Semantic Manipulation

no code implementations 29 Jun 2023 Priya Sundaresan, Suneel Belkhale, Dorsa Sadigh, Jeannette Bohg

While natural language offers a convenient shared interface for humans and robots, enabling robots to interpret and follow language commands remains a longstanding challenge in manipulation.

Instruction Following Object

Toward Grounded Commonsense Reasoning

no code implementations 14 Jun 2023 Minae Kwon, Hengyuan Hu, Vivek Myers, Siddharth Karamcheti, Anca Dragan, Dorsa Sadigh

We additionally illustrate our approach with a robot on 2 carefully designed surfaces.

Language Modelling

Language to Rewards for Robotic Skill Synthesis

no code implementations 14 Jun 2023 Wenhao Yu, Nimrod Gileadi, Chuyuan Fu, Sean Kirmani, Kuang-Huei Lee, Montse Gonzalez Arenas, Hao-Tien Lewis Chiang, Tom Erez, Leonard Hasenclever, Jan Humplik, Brian Ichter, Ted Xiao, Peng Xu, Andy Zeng, Tingnan Zhang, Nicolas Heess, Dorsa Sadigh, Jie Tan, Yuval Tassa, Fei Xia

However, since low-level robot actions are hardware-dependent and underrepresented in LLM training corpora, existing efforts in applying LLMs to robotics have largely treated LLMs as semantic planners or relied on human-engineered control primitives to interface with the robot.

In-Context Learning Logical Reasoning

Generating Language Corrections for Teaching Physical Control Tasks

1 code implementation 12 Jun 2023 Megha Srivastava, Noah Goodman, Dorsa Sadigh

AI assistance continues to help advance applications in education, from language learning to intelligent tutoring systems, yet current methods for providing students feedback are still quite limited.

Diversity valid

Strategic Reasoning with Language Models

no code implementations 30 May 2023 Kanishk Gandhi, Dorsa Sadigh, Noah D. Goodman

Existing approaches to solving strategic games rely on extensive training, yielding strategies that do not generalize to new scenarios or games without retraining.

Parallel Sampling of Diffusion Models

1 code implementation NeurIPS 2023 Andy Shih, Suneel Belkhale, Stefano Ermon, Dorsa Sadigh, Nima Anari

Instead of reducing the number of denoising steps (trading quality for speed), in this paper we explore an orthogonal approach: can we run the denoising steps in parallel (trading compute for speed)?

Denoising Image Generation

Distance Weighted Supervised Learning for Offline Interaction Data

1 code implementation 26 Apr 2023 Joey Hejna, Jensen Gao, Dorsa Sadigh

To bridge the gap between IL and RL, we introduce Distance Weighted Supervised Learning or DWSL, a supervised method for learning goal-conditioned policies from offline data.

Imitation Learning Reinforcement Learning (RL) +1

Behavior Retrieval: Few-Shot Imitation Learning by Querying Unlabeled Datasets

no code implementations 18 Apr 2023 Maximilian Du, Suraj Nair, Dorsa Sadigh, Chelsea Finn

Concretely, we propose a simple approach that uses a small amount of downstream expert data to selectively query relevant behaviors from an offline, unlabeled dataset (including many sub-optimal behaviors).

Few-Shot Imitation Learning Imitation Learning +2

Active Reward Learning from Online Preferences

no code implementations 27 Feb 2023 Vivek Myers, Erdem Biyik, Dorsa Sadigh

Robot policies need to adapt to human preferences and/or new environments.

Reward Design with Language Models

1 code implementation 27 Feb 2023 Minae Kwon, Sang Michael Xie, Kalesha Bullard, Dorsa Sadigh

During training, the LLM evaluates an RL agent's behavior against the desired behavior described by the prompt and outputs a corresponding reward signal.

Language Modelling Large Language Model +1

Language-Driven Representation Learning for Robotics

2 code implementations 24 Feb 2023 Siddharth Karamcheti, Suraj Nair, Annie S. Chen, Thomas Kollar, Chelsea Finn, Dorsa Sadigh, Percy Liang

First, we demonstrate that existing representations yield inconsistent results across these tasks: masked autoencoding approaches pick up on low-level spatial features at the cost of high-level semantics, while contrastive learning approaches capture the opposite.

Contrastive Learning Imitation Learning +2

Long Horizon Temperature Scaling

1 code implementation 7 Feb 2023 Andy Shih, Dorsa Sadigh, Stefano Ermon

LHTS is compatible with all likelihood-based models, and optimizes for the long horizon likelihood of samples.

Multiple-choice

"No, to the Right" -- Online Language Corrections for Robotic Manipulation via Shared Autonomy

1 code implementation 6 Jan 2023 Yuchen Cui, Siddharth Karamcheti, Raj Palleti, Nidhya Shivakumar, Percy Liang, Dorsa Sadigh

Instead of discrete turn-taking between a human and robot, LILAC splits agency between the human and robot: language is an input to a learned model that produces a meaningful, low-dimensional control space that the human can use to guide the robot.

Instruction Following

Few-Shot Preference Learning for Human-in-the-Loop RL

no code implementations 6 Dec 2022 Joey Hejna, Dorsa Sadigh

Contrary to most works that focus on query selection to minimize the amount of data required for learning reward functions, we take an opposite approach: expanding the pool of available data by viewing human-in-the-loop RL through the more flexible lens of multi-task learning.

Meta-Learning Multi-Task Learning +1

Learning Visuo-Haptic Skewering Strategies for Robot-Assisted Feeding

no code implementations 26 Nov 2022 Priya Sundaresan, Suneel Belkhale, Dorsa Sadigh

Acquiring food items with a fork poses an immense challenge to a robot-assisted feeding system, due to the wide range of material properties and visual appearances present across food groups.

Diversity

Learning Bimanual Scooping Policies for Food Acquisition

no code implementations 26 Nov 2022 Jennifer Grannen, Yilin Wu, Suneel Belkhale, Dorsa Sadigh

In order to acquire foods with such diverse properties, we propose stabilizing food items during scooping using a second arm, for example, by pushing peas against the spoon with a flat surface to prevent dispersion.

Assistive Teaching of Motor Control Tasks to Humans

1 code implementation 25 Nov 2022 Megha Srivastava, Erdem Biyik, Suvir Mirchandani, Noah Goodman, Dorsa Sadigh

In this paper, we focus on the problem of assistive teaching of motor control tasks such as parking a car or landing an aircraft.

Reinforcement Learning (RL)

Learning Tool Morphology for Contact-Rich Manipulation Tasks with Differentiable Simulation

no code implementations 4 Nov 2022 Mengxi Li, Rika Antonova, Dorsa Sadigh, Jeannette Bohg

We demonstrate the effectiveness of our method for designing new tools in several scenarios, such as winding ropes, flipping a box and pushing peas onto a scoop in simulation.

Continual Learning

Eliciting Compatible Demonstrations for Multi-Human Imitation Learning

no code implementations 14 Oct 2022 Kanishk Gandhi, Siddharth Karamcheti, Madeline Liao, Dorsa Sadigh

Imitation learning from human-provided demonstrations is a strong approach for learning policies for robot manipulation.

Imitation Learning Robot Manipulation

Training and Inference on Any-Order Autoregressive Models the Right Way

1 code implementation 26 May 2022 Andy Shih, Dorsa Sadigh, Stefano Ermon

Conditional inference on arbitrary subsets of variables is a core problem in probabilistic inference with important applications such as masked language modeling and image inpainting.

Image Inpainting Language Modelling +1

Leveraging Smooth Attention Prior for Multi-Agent Trajectory Prediction

no code implementations 8 Mar 2022 Zhangjie Cao, Erdem Biyik, Guy Rosman, Dorsa Sadigh

At a certain time, to forecast a reasonable future trajectory, each agent needs to pay attention to the interactions with only a small group of most relevant agents instead of unnecessarily paying attention to all the other agents.

Trajectory Prediction

Weakly Supervised Correspondence Learning

no code implementations 2 Mar 2022 Zihan Wang, Zhangjie Cao, Yilun Hao, Dorsa Sadigh

Correspondence learning is a fundamental problem in robotics, which aims to learn a mapping between state, action pairs of agents of different dynamics or embodiments.

Learning from Imperfect Demonstrations via Adversarial Confidence Transfer

no code implementations 7 Feb 2022 Zhangjie Cao, Zihan Wang, Dorsa Sadigh

Existing learning from demonstration algorithms usually assume access to expert demonstrations.

Imitation Learning by Estimating Expertise of Demonstrators

1 code implementation 2 Feb 2022 Mark Beliaev, Andy Shih, Stefano Ermon, Dorsa Sadigh, Ramtin Pedarsani

In this work, we show that unsupervised learning over demonstrator expertise can lead to a consistent boost in the performance of imitation learning algorithms.

continuous-control Continuous Control +1

Conditional Imitation Learning for Multi-Agent Games

no code implementations 5 Jan 2022 Andy Shih, Stefano Ermon, Dorsa Sadigh

In this work, we study the problem of conditional multi-agent imitation learning, where we have access to joint trajectory demonstrations at training time, and we must interact with and adapt to new partners at test time.

Imitation Learning Tensor Decomposition

PantheonRL: A MARL Library for Dynamic Training Interactions

1 code implementation 13 Dec 2021 Bidipta Sarkar, Aditi Talati, Andy Shih, Dorsa Sadigh

We present PantheonRL, a multiagent reinforcement learning software package for dynamic training interactions such as round-robin, adaptive, and ad-hoc training.

reinforcement-learning Reinforcement Learning (RL)

HyperSPNs: Compact and Expressive Probabilistic Circuits

1 code implementation NeurIPS 2021 Andy Shih, Dorsa Sadigh, Stefano Ermon

Probabilistic circuits (PCs) are a family of generative models which allows for the computation of exact likelihoods and marginals of its probability distributions.

Density Estimation

LILA: Language-Informed Latent Actions

1 code implementation 5 Nov 2021 Siddharth Karamcheti, Megha Srivastava, Percy Liang, Dorsa Sadigh

We introduce Language-Informed Latent Actions (LILA), a framework for learning natural language interfaces in the context of human-robot collaboration.

Imitation Learning

Learning Feasibility to Imitate Demonstrators with Different Dynamics

2 code implementations 28 Oct 2021 Zhangjie Cao, Yilun Hao, Mengxi Li, Dorsa Sadigh

The goal of learning from demonstrations is to learn a policy for an agent (imitator) by mimicking the behavior in the demonstrations.

Confidence-Aware Imitation Learning from Demonstrations with Varying Optimality

2 code implementations NeurIPS 2021 Songyuan Zhang, Zhangjie Cao, Dorsa Sadigh, Yanan Sui

Our results show that CAIL significantly outperforms other imitation learning methods from demonstrations with varying optimality.

Imitation Learning

Influencing Towards Stable Multi-Agent Interactions

no code implementations 5 Oct 2021 Woodrow Z. Wang, Andy Shih, Annie Xie, Dorsa Sadigh

Instead of reactively adapting to the other agent's (opponent or partner) behavior, we propose an algorithm to proactively influence the other agent's strategy to stabilize -- which can restrain the non-stationarity caused by the other agent.

Autonomous Driving

Partner-Aware Algorithms in Decentralized Cooperative Bandit Teams

no code implementations 2 Oct 2021 Erdem Biyik, Anusha Lalitha, Rajarshi Saha, Andrea Goldsmith, Dorsa Sadigh

Our results show that the proposed partner-aware strategy outperforms other known methods, and our human subject studies suggest humans prefer to collaborate with AI agents implementing our partner-aware strategy.

Decision Making Sequential Decision Making

Learning Reward Functions from Scale Feedback

1 code implementation 1 Oct 2021 Nils Wilde, Erdem Biyik, Dorsa Sadigh, Stephen L. Smith

Today's robots are increasingly interacting with people and need to efficiently learn inexperienced users' preferences.

Learning Multimodal Rewards from Rankings

no code implementations 27 Sep 2021 Vivek Myers, Erdem Biyik, Nima Anari, Dorsa Sadigh

However, expert feedback is often assumed to be drawn from an underlying unimodal reward function.

APReL: A Library for Active Preference-based Reward Learning Algorithms

1 code implementation 16 Aug 2021 Erdem Biyik, Aditi Talati, Dorsa Sadigh

Reward learning is a fundamental problem in human-robot interaction to have robots that operate in alignment with what their human user wants.

On the Opportunities and Risks of Foundation Models

2 code implementations 16 Aug 2021 Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.

Transfer Learning

Targeted Data Acquisition for Evolving Negotiation Agents

no code implementations 14 Jun 2021 Minae Kwon, Siddharth Karamcheti, Mariano-Florentino Cuellar, Dorsa Sadigh

This trend additionally holds when comparing agents using our targeted data acquisition framework to variants of agents trained with a mix of supervised learning and reinforcement learning, or to agents using tailored reward functions that explicitly optimize for utility and Pareto-optimality.

reinforcement-learning Reinforcement Learning +1

Emergent Prosociality in Multi-Agent Games Through Gifting

no code implementations 13 May 2021 Woodrow Z. Wang, Mark Beliaev, Erdem Biyik, Daniel A. Lazar, Ramtin Pedarsani, Dorsa Sadigh

Coordination is often critical to forming prosocial behaviors -- behaviors that increase the overall sum of rewards received by all agents in a multi-agent game.

Learning Visually Guided Latent Actions for Assistive Teleoperation

1 code implementation 2 May 2021 Siddharth Karamcheti, Albert J. Zhai, Dylan P. Losey, Dorsa Sadigh

In this work, we develop assistive robots that condition their latent embeddings on visual inputs.

On the Critical Role of Conventions in Adaptive Human-AI Collaboration

1 code implementation ICLR 2021 Andy Shih, Arjun Sawhney, Jovana Kondic, Stefano Ermon, Dorsa Sadigh

Humans can quickly adapt to new partners in collaborative tasks (e.g., playing basketball), because they understand which fundamental skills of the task (e.g., how to dribble, how to shoot) carry over across new partners.

ELLA: Exploration through Learned Language Abstraction

1 code implementation NeurIPS 2021 Suvir Mirchandani, Siddharth Karamcheti, Dorsa Sadigh

Building agents capable of understanding language instructions is critical to effective and robust human-AI collaboration.

Learning from Imperfect Demonstrations from Agents with Varying Dynamics

1 code implementation 10 Mar 2021 Zhangjie Cao, Dorsa Sadigh

The proposed score enables learning from more informative demonstrations, and disregarding the less relevant demonstrations.

Imitation Learning

Transfer Reinforcement Learning across Homotopy Classes

no code implementations 10 Feb 2021 Zhangjie Cao, Minae Kwon, Dorsa Sadigh

The ability for robots to transfer their learned knowledge to new tasks -- where data is scarce -- is a fundamental challenge for successful robot learning.

Transfer Reinforcement Learning Robotics

Incentivizing Routing Choices for Safe and Efficient Transportation in the Face of the COVID-19 Pandemic

no code implementations 28 Dec 2020 Mark Beliaev, Erdem Biyik, Daniel A. Lazar, Woodrow Z. Wang, Dorsa Sadigh, Ramtin Pedarsani

In turn, significant increases in traffic congestion are expected, since people are likely to prefer using their own vehicles or taxis as opposed to riskier and more crowded options such as the railway.

Learning Latent Representations to Influence Multi-Agent Interaction

no code implementations 12 Nov 2020 Annie Xie, Dylan P. Losey, Ryan Tolsma, Chelsea Finn, Dorsa Sadigh

We propose a reinforcement learning-based framework for learning latent representations of an agent's policy, where the ego agent identifies the relationship between its behavior and the other agent's future strategy.

Learning Adaptive Language Interfaces through Decomposition

no code implementations EMNLP (intexsempar) 2020 Siddharth Karamcheti, Dorsa Sadigh, Percy Liang

Our goal is to create an interactive natural language interface that efficiently and reliably learns from users to complete tasks in simulated robotics settings.

Semantic Parsing

Multi-Agent Safe Planning with Gaussian Processes

no code implementations 10 Aug 2020 Zheqing Zhu, Erdem Biyik, Dorsa Sadigh

Multi-agent safe systems have become an increasingly important area of study as we can now easily have multiple AI-powered systems operating together.

Gaussian Processes

Learning User-Preferred Mappings for Intuitive Robot Control

no code implementations 22 Jul 2020 Mengxi Li, Dylan P. Losey, Jeannette Bohg, Dorsa Sadigh

Existing approaches to teleoperation typically assume a one-size-fits-all approach, where the designers pre-define a mapping between human inputs and robot actions, and every user must adapt to this mapping over repeated interactions.

Robot Manipulation

Reinforcement Learning based Control of Imitative Policies for Near-Accident Driving

1 code implementation 1 Jul 2020 Zhangjie Cao, Erdem Biyik, Woodrow Z. Wang, Allan Raventos, Adrien Gaidon, Guy Rosman, Dorsa Sadigh

To address driving in near-accident scenarios, we propose a hierarchical reinforcement and imitation learning (H-ReIL) approach that consists of low-level policies learned by IL for discrete driving modes, and a high-level policy learned by RL that switches between different driving modes.

Autonomous Driving Imitation Learning +2

Learning Reward Functions from Diverse Sources of Human Feedback: Optimally Integrating Demonstrations and Preferences

no code implementations 24 Jun 2020 Erdem Biyik, Dylan P. Losey, Malayandi Palan, Nicholas C. Landolfi, Gleb Shevchuk, Dorsa Sadigh

As designing reward functions can be extremely challenging, a more promising approach is to directly learn reward functions from human teachers.

Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal Constraints

1 code implementation 27 May 2020 Shushman Choudhury, Jayesh K. Gupta, Mykel J. Kochenderfer, Dorsa Sadigh, Jeannette Bohg

We consider the problem of dynamically allocating tasks to multiple agents under time window constraints and task completion uncertainty.

Decision Making Decision Making Under Uncertainty +2

Active Preference-Based Gaussian Process Regression for Reward Learning

1 code implementation 6 May 2020 Erdem Biyik, Nicolas Huynh, Mykel J. Kochenderfer, Dorsa Sadigh

Our results in simulations and a user study suggest that our approach can efficiently learn expressive reward functions for robotics tasks.

regression

BLEU Neighbors: A Reference-less Approach to Automatic Evaluation

no code implementations EMNLP (Eval4NLP) 2020 Kawin Ethayarajh, Dorsa Sadigh

To this end, we propose BLEU Neighbors, a nearest neighbors model for estimating language quality by using the BLEU score as a kernel function.

Diversity Machine Translation +3

Exchangeable Input Representations for Reinforcement Learning

no code implementations 19 Mar 2020 John Mern, Dorsa Sadigh, Mykel J. Kochenderfer

We show that our proposed representation results in an input space that is a factor of $m!$ smaller for inputs of $m$ objects.

Deep Reinforcement Learning Policy Gradient Methods +2

When Humans Aren't Optimal: Robots that Collaborate with Risk-Aware Humans

no code implementations 13 Jan 2020 Minae Kwon, Erdem Biyik, Aditi Talati, Karan Bhasin, Dylan P. Losey, Dorsa Sadigh

Overall, we extend existing rational human models so that collaborative robots can anticipate and plan around suboptimal human behavior during HRI.

Learning from My Partner's Actions: Roles in Decentralized Robot Teams

no code implementations 16 Oct 2019 Dylan P. Losey, Mengxi Li, Jeannette Bohg, Dorsa Sadigh

When teams of robots collaborate to complete a task, communication is often necessary.

Controlling Assistive Robots with Learned Latent Actions

no code implementations 20 Sep 2019 Dylan P. Losey, Krishnan Srinivasan, Ajay Mandlekar, Animesh Garg, Dorsa Sadigh

Our insight is that we can make assistive robots easier for humans to control by leveraging latent actions.

Robotics

Learning Reward Functions by Integrating Human Demonstrations and Preferences

1 code implementation 21 Jun 2019 Malayandi Palan, Nicholas C. Landolfi, Gleb Shevchuk, Dorsa Sadigh

In a user study, we compare our method to a standard IRL method; we find that users rated the robot trained with DemPref as being more successful at learning their desired behavior, and preferred to use the DemPref system (over IRL) to train the robot.

Reinforcement Learning

Batch Active Learning Using Determinantal Point Processes

1 code implementation 19 Jun 2019 Erdem Biyik, Kenneth Wang, Nima Anari, Dorsa Sadigh

While active learning methods attempt to tackle this issue by labeling only the data samples that give high information, they generally suffer from large computational costs and are impractical in settings where data can be collected in parallel.

Active Learning Diversity +1

Object Exchangeability in Reinforcement Learning: Extended Abstract

no code implementations 7 May 2019 John Mern, Dorsa Sadigh, Mykel Kochenderfer

Although deep reinforcement learning has advanced significantly over the past several years, sample efficiency remains a major challenge.

Deep Reinforcement Learning Object +3

Unsupervised Visuomotor Control through Distributional Planning Networks

1 code implementation 14 Feb 2019 Tianhe Yu, Gleb Shevchuk, Dorsa Sadigh, Chelsea Finn

While reinforcement learning (RL) has the potential to enable robots to autonomously acquire a wide range of skills, in practice, RL usually requires manual, per-task engineering of reward functions, especially in real world settings where aspects of the environment needed to compute progress are not directly accessible.

reinforcement-learning Reinforcement Learning +1

Hierarchical Game-Theoretic Planning for Autonomous Vehicles

no code implementations 13 Oct 2018 Jaime F. Fisac, Eli Bronstein, Elis Stefansson, Dorsa Sadigh, S. Shankar Sastry, Anca D. Dragan

This mutual dependence, best captured by dynamic game theory, creates a strong coupling between the vehicle's planning and its predictions of other drivers' behavior, and constitutes an open problem with direct implications on the safety and viability of autonomous driving technology.

Autonomous Driving Decision Making +1

Batch Active Preference-Based Learning of Reward Functions

1 code implementation 10 Oct 2018 Erdem Biyik, Dorsa Sadigh

Data generation and labeling are usually an expensive part of learning for robotics.

Active Learning

Multi-Agent Generative Adversarial Imitation Learning

1 code implementation NeurIPS 2018 Jiaming Song, Hongyu Ren, Dorsa Sadigh, Stefano Ermon

Imitation learning algorithms can be used to learn a policy from expert demonstrations without access to a reward signal.

Imitation Learning reinforcement-learning +2

Towards Verified Artificial Intelligence

no code implementations 27 Jun 2016 Sanjit A. Seshia, Dorsa Sadigh, S. Shankar Sastry

Verified artificial intelligence (AI) is the goal of designing AI-based systems that have strong, ideally provable, assurances of correctness with respect to mathematically-specified requirements.

Safe Control under Uncertainty

no code implementations 25 Oct 2015 Dorsa Sadigh, Ashish Kapoor

In this paper, we propose a new logic, Probabilistic Signal Temporal Logic (PrSTL), as an expressive language to define the stochastic properties, and enforce probabilistic guarantees on them.

Autonomous Vehicles
