Search Results for author: Dorsa Sadigh

Found 56 papers, 23 papers with code

Training and Inference on Any-Order Autoregressive Models the Right Way

no code implementations 26 May 2022 Andy Shih, Dorsa Sadigh, Stefano Ermon

Conditional inference on arbitrary subsets of variables is a core problem in probabilistic inference with important applications such as masked language modeling and image inpainting.

Image Inpainting · Language Modelling · +1

Leveraging Smooth Attention Prior for Multi-Agent Trajectory Prediction

no code implementations 8 Mar 2022 Zhangjie Cao, Erdem Biyik, Guy Rosman, Dorsa Sadigh

At any given time, to forecast a reasonable future trajectory, each agent needs to attend to its interactions with only a small group of the most relevant agents, rather than paying attention to all other agents.

Trajectory Prediction

Weakly Supervised Correspondence Learning

no code implementations 2 Mar 2022 Zihan Wang, Zhangjie Cao, Yilun Hao, Dorsa Sadigh

Correspondence learning, a fundamental problem in robotics, aims to learn a mapping between the state-action pairs of agents with different dynamics or embodiments.

Learning from Imperfect Demonstrations via Adversarial Confidence Transfer

no code implementations 7 Feb 2022 Zhangjie Cao, Zihan Wang, Dorsa Sadigh

Existing learning from demonstration algorithms usually assume access to expert demonstrations.

Imitation Learning by Estimating Expertise of Demonstrators

1 code implementation 2 Feb 2022 Mark Beliaev, Andy Shih, Stefano Ermon, Dorsa Sadigh, Ramtin Pedarsani

In this work, we show that unsupervised learning over demonstrator expertise can lead to a consistent boost in the performance of imitation learning algorithms.

Continuous Control · Imitation Learning

Conditional Imitation Learning for Multi-Agent Games

no code implementations 5 Jan 2022 Andy Shih, Stefano Ermon, Dorsa Sadigh

In this work, we study the problem of conditional multi-agent imitation learning, where we have access to joint trajectory demonstrations at training time, and we must interact with and adapt to new partners at test time.

Imitation Learning · Tensor Decomposition

PantheonRL: A MARL Library for Dynamic Training Interactions

1 code implementation 13 Dec 2021 Bidipta Sarkar, Aditi Talati, Andy Shih, Dorsa Sadigh

We present PantheonRL, a multiagent reinforcement learning software package for dynamic training interactions such as round-robin, adaptive, and ad-hoc training.

reinforcement-learning
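The round-robin training interaction described above can be sketched in a few lines; the pairing function here is a generic illustration, not the actual PantheonRL API:

```python
import itertools

def round_robin_pairings(agents):
    # Round-robin training: every ordered (ego, partner) pair trains
    # together once per round, so each agent interacts with every other
    # agent in both roles.
    return list(itertools.permutations(agents, 2))

pairs = round_robin_pairings(["A", "B", "C"])
# 3 agents yield 6 ordered ego/partner pairings per round
```

Adaptive and ad-hoc schemes would replace this fixed schedule with pairings chosen from training statistics or sampled at test time.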

HyperSPNs: Compact and Expressive Probabilistic Circuits

1 code implementation NeurIPS 2021 Andy Shih, Dorsa Sadigh, Stefano Ermon

Probabilistic circuits (PCs) are a family of generative models which allows for the computation of exact likelihoods and marginals of its probability distributions.

Density Estimation
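The defining property of probabilistic circuits, exact likelihoods and marginals in a single feed-forward pass, can be illustrated with a toy sum-product network (a generic sketch of the model family, not the HyperSPNs architecture itself):

```python
class Leaf:
    def __init__(self, var, p):
        self.var, self.p = var, p
    def value(self, assignment):
        # A variable missing from the assignment is marginalized out:
        # summing a Bernoulli leaf over both states gives 1.
        if self.var not in assignment:
            return 1.0
        return self.p if assignment[self.var] == 1 else 1.0 - self.p

class Product:
    def __init__(self, children):
        self.children = children
    def value(self, assignment):
        out = 1.0
        for child in self.children:
            out *= child.value(assignment)
        return out

class Sum:
    def __init__(self, weighted_children):  # list of (weight, child)
        self.weighted_children = weighted_children
    def value(self, assignment):
        return sum(w * c.value(assignment) for w, c in self.weighted_children)

# A mixture of two product distributions over binary variables A and B.
spn = Sum([
    (0.4, Product([Leaf("A", 0.9), Leaf("B", 0.2)])),
    (0.6, Product([Leaf("A", 0.1), Leaf("B", 0.7)])),
])

joint = spn.value({"A": 1, "B": 1})   # exact likelihood P(A=1, B=1)
marginal = spn.value({"A": 1})        # exact marginal P(A=1), same single pass
```

The same bottom-up pass answers both queries, which is exactly what makes density estimation with these models tractable.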

LILA: Language-Informed Latent Actions

1 code implementation 5 Nov 2021 Siddharth Karamcheti, Megha Srivastava, Percy Liang, Dorsa Sadigh

We introduce Language-Informed Latent Actions (LILA), a framework for learning natural language interfaces in the context of human-robot collaboration.

Imitation Learning

Learning Feasibility to Imitate Demonstrators with Different Dynamics

1 code implementation 28 Oct 2021 Zhangjie Cao, Yilun Hao, Mengxi Li, Dorsa Sadigh

The goal of learning from demonstrations is to learn a policy for an agent (imitator) by mimicking the behavior in the demonstrations.

Confidence-Aware Imitation Learning from Demonstrations with Varying Optimality

2 code implementations NeurIPS 2021 Songyuan Zhang, Zhangjie Cao, Dorsa Sadigh, Yanan Sui

Our results show that CAIL significantly outperforms other imitation learning methods from demonstrations with varying optimality.

Imitation Learning

Influencing Towards Stable Multi-Agent Interactions

no code implementations 5 Oct 2021 Woodrow Z. Wang, Andy Shih, Annie Xie, Dorsa Sadigh

Instead of reactively adapting to the other agent's (opponent or partner) behavior, we propose an algorithm to proactively influence the other agent's strategy to stabilize -- which can restrain the non-stationarity caused by the other agent.

Autonomous Driving

Partner-Aware Algorithms in Decentralized Cooperative Bandit Teams

no code implementations 2 Oct 2021 Erdem Biyik, Anusha Lalitha, Rajarshi Saha, Andrea Goldsmith, Dorsa Sadigh

Our results show that the proposed partner-aware strategy outperforms other known methods, and our human subject studies suggest humans prefer to collaborate with AI agents implementing our partner-aware strategy.

Decision Making

Learning Reward Functions from Scale Feedback

1 code implementation 1 Oct 2021 Nils Wilde, Erdem Biyik, Dorsa Sadigh, Stephen L. Smith

Today's robots are increasingly interacting with people and need to efficiently learn inexperienced users' preferences.

Learning Multimodal Rewards from Rankings

no code implementations 27 Sep 2021 Vivek Myers, Erdem Biyik, Nima Anari, Dorsa Sadigh

However, expert feedback is often assumed to be drawn from an underlying unimodal reward function.

On the Opportunities and Risks of Foundation Models

no code implementations 16 Aug 2021 Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.

Transfer Learning

APReL: A Library for Active Preference-based Reward Learning Algorithms

1 code implementation 16 Aug 2021 Erdem Biyik, Aditi Talati, Dorsa Sadigh

Reward learning is a fundamental problem in human-robot interaction to have robots that operate in alignment with what their human user wants.

Targeted Data Acquisition for Evolving Negotiation Agents

no code implementations 14 Jun 2021 Minae Kwon, Siddharth Karamcheti, Mariano-Florentino Cuellar, Dorsa Sadigh

This trend additionally holds when comparing agents using our targeted data acquisition framework to variants of agents trained with a mix of supervised learning and reinforcement learning, or to agents using tailored reward functions that explicitly optimize for utility and Pareto-optimality.

reinforcement-learning

Emergent Prosociality in Multi-Agent Games Through Gifting

no code implementations 13 May 2021 Woodrow Z. Wang, Mark Beliaev, Erdem Biyik, Daniel A. Lazar, Ramtin Pedarsani, Dorsa Sadigh

Coordination is often critical to forming prosocial behaviors -- behaviors that increase the overall sum of rewards received by all agents in a multi-agent game.

Learning Visually Guided Latent Actions for Assistive Teleoperation

1 code implementation 2 May 2021 Siddharth Karamcheti, Albert J. Zhai, Dylan P. Losey, Dorsa Sadigh

In this work, we develop assistive robots that condition their latent embeddings on visual inputs.

On the Critical Role of Conventions in Adaptive Human-AI Collaboration

1 code implementation ICLR 2021 Andy Shih, Arjun Sawhney, Jovana Kondic, Stefano Ermon, Dorsa Sadigh

Humans can quickly adapt to new partners in collaborative tasks (e.g. playing basketball), because they understand which fundamental skills of the task (e.g. how to dribble, how to shoot) carry over across new partners.

Learning from Imperfect Demonstrations from Agents with Varying Dynamics

1 code implementation 10 Mar 2021 Zhangjie Cao, Dorsa Sadigh

The proposed score enables learning from more informative demonstrations, and disregarding the less relevant demonstrations.

Imitation Learning

ELLA: Exploration through Learned Language Abstraction

1 code implementation NeurIPS 2021 Suvir Mirchandani, Siddharth Karamcheti, Dorsa Sadigh

Building agents capable of understanding language instructions is critical to effective and robust human-AI collaboration.

Transfer Reinforcement Learning across Homotopy Classes

no code implementations 10 Feb 2021 Zhangjie Cao, Minae Kwon, Dorsa Sadigh

The ability for robots to transfer their learned knowledge to new tasks -- where data is scarce -- is a fundamental challenge for successful robot learning.

Transfer Reinforcement Learning · Robotics

Incentivizing Routing Choices for Safe and Efficient Transportation in the Face of the COVID-19 Pandemic

no code implementations 28 Dec 2020 Mark Beliaev, Erdem Biyik, Daniel A. Lazar, Woodrow Z. Wang, Dorsa Sadigh, Ramtin Pedarsani

In turn, significant increases in traffic congestion are expected, since people are likely to prefer using their own vehicles or taxis as opposed to riskier and more crowded options such as the railway.

Learning Latent Representations to Influence Multi-Agent Interaction

no code implementations 12 Nov 2020 Annie Xie, Dylan P. Losey, Ryan Tolsma, Chelsea Finn, Dorsa Sadigh

We propose a reinforcement learning-based framework for learning latent representations of an agent's policy, where the ego agent identifies the relationship between its behavior and the other agent's future strategy.

reinforcement-learning

Learning Adaptive Language Interfaces through Decomposition

no code implementations EMNLP (intexsempar) 2020 Siddharth Karamcheti, Dorsa Sadigh, Percy Liang

Our goal is to create an interactive natural language interface that efficiently and reliably learns from users to complete tasks in simulated robotics settings.

Semantic Parsing

Multi-Agent Safe Planning with Gaussian Processes

no code implementations 10 Aug 2020 Zheqing Zhu, Erdem Biyik, Dorsa Sadigh

Multi-agent safe systems have become an increasingly important area of study as we can now easily have multiple AI-powered systems operating together.

Gaussian Processes

Learning User-Preferred Mappings for Intuitive Robot Control

no code implementations 22 Jul 2020 Mengxi Li, Dylan P. Losey, Jeannette Bohg, Dorsa Sadigh

Existing approaches to teleoperation typically assume a one-size-fits-all approach, where the designers pre-define a mapping between human inputs and robot actions, and every user must adapt to this mapping over repeated interactions.

Reinforcement Learning based Control of Imitative Policies for Near-Accident Driving

1 code implementation 1 Jul 2020 Zhangjie Cao, Erdem Biyik, Woodrow Z. Wang, Allan Raventos, Adrien Gaidon, Guy Rosman, Dorsa Sadigh

To address driving in near-accident scenarios, we propose a hierarchical reinforcement and imitation learning (H-ReIL) approach that consists of low-level policies learned by IL for discrete driving modes, and a high-level policy learned by RL that switches between different driving modes.

Autonomous Driving · Imitation Learning · +1
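The hierarchical structure described above can be sketched roughly as follows; the mode policies and the switching rule are made-up placeholders standing in for the learned IL and RL components, not the policies from the paper:

```python
def aggressive_mode(obs):
    # Stand-in for an IL-trained policy for the "aggressive" driving mode.
    return {"accel": 1.0, "steer": 0.0}

def cautious_mode(obs):
    # Stand-in for an IL-trained policy for the "cautious" driving mode.
    return {"accel": -0.5, "steer": 0.0}

MODES = [aggressive_mode, cautious_mode]

def high_level_policy(obs):
    # Placeholder for the RL-trained switcher: choose the cautious mode
    # when an obstacle is close, and the aggressive mode otherwise.
    return 1 if obs["distance_to_obstacle"] < 10.0 else 0

def h_reil_step(obs):
    mode = high_level_policy(obs)   # high-level RL picks a driving mode
    return MODES[mode](obs)         # low-level IL policy emits the control
```

The division of labor is the point: discrete, interpretable modes are cheap to imitate, while RL only has to learn when to switch between them.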

Learning Reward Functions from Diverse Sources of Human Feedback: Optimally Integrating Demonstrations and Preferences

no code implementations 24 Jun 2020 Erdem Biyik, Dylan P. Losey, Malayandi Palan, Nicholas C. Landolfi, Gleb Shevchuk, Dorsa Sadigh

As designing reward functions can be extremely challenging, a more promising approach is to directly learn reward functions from human teachers.

Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal Constraints

1 code implementation 27 May 2020 Shushman Choudhury, Jayesh K. Gupta, Mykel J. Kochenderfer, Dorsa Sadigh, Jeannette Bohg

We consider the problem of dynamically allocating tasks to multiple agents under time window constraints and task completion uncertainty.

Decision Making · Decision Making Under Uncertainty

Active Preference-Based Gaussian Process Regression for Reward Learning

1 code implementation 6 May 2020 Erdem Biyik, Nicolas Huynh, Mykel J. Kochenderfer, Dorsa Sadigh

Our results in simulations and a user study suggest that our approach can efficiently learn expressive reward functions for robotics tasks.

BLEU Neighbors: A Reference-less Approach to Automatic Evaluation

no code implementations EMNLP (Eval4NLP) 2020 Kawin Ethayarajh, Dorsa Sadigh

To this end, we propose BLEU Neighbors, a nearest neighbors model for estimating language quality by using the BLEU score as a kernel function.

Machine Translation · Text Generation · +1
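The nearest-neighbors idea can be sketched as follows, using a simplified two-gram BLEU rather than the exact formulation in the paper: a candidate's quality estimate is its mean BLEU score against its k nearest neighbors in a corpus, with BLEU itself serving as the similarity kernel.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=2):
    # Clipped n-gram precisions combined by a geometric mean,
    # with the usual brevity penalty for short candidates.
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        c_counts, r_counts = ngrams(cand, n), ngrams(ref, n)
        overlap = sum(min(count, r_counts[g]) for g, count in c_counts.items())
        precisions.append(overlap / max(1, sum(c_counts.values())))
    if min(precisions) == 0:
        return 0.0
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

def bleu_neighbors_score(candidate, corpus, k=3):
    # Quality estimate: mean BLEU against the k nearest neighbors,
    # where "nearest" is measured by BLEU itself (the kernel).
    sims = sorted((bleu(candidate, sentence) for sentence in corpus), reverse=True)
    return sum(sims[:k]) / k
```

Because no gold reference for the candidate is needed, only a corpus of acceptable sentences, the approach is reference-less in the sense the title uses.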

Exchangeable Input Representations for Reinforcement Learning

no code implementations 19 Mar 2020 John Mern, Dorsa Sadigh, Mykel J. Kochenderfer

We show that our proposed representation results in an input space that is a factor of $m!$ smaller for inputs of $m$ objects.

Policy Gradient Methods · reinforcement-learning
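The $m!$ reduction comes from mapping every permutation of an object set to one canonical ordering. A minimal sketch, with lexicographic sorting assumed here as one possible canonicalization:

```python
import itertools

def canonical(objects):
    # Sort object feature tuples lexicographically: every permutation of
    # the same m objects collapses to one canonical representation, so the
    # effective input space shrinks by a factor of m!.
    return tuple(sorted(objects))

objects = [(2.0, 0.5), (1.0, 3.0), (0.0, 1.0)]
representations = {canonical(perm) for perm in itertools.permutations(objects)}
# all 3! = 6 orderings map to the same canonical input
```

Any permutation-invariant map (e.g. a sum or max over per-object embeddings) achieves the same collapse; sorting is simply the easiest to verify.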

When Humans Aren't Optimal: Robots that Collaborate with Risk-Aware Humans

no code implementations 13 Jan 2020 Minae Kwon, Erdem Biyik, Aditi Talati, Karan Bhasin, Dylan P. Losey, Dorsa Sadigh

Overall, we extend existing rational human models so that collaborative robots can anticipate and plan around suboptimal human behavior during HRI.

Learning from My Partner's Actions: Roles in Decentralized Robot Teams

no code implementations 16 Oct 2019 Dylan P. Losey, Mengxi Li, Jeannette Bohg, Dorsa Sadigh

When teams of robots collaborate to complete a task, communication is often necessary.

Controlling Assistive Robots with Learned Latent Actions

no code implementations 20 Sep 2019 Dylan P. Losey, Krishnan Srinivasan, Ajay Mandlekar, Animesh Garg, Dorsa Sadigh

Our insight is that we can make assistive robots easier for humans to control by leveraging latent actions.

Robotics

Learning Reward Functions by Integrating Human Demonstrations and Preferences

1 code implementation 21 Jun 2019 Malayandi Palan, Nicholas C. Landolfi, Gleb Shevchuk, Dorsa Sadigh

In a user study, we compare our method to a standard IRL method; we find that users rated the robot trained with DemPref as being more successful at learning their desired behavior, and preferred to use the DemPref system (over IRL) to train the robot.

Batch Active Learning Using Determinantal Point Processes

1 code implementation 19 Jun 2019 Erdem Biyik, Kenneth Wang, Nima Anari, Dorsa Sadigh

While active learning methods attempt to tackle this issue by labeling only the data samples that give high information, they generally suffer from large computational costs and are impractical in settings where data can be collected in parallel.

Active Learning · Point Processes

Object Exchangeability in Reinforcement Learning: Extended Abstract

no code implementations 7 May 2019 John Mern, Dorsa Sadigh, Mykel Kochenderfer

Although deep reinforcement learning has advanced significantly over the past several years, sample efficiency remains a major challenge.

Policy Gradient Methods · reinforcement-learning

Unsupervised Visuomotor Control through Distributional Planning Networks

1 code implementation 14 Feb 2019 Tianhe Yu, Gleb Shevchuk, Dorsa Sadigh, Chelsea Finn

While reinforcement learning (RL) has the potential to enable robots to autonomously acquire a wide range of skills, in practice, RL usually requires manual, per-task engineering of reward functions, especially in real world settings where aspects of the environment needed to compute progress are not directly accessible.

reinforcement-learning

Hierarchical Game-Theoretic Planning for Autonomous Vehicles

no code implementations 13 Oct 2018 Jaime F. Fisac, Eli Bronstein, Elis Stefansson, Dorsa Sadigh, S. Shankar Sastry, Anca D. Dragan

This mutual dependence, best captured by dynamic game theory, creates a strong coupling between the vehicle's planning and its predictions of other drivers' behavior, and constitutes an open problem with direct implications on the safety and viability of autonomous driving technology.

Autonomous Driving · Decision Making · +1

Batch Active Preference-Based Learning of Reward Functions

1 code implementation 10 Oct 2018 Erdem Biyik, Dorsa Sadigh

Data generation and labeling are usually an expensive part of learning for robotics.

Active Learning

Multi-Agent Generative Adversarial Imitation Learning

1 code implementation NeurIPS 2018 Jiaming Song, Hongyu Ren, Dorsa Sadigh, Stefano Ermon

Imitation learning algorithms can be used to learn a policy from expert demonstrations without access to a reward signal.

Imitation Learning · reinforcement-learning

Towards Verified Artificial Intelligence

no code implementations 27 Jun 2016 Sanjit A. Seshia, Dorsa Sadigh, S. Shankar Sastry

Verified artificial intelligence (AI) is the goal of designing AI-based systems that have strong, ideally provable, assurances of correctness with respect to mathematically specified requirements.

Safe Control under Uncertainty

no code implementations 25 Oct 2015 Dorsa Sadigh, Ashish Kapoor

In this paper, we propose a new logic, Probabilistic Signal Temporal Logic (PrSTL), as an expressive language to define the stochastic properties, and enforce probabilistic guarantees on them.

Autonomous Vehicles
