no code implementations • 30 Oct 2024 • Qinqing Zheng, Mikael Henaff, Amy Zhang, Aditya Grover, Brandon Amos
Our approach annotates the agent's collected experience via an asynchronous LLM server, which is then distilled into an intrinsic reward model.
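The annotate-then-distill pipeline this entry describes can be sketched in miniature. Everything below is an illustrative stand-in, not the paper's implementation: the mock LLM scorer, the `state_key` field, and averaging-based distillation are all assumptions.

```python
from collections import defaultdict

def mock_llm_annotate(observation):
    # Stand-in for the asynchronous LLM server: score each observation.
    return 1.0 if observation.get("novel") else 0.0

def distill_reward_model(annotated_experience):
    # Distill per-state annotations into an average intrinsic reward.
    totals, counts = defaultdict(float), defaultdict(int)
    for obs, score in annotated_experience:
        totals[obs["state_key"]] += score
        counts[obs["state_key"]] += 1
    return {k: totals[k] / counts[k] for k in totals}

observations = [
    {"state_key": "door", "novel": True},
    {"state_key": "door", "novel": False},
    {"state_key": "wall", "novel": False},
]
annotated = [(obs, mock_llm_annotate(obs)) for obs in observations]
reward_model = distill_reward_model(annotated)
# reward_model gives "door" an intrinsic reward of 0.5 and "wall" 0.0.
```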
no code implementations • 24 Oct 2024 • Zizhao Wang, Jiaheng Hu, Caleb Chuck, Stephen Chen, Roberto Martín-Martín, Amy Zhang, Scott Niekum, Peter Stone
However, in complex environments with many state factors (e.g., household environments with many objects), learning skills that cover all possible states is impossible, and naively encouraging state diversity often leads to simple skills that are not ideal for solving downstream tasks.
1 code implementation • 3 Oct 2024 • Alexander Levine, Peter Stone, Amy Zhang
Efroni et al. (2022b) have shown that this is possible with a sample complexity that depends only on the size of the controllable latent space, and not on the size of the noise factor.
no code implementations • 16 Aug 2024 • Mohamad Fares El Hajj Chehade, Amrit Singh Bedi, Amy Zhang, Hao Zhu
To the best of our knowledge, this is the first work to explore the optimization of such a generalized risk notion within the context of transfer RL.
no code implementations • 29 Jul 2024 • Liyuan Mao, Haoran Xu, Xianyuan Zhan, Weinan Zhang, Amy Zhang
In this work, we show that DICE-based methods can be viewed as a transformation from the behavior distribution to the optimal policy distribution.
2 code implementations • 25 Jun 2024 • Philippe Hansen-Estruch, Sriram Vishwanath, Amy Zhang, Manan Tomar
At the core of both successful generative and self-supervised representation learning models there is a reconstruction objective that incorporates some form of image corruption.
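A toy illustration of a corruption-plus-reconstruction objective of the kind this entry refers to; the flat patch representation, masking scheme, and function names are assumptions for the sketch, not the paper's setup.

```python
import random

def mask_patches(patches, mask_ratio=0.5, seed=0):
    # Corrupt a flat list of patch values by zeroing a random subset;
    # a reconstruction objective asks the model to recover them.
    rng = random.Random(seed)
    n_mask = int(len(patches) * mask_ratio)
    masked_idx = rng.sample(range(len(patches)), n_mask)
    corrupted = list(patches)
    for i in masked_idx:
        corrupted[i] = 0.0
    return corrupted, masked_idx

def reconstruction_loss(pred, target, masked_idx):
    # Mean squared error measured on the masked patches only.
    return sum((pred[i] - target[i]) ** 2 for i in masked_idx) / len(masked_idx)

image = [0.9, 0.1, 0.4, 0.7]
corrupted, masked_idx = mask_patches(image)
```

A perfect reconstruction of the original patches drives the loss to zero, which is the signal the representation learner trains on.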
no code implementations • 13 Jun 2024 • Harshit Sikchi, Caleb Chuck, Amy Zhang, Scott Niekum
DILO reduces the learning from observations problem to that of simply learning an actor and a critic, bearing similar complexity to vanilla offline RL.
no code implementations • 6 May 2024 • Caleb Chuck, Carl Qi, Michael J. Munje, Shuozhe Li, Max Rudolph, Chang Shi, Siddhant Agarwal, Harshit Sikchi, Abhinav Peri, Sarthak Dayal, Evan Kuo, Kavan Mehta, Anthony Wang, Peter Stone, Amy Zhang, Scott Niekum
Reinforcement Learning is a promising tool for learning complex policies even in fast-moving and object-interactive domains where human teleoperation or hard-coded policies might fail.
no code implementations • 16 Apr 2024 • Caleb Chuck, Sankaran Vaidyanathan, Stephen Giguere, Amy Zhang, David Jensen, Scott Niekum
This paper introduces functional actual cause (FAC), a framework that uses context-specific independencies in the environment to restrict the set of actual causes.
no code implementations • 25 Mar 2024 • Max Rudolph, Caleb Chuck, Kevin Black, Misha Lvovsky, Scott Niekum, Amy Zhang
Robust reinforcement learning agents using high-dimensional observations must be able to identify relevant state features amidst many exogenous distractors.
1 code implementation • 18 Mar 2024 • Alexander Levine, Peter Stone, Amy Zhang
In this work, we consider the Ex-BMDP model, first proposed by Efroni et al. (2022), which formalizes control problems where observations can be factorized into an action-dependent latent state which evolves deterministically, and action-independent time-correlated noise.
no code implementations • 5 Feb 2024 • Zihan Ding, Amy Zhang, Yuandong Tian, Qinqing Zheng
We introduce Diffusion World Model (DWM), a conditional diffusion model capable of predicting multistep future states and rewards concurrently.
2 code implementations • 30 Jan 2024 • Tyler Ingebrand, Amy Zhang, Ufuk Topcu
Although reinforcement learning (RL) can solve many challenging sequential decision making problems, achieving zero-shot transfer across related tasks remains a challenge.
1 code implementation • 28 Nov 2023 • Brett Barkley, Amy Zhang, David Fridovich-Keil
We observe that utilizing the structure of time reversal in an MDP allows every environment transition experienced by an agent to be transformed into a feasible reverse-time transition, effectively doubling the number of experiences in the environment.
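The doubling described above can be sketched naively as swapping the endpoints of each transition. This is only a minimal illustration; the paper's actual construction of feasible reverse-time transitions may rely on learned reverse dynamics rather than a direct swap.

```python
def time_reversal_augment(transitions):
    # For each (s, a, s') transition, also emit a reverse-time
    # transition (s', a, s), doubling the experience count.
    augmented = []
    for s, a, s_next in transitions:
        augmented.append((s, a, s_next))
        augmented.append((s_next, a, s))
    return augmented

transitions = [("s0", "a", "s1"), ("s1", "b", "s2")]
augmented = time_reversal_augment(transitions)
# 2 transitions become 4: each forward transition plus its reverse.
```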
1 code implementation • NeurIPS 2023 • Qiyang Li, Jason Zhang, Dibya Ghosh, Amy Zhang, Sergey Levine
Learning to solve tasks from a sparse reward signal is a major challenge for standard reinforcement learning (RL) algorithms.
no code implementations • 3 Nov 2023 • Harshit Sikchi, Rohan Chitnis, Ahmed Touati, Alborz Geramifard, Amy Zhang, Scott Niekum
Offline Goal-Conditioned Reinforcement Learning (GCRL) is tasked with learning to achieve multiple goals in an environment purely from offline datasets using sparse reward functions.
2 code implementations • 19 Oct 2023 • Rui Yang, Han Zhong, Jiawei Xu, Amy Zhang, Chongjie Zhang, Lei Han, Tong Zhang
Offline reinforcement learning (RL) presents a promising approach for learning reinforced policies from offline datasets without the need for costly or unsafe interactions with the environment.
no code implementations • 10 Oct 2023 • Siddhant Agarwal, Ishan Durugkar, Peter Stone, Amy Zhang
We further introduce an entropy-regularized policy optimization objective, which we call $state$-MaxEnt RL (or $s$-MaxEnt RL), as a special case of our objective.
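The state-visitation entropy that a state-entropy objective rewards can be computed empirically. A minimal sketch with assumed discrete states, illustrating the quantity rather than the paper's optimization method:

```python
import math
from collections import Counter

def state_visitation_entropy(states):
    # Empirical entropy of the state-visitation distribution, i.e. the
    # quantity an s-MaxEnt-style objective encourages a policy to raise.
    counts = Counter(states)
    n = len(states)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

uniform = state_visitation_entropy(["a", "b", "a", "b"])  # log(2)
degenerate = state_visitation_entropy(["a", "a", "a"])    # 0.0
```

A policy that spreads its visits uniformly maximizes this entropy; one that collapses onto a single state drives it to zero.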
1 code implementation • 29 Sep 2023 • Martin Klissarov, Pierluca D'Oro, Shagun Sodhani, Roberta Raileanu, Pierre-Luc Bacon, Pascal Vincent, Amy Zhang, Mikael Henaff
Exploring rich environments and evaluating one's actions without prior knowledge is immensely challenging.
1 code implementation • 15 Aug 2023 • Andre Ye, Quan Ze Chen, Amy Zhang
However, as these annotations cannot represent an individual annotator's uncertainty, models trained on them produce uncertainty maps that are difficult to interpret.
no code implementations • 28 Jun 2023 • Aditya Mohan, Amy Zhang, Marius Lindauer
We amalgamate these diverse methodologies under a unified framework, shedding light on the role of structure in the learning problem, and classify these methods into distinct patterns of incorporating structure.
no code implementations • 7 Jun 2023 • Anuj Mahajan, Amy Zhang
We focus on bisimulation metrics, which provide a powerful means for abstracting task relevant components of the observation and learning a succinct representation space for training the agent using reinforcement learning.
1 code implementation • 1 Jun 2023 • Yecheng Jason Ma, William Liang, Vaidehi Som, Vikash Kumar, Amy Zhang, Osbert Bastani, Dinesh Jayaraman
We present Language-Image Value learning (LIV), a unified objective for vision-language representation and reward learning from action-free videos with text annotations.
no code implementations • 26 May 2023 • Paul Barde, Jakob Foerster, Derek Nowrouzezahrai, Amy Zhang
Training multiple agents to coordinate is an essential problem with applications in robotics, game theory, economics, and social sciences.
1 code implementation • 23 May 2023 • Prajjwal Bhargava, Rohan Chitnis, Alborz Geramifard, Shagun Sodhani, Amy Zhang
Three popular algorithms for offline RL are Conservative Q-Learning (CQL), Behavior Cloning (BC), and Decision Transformer (DT), from the classes of Q-learning, imitation learning, and sequence modeling, respectively.
1 code implementation • 3 Apr 2023 • Tongzhou Wang, Antonio Torralba, Phillip Isola, Amy Zhang
In goal-reaching reinforcement learning (RL), the optimal value function has a particular geometry, called quasimetric structure.
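A quasimetric obeys the triangle inequality and zero self-distance but, unlike a metric, need not be symmetric; directed shortest-path costs are the standard example. The sketch below is illustrative of that structure only, not of the paper's learned quasimetric.

```python
import math
from itertools import product

def shortest_path_distances(n, edges):
    # Floyd-Warshall over a directed weighted graph; the resulting
    # distance table is a quasimetric on the n nodes.
    d = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
    for k, i, j in product(range(n), repeat=3):
        d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d

# A directed 3-cycle: going "backwards" along an edge costs more.
d = shortest_path_distances(3, [(0, 1, 1.0), (1, 2, 1.0), (2, 0, 1.0)])
# d[0][1] == 1 but d[1][0] == 2: asymmetric, yet the triangle
# inequality d[i][j] <= d[i][k] + d[k][j] still holds for all triples.
```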
no code implementations • 28 Mar 2023 • Andrew Szot, Amy Zhang, Dhruv Batra, Zsolt Kira, Franziska Meier
How well do reward functions learned with inverse reinforcement learning (IRL) generalize?
no code implementations • 20 Mar 2023 • Michael Chang, Alyssa L. Dayan, Franziska Meier, Thomas L. Griffiths, Sergey Levine, Amy Zhang
Object rearrangement is a challenge for embodied agents because solving these tasks requires generalizing across a combinatorially large set of configurations of entities and their locations.
no code implementations • 17 Mar 2023 • Qiaojie Zheng, Jiucai Zhang, Amy Zhang, Xiaoli Zhang
To address issues of unreliability and overconfidence, we introduce a confidence-aware model that predicts uncertainties together with gaze angle estimations.
1 code implementation • 16 Feb 2023 • Harshit Sikchi, Qinqing Zheng, Amy Zhang, Scott Niekum
For offline RL, our analysis frames a recent offline RL method, XQL, in the dual framework, and we further propose a new method, f-DVL, that provides alternative choices to the Gumbel regression loss and fixes the known training instability issue of XQL.
no code implementations • 21 Dec 2022 • Chris Lengerich, Gabriel Synnaeve, Amy Zhang, Hugh Leather, Kurt Shuster, François Charton, Charysse Redwood
Traditional approaches to RL have focused on learning decision policies directly from episodic decisions, while slowly and implicitly learning the semantics of compositional representations needed for generalization.
1 code implementation • 27 Oct 2022 • Edwin Zhang, Yujie Lu, William Wang, Amy Zhang
Training generalist agents is difficult across several axes, requiring us to deal with high-dimensional inputs (space), long horizons (time), and generalization to novel tasks.
1 code implementation • 3 Oct 2022 • Dinghuai Zhang, Aaron Courville, Yoshua Bengio, Qinqing Zheng, Amy Zhang, Ricky T. Q. Chen
While the maximum entropy (MaxEnt) reinforcement learning (RL) framework -- often touted for its exploration and robustness capabilities -- is usually motivated from a probabilistic perspective, the use of deep probabilistic models has not gained much traction in practice due to their inherent complexity.
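For a single state, the MaxEnt-optimal policy is a Boltzmann (softmax) distribution over Q-values, with the temperature set by the entropy coefficient. This is a standard MaxEnt RL fact sketched for illustration, not this paper's sampling-based method.

```python
import math

def soft_policy(q_values, alpha=1.0):
    # MaxEnt-optimal action distribution for one state:
    # pi(a) proportional to exp(Q(a) / alpha), computed stably
    # by subtracting the max logit before exponentiating.
    logits = [q / alpha for q in q_values]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

probs = soft_policy([1.0, 2.0], alpha=1.0)
# Higher-Q actions get more mass; a large alpha flattens the
# distribution toward uniform, trading reward for entropy.
```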
1 code implementation • 30 Sep 2022 • Yecheng Jason Ma, Shagun Sodhani, Dinesh Jayaraman, Osbert Bastani, Vikash Kumar, Amy Zhang
Given the inherent cost and scarcity of in-domain, task-specific robot data, learning from large, diverse, offline human videos has emerged as a promising path towards acquiring a generally useful visual representation for control; however, how these human videos can be used for general-purpose reward learning remains an open question.
no code implementations • 20 Jul 2022 • Jonathan Stray, Alon Halevy, Parisa Assar, Dylan Hadfield-Menell, Craig Boutilier, Amar Ashar, Lex Beattie, Michael Ekstrand, Claire Leibowicz, Connie Moon Sehat, Sara Johansen, Lianne Kerlin, David Vickrey, Spandana Singh, Sanne Vrijenhoek, Amy Zhang, McKane Andrus, Natali Helberger, Polina Proutskova, Tanushree Mitra, Nina Vasan
We collect a set of values that seem most relevant to recommender systems operating across different domains, then examine them from the perspectives of current industry practice, measurement, product design, and policy approaches.
1 code implementation • 30 Jun 2022 • Tongzhou Wang, Simon S. Du, Antonio Torralba, Phillip Isola, Amy Zhang, Yuandong Tian
The ability to separate signal from noise, and reason with clean abstractions, is critical to intelligence.
no code implementations • 27 Apr 2022 • Philippe Hansen-Estruch, Amy Zhang, Ashvin Nair, Patrick Yin, Sergey Levine
We learn this representation using a metric form of this abstraction, and show its ability to generalize to new goals in simulation manipulation tasks.
no code implementations • 14 Feb 2022 • Annie Xie, Shagun Sodhani, Chelsea Finn, Joelle Pineau, Amy Zhang
Reinforcement learning (RL) agents need to be robust to variations in safety-critical environments.
2 code implementations • 11 Feb 2022 • Qinqing Zheng, Amy Zhang, Aditya Grover
Recent work has shown that offline reinforcement learning (RL) can be formulated as a sequence modeling problem (Chen et al., 2021; Janner et al., 2021) and solved via approaches similar to large-scale language modeling.
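In the sequence-modeling view of offline RL, trajectories are typically conditioned on returns-to-go, the suffix sum of rewards from each timestep onward. A minimal sketch of that standard computation (the Decision Transformer convention, not this paper's full method):

```python
def returns_to_go(rewards, gamma=1.0):
    # Suffix sums of (optionally discounted) rewards, computed
    # in one backward pass over the trajectory.
    rtg = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg

rtg = returns_to_go([1.0, 2.0, 3.0])
# rtg == [6.0, 5.0, 3.0]: each entry is the return achievable
# from that timestep to the end of the trajectory.
```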
no code implementations • 18 Nov 2021 • Robert Kirk, Amy Zhang, Edward Grefenstette, Tim Rocktäschel
This survey is an overview of this nascent field.
no code implementations • 15 Nov 2021 • Manan Tomar, Utkarsh A. Mishra, Amy Zhang, Matthew E. Taylor
A wide range of methods have been proposed to enable efficient learning, leading to sample complexities similar to those in the full state setting.
no code implementations • 13 Oct 2021 • Shagun Sodhani, Franziska Meier, Joelle Pineau, Amy Zhang
In this work, we propose to examine this continual reinforcement learning setting through the block contextual MDP (BC-MDP) framework, which enables us to relax the assumption of stationarity.
no code implementations • 29 Sep 2021 • Manan Tomar, Amy Zhang, Matthew E. Taylor
The common representation acts as an implicit invariance objective to avoid the different spurious correlations captured by individual predictors.
no code implementations • NeurIPS 2021 • Dibya Ghosh, Jad Rahme, Aviral Kumar, Amy Zhang, Ryan P. Adams, Sergey Levine
Generalization is a central challenge for the deployment of reinforcement learning (RL) systems in the real world.
no code implementations • 22 Jun 2021 • Weitong Zhang, Jiafan He, Dongruo Zhou, Amy Zhang, Quanquan Gu
For the offline counterpart, ReLEX-LCB, we show that the algorithm can find the optimal policy if the representation class can cover the state-action space and achieves gap-dependent sample complexity.
3 code implementations • 20 Apr 2021 • Luis Pineda, Brandon Amos, Amy Zhang, Nathan O. Lambert, Roberto Calandra
MBRL-Lib is designed as a platform both for researchers, to easily develop, debug, and compare new algorithms, and for non-expert users, to lower the barrier to entry for deploying state-of-the-art algorithms.
no code implementations • ICLR Workshop SSL-RL 2021 • Clare Lyle, Amy Zhang, Minqi Jiang, Joelle Pineau, Yarin Gal
To address this, we present a robust exploration strategy which enables causal hypothesis-testing by interaction with the environment.
no code implementations • ICLR Workshop SSL-RL 2021 • Manan Tomar, Amy Zhang, Roberto Calandra, Matthew E. Taylor, Joelle Pineau
Unlike previous forms of state abstractions, a model-invariance state abstraction leverages causal sparsity over state variables.
2 code implementations • 11 Feb 2021 • Shagun Sodhani, Amy Zhang, Joelle Pineau
We posit that an efficient approach to knowledge transfer is through the use of multiple context-dependent, composable representations shared across a family of tasks.
no code implementations • ICLR 2021 • Amy Zhang, Rowan Thomas McAllister, Roberto Calandra, Yarin Gal, Sergey Levine
We study how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying either on domain knowledge or pixel-reconstruction.
no code implementations • 9 Dec 2020 • Xiaoxiao Li, Rabah Al-Zaidy, Amy Zhang, Stefan Baral, Le Bao, C. Lee Giles
Conclusions: In sum, the automated document classification procedure presented here could improve both the precision and efficiency of systematic reviews, as well as facilitate live reviews, where reviews are updated regularly.
1 code implementation • 3 Dec 2020 • Melissa Mozifian, Amy Zhang, Joelle Pineau, David Meger
The goal of this work is to address the recent success of domain randomization and data augmentation for the sim2real setting.
2 code implementations • ICLR 2021 • Amy Zhang, Shagun Sodhani, Khimya Khetarpal, Joelle Pineau
Further, we provide transfer and generalization bounds based on task and state similarity, along with sample complexity bounds that depend on the aggregate number of samples across tasks, rather than the number of tasks, a significant improvement over prior work that uses the same environment assumptions.
2 code implementations • 18 Jun 2020 • Amy Zhang, Rowan McAllister, Roberto Calandra, Yarin Gal, Sergey Levine
We study how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying either on domain knowledge or pixel-reconstruction.
1 code implementation • 7 May 2020 • Ge Yang, Amy Zhang, Ari S. Morcos, Joelle Pineau, Pieter Abbeel, Roberto Calandra
In this paper we introduce plan2vec, an unsupervised representation learning approach that is inspired by reinforcement learning.
1 code implementation • ICML 2020 • Amy Zhang, Clare Lyle, Shagun Sodhani, Angelos Filos, Marta Kwiatkowska, Joelle Pineau, Yarin Gal, Doina Precup
Generalization across environments is critical to the successful application of reinforcement learning algorithms to real-world challenges.
1 code implementation • 9 Mar 2020 • Ahmed Touati, Amy Zhang, Joelle Pineau, Pascal Vincent
Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO) are among the most successful policy gradient approaches in deep reinforcement learning (RL).
4 code implementations • 2 Mar 2020 • David Krueger, Ethan Caballero, Joern-Henrik Jacobsen, Amy Zhang, Jonathan Binas, Dinghuai Zhang, Remi Le Priol, Aaron Courville
Distributional shift is one of the major obstacles when transferring machine learning prediction systems from the lab to the real world.
4 code implementations • 2 Oct 2019 • Denis Yarats, Amy Zhang, Ilya Kostrikov, Brandon Amos, Joelle Pineau, Rob Fergus
A promising approach is to learn a latent representation together with the control policy.
no code implementations • 25 Jun 2019 • Amy Zhang, Zachary C. Lipton, Luis Pineda, Kamyar Azizzadenesheli, Anima Anandkumar, Laurent Itti, Joelle Pineau, Tommaso Furlanello
In this paper, we propose an algorithm to approximate causal states, which are the coarsest partition of the joint history of actions and observations in partially-observable Markov decision processes (POMDP).
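The defining property of causal states is that histories inducing the same predictive distribution over future observations are grouped together. The exact-grouping sketch below illustrates that property only; the paper approximates causal states with learned models rather than an explicit partition, and the toy predictor is an assumption.

```python
from collections import defaultdict

def causal_state_partition(histories, predict):
    # Group histories by their prediction for the next observation;
    # histories in the same group form one (exact) causal state.
    groups = defaultdict(list)
    for h in histories:
        groups[predict(h)].append(h)
    return sorted(groups.values())

# Toy predictor: only the last observation matters, so histories
# sharing a final observation collapse into a single causal state.
predict = lambda history: history[-1]
histories = [("o1", "o2"), ("o3", "o2"), ("o1", "o4")]
partition = causal_state_partition(histories, predict)
# Two causal states: {("o1","o2"), ("o3","o2")} and {("o1","o4")}.
```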
2 code implementations • 14 Nov 2018 • Amy Zhang, Yuxin Wu, Joelle Pineau
While current benchmark reinforcement learning (RL) tasks have been useful to drive progress in the field, they are in many ways poor substitutes for learning with real-world data.
no code implementations • 20 Jun 2018 • Amy Zhang, Nicolas Ballas, Joelle Pineau
The risks and perils of overfitting in machine learning are well known.
1 code implementation • 27 Apr 2018 • Amy Zhang, Harsh Satija, Joelle Pineau
Current reinforcement learning (RL) methods can successfully learn single tasks but often generalize poorly to modest perturbations in task domain or training procedure.
no code implementations • ICML 2018 • Amy Zhang, Adam Lerer, Sainbayar Sukhbaatar, Rob Fergus, Arthur Szlam
The tasks that an agent will need to solve often are not known during training.
no code implementations • 15 Dec 2017 • Tobias G. Tiecke, Xian-Ming Liu, Amy Zhang, Andreas Gros, Nan Li, Gregory Yetman, Talip Kilic, Siobhan Murray, Brian Blankespoor, Espen B. Prydz, Hai-Anh H. Dang
Obtaining high accuracy in estimating population distribution in rural areas remains very challenging: it simultaneously requires sufficient sensitivity and resolution to detect very sparse populations through remote sensing, and reliable performance at a global scale.
no code implementations • 27 Jul 2017 • Amy Zhang, Xian-Ming Liu, Andreas Gros, Tobias Tiecke
Our work is among the first to create population density maps from building detection on a large scale.
no code implementations • 8 Dec 2016 • Xianming Liu, Amy Zhang, Tobias Tiecke, Andreas Gros, Thomas S. Huang
Learning from weakly-supervised data is one of the main challenges in machine learning and computer vision, especially for tasks such as image semantic segmentation where labeling is extremely expensive and subjective.
no code implementations • 9 Aug 2014 • Amy Zhang, Nadia Fawaz, Stratis Ioannidis, Andrea Montanari
It is often the case that, within an online recommender system, multiple users share a common account.