no code implementations • 22 May 2023 • Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, Sergey Levine
Diffusion models are a class of flexible generative models trained with an approximation to the log-likelihood objective.
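As a rough illustration of that training objective, the sketch below shows the standard simplified denoising (noise-prediction) loss commonly used as the surrogate for the diffusion log-likelihood bound; `model`, `alphas_bar`, and the array shapes are hypothetical placeholders, not the paper's released code.

```python
import numpy as np

def denoising_loss(model, x0, alphas_bar, rng):
    """Simplified noise-prediction objective, a common surrogate for the
    diffusion log-likelihood bound. `model(x_t, t)` is assumed to predict
    the noise injected at step t; all names are illustrative."""
    T = len(alphas_bar)
    t = int(rng.integers(0, T))                   # sample a random timestep
    eps = rng.standard_normal(x0.shape)           # Gaussian noise
    # forward (noising) process: x_t = sqrt(abar_t)*x_0 + sqrt(1 - abar_t)*eps
    x_t = np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps
    return float(np.mean((model(x_t, t) - eps) ** 2))   # MSE on the noise
```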
1 code implementation • 20 Apr 2023 • Philippe Hansen-Estruch, Ilya Kostrikov, Michael Janner, Jakub Grudzien Kuba, Sergey Levine
In this paper, we reinterpret IQL as an actor-critic method by generalizing the critic objective and connecting it to a behavior-regularized implicit actor.
no code implementations • 20 Apr 2023 • Qiyang Li, Aviral Kumar, Ilya Kostrikov, Sergey Levine
Deep reinforcement learning algorithms that learn policies by trial-and-error must learn from limited amounts of data collected by actively interacting with the environment.
no code implementations • 19 Apr 2023 • Kyle Stachowicz, Dhruv Shah, Arjun Bhorkar, Ilya Kostrikov, Sergey Levine
We present a system that enables an autonomous small-scale RC car to drive aggressively from visual observations using reinforcement learning (RL).
1 code implementation • 6 Feb 2023 • Philip J. Ball, Laura Smith, Ilya Kostrikov, Sergey Levine
Sample efficiency and exploration remain major challenges in online reinforcement learning (RL).
1 code implementation • 16 Dec 2022 • Dhruv Shah, Arjun Bhorkar, Hrish Leen, Ilya Kostrikov, Nick Rhinehart, Sergey Levine
Reinforcement learning can enable robots to navigate to distant goals while optimizing user-specified reward functions, including preferences for following lanes, staying on paved paths, or avoiding freshly mowed grass.
1 code implementation • 16 Aug 2022 • Laura Smith, Ilya Kostrikov, Sergey Levine
Deep reinforcement learning is a promising approach to learning policies in uncontrolled environments that do not require domain knowledge.
no code implementations • 5 Jun 2022 • Charlie Snell, Ilya Kostrikov, Yi Su, Mengjiao Yang, Sergey Levine
Large language models distill broad knowledge from text corpora.
1 code implementation • 11 Jan 2022 • Vitaly Kurin, Alessandro De Palma, Ilya Kostrikov, Shimon Whiteson, M. Pawan Kumar
We show that unitary scalarization, coupled with standard regularization and stabilization techniques from single-task learning, matches or improves upon the performance of complex multi-task optimizers in popular supervised and reinforcement learning settings.
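For concreteness, here is a minimal sketch of unitary scalarization under the description above: the per-task losses are simply summed, with a standard single-task-style regularizer and no per-task weighting or gradient surgery. The argument names are illustrative.

```python
import numpy as np

def unitary_scalarization(task_losses, params, l2_coef=1e-4):
    """Unitary scalarization: sum the per-task losses as-is and add a
    standard L2 regularizer over the shared parameters. Names and the
    regularization coefficient are illustrative."""
    return float(np.sum(task_losses)) + l2_coef * float(np.sum(np.square(params)))
```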
1 code implementation • 20 Dec 2021 • Scott Emmons, Benjamin Eysenbach, Ilya Kostrikov, Sergey Levine
Recent work has shown that supervised learning alone, without temporal difference (TD) learning, can be remarkably effective for offline RL.
1 code implementation • NeurIPS 2021 • Roberta Raileanu, Maxwell Goldstein, Denis Yarats, Ilya Kostrikov, Rob Fergus
Deep reinforcement learning (RL) agents often fail to generalize beyond their training environments.
no code implementations • 29 Nov 2021 • Bogdan Mazoure, Ilya Kostrikov, Ofir Nachum, Jonathan Tompson
We show that performance of online algorithms for generalization in RL can be hindered in the offline setting due to poor estimation of similarity between observations.
13 code implementations • 12 Oct 2021 • Ilya Kostrikov, Ashvin Nair, Sergey Levine
The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state conditional upper expectile of this random variable to estimate the value of the best actions in that state.
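The upper-expectile idea can be sketched as an asymmetric regression loss for the state value function; the snippet below is a minimal illustration with hypothetical `q_values`/`v_values` arrays, not the released implementation.

```python
import numpy as np

def expectile_loss(q_values, v_values, tau=0.7):
    """Asymmetric (expectile) regression of V(s) toward Q(s, a): errors where
    Q exceeds V are up-weighted by tau, so V approaches an upper expectile of
    Q over the dataset's actions rather than its mean. Names are illustrative."""
    diff = q_values - v_values
    weight = np.where(diff > 0, tau, 1.0 - tau)
    return float(np.mean(weight * diff ** 2))
```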
no code implementations • ICLR 2022 • Scott Emmons, Benjamin Eysenbach, Ilya Kostrikov, Sergey Levine
These methods, which we collectively refer to as reinforcement learning via supervised learning (RvS), involve a number of design decisions, such as policy architectures and how the conditioning variable is constructed.
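One common way such a conditioning variable is constructed is to append a goal or an outcome (e.g., a return) to the state and train the policy with an ordinary supervised loss; the sketch below is a generic illustration with hypothetical names, not a specific design studied in the paper.

```python
import numpy as np

def rvs_bc_loss(policy, states, conditioning, actions):
    """Generic RvS-style objective: condition on an outcome or goal by
    concatenating it to the state, then train with plain supervised
    (behavior cloning) regression. All names are illustrative."""
    inputs = np.concatenate([states, conditioning], axis=-1)
    return float(np.mean((policy(inputs) - actions) ** 2))
```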
2 code implementations • 14 Mar 2021 • Ilya Kostrikov, Jonathan Tompson, Rob Fergus, Ofir Nachum
Many modern approaches to offline Reinforcement Learning (RL) utilize behavior regularization, typically augmenting a model-free actor-critic algorithm with a penalty measuring divergence of the policy from the offline data.
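The generic penalty described here can be sketched as an actor loss that trades off the learned Q-value against a divergence from the behavior policy; the snippet is a schematic KL-style version with hypothetical inputs, not the specific critic-side regularizer proposed in the paper.

```python
import numpy as np

def behavior_regularized_actor_loss(q_values, policy_log_probs,
                                    behavior_log_probs, alpha=1.0):
    """Generic behavior-regularized actor objective: maximize Q while
    penalizing divergence of the policy from the offline (behavior) data,
    here via a simple log-density-ratio (KL-style) penalty. Inputs and the
    coefficient alpha are illustrative."""
    divergence = policy_log_probs - behavior_log_probs   # per-sample KL estimate
    return float(np.mean(alpha * divergence - q_values)) # minimized by the actor
```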
no code implementations • 27 Jul 2020 • Ilya Kostrikov, Ofir Nachum
In reinforcement learning, it is typical to use the empirically observed transitions and rewards to estimate the value of a policy via either model-based or Q-fitting approaches.
1 code implementation • NeurIPS 2021 • Roberta Raileanu, Max Goldstein, Denis Yarats, Ilya Kostrikov, Rob Fergus
Our agent outperforms other baselines specifically designed to improve generalization in RL.
4 code implementations • ICLR 2021 • Ilya Kostrikov, Denis Yarats, Rob Fergus
We propose a simple data augmentation technique that can be applied to standard model-free reinforcement learning algorithms, enabling robust learning directly from pixels without the need for auxiliary losses or pre-training.
Ranked #1 on Continuous Control on DeepMind Walker Walk (Images)
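A typical instance of such an augmentation is a random shift: pad the image observation, then randomly crop it back to its original size before feeding it to the actor and critic. The sketch below assumes an (H, W, C) NumPy array and illustrative parameter values.

```python
import numpy as np

def random_shift(obs, pad=4, rng=None):
    """Random-shift augmentation for image observations: replicate-pad the
    frame, then take a random crop of the original size. Assumes an
    (H, W, C) array; the pad width of 4 is illustrative."""
    if rng is None:
        rng = np.random.default_rng()
    h, w, _ = obs.shape
    padded = np.pad(obs, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    top = int(rng.integers(0, 2 * pad + 1))
    left = int(rng.integers(0, 2 * pad + 1))
    return padded[top:top + h, left:left + w, :]
```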
2 code implementations • ICLR 2020 • Ilya Kostrikov, Ofir Nachum, Jonathan Tompson
In this work, we show how the original distribution ratio estimation objective may be transformed in a principled manner to yield a completely off-policy objective.
no code implementations • 4 Dec 2019 • Ofir Nachum, Bo Dai, Ilya Kostrikov, Yin-Lam Chow, Lihong Li, Dale Schuurmans
In many real-world applications of reinforcement learning (RL), interactions with the environment are limited due to cost or feasibility.
3 code implementations • 2 Oct 2019 • Denis Yarats, Amy Zhang, Ilya Kostrikov, Brandon Amos, Joelle Pineau, Rob Fergus
A promising approach is to learn a latent representation together with the control policy.
3 code implementations • ICLR 2019 • Ilya Kostrikov, Kumar Krishna Agrawal, Debidatta Dwibedi, Sergey Levine, Jonathan Tompson
We identify two issues with the family of algorithms based on the Adversarial Imitation Learning framework.
1 code implementation • CVPR 2018 • Ilya Kostrikov, Zhongshi Jiang, Daniele Panozzo, Denis Zorin, Joan Bruna
We study data-driven representations for three-dimensional triangle meshes, which are one of the prevalent objects used to represent 3D geometry.
3 code implementations • ICLR 2018 • Sainbayar Sukhbaatar, Zeming Lin, Ilya Kostrikov, Gabriel Synnaeve, Arthur Szlam, Rob Fergus
When Bob is deployed on an RL task within the environment, this unsupervised training reduces the number of supervised episodes needed to learn, and in some cases converges to a higher reward.
1 code implementation • 17 Feb 2016 • Tobias Weyand, Ilya Kostrikov, James Philbin
Is it possible to build a system to determine the location where a photo was taken using just its pixels?
Ranked #7 on Photo geolocation estimation on Im2GPS (using extra training data)
no code implementations • British Machine Vision Conference (BMVC) 2014 • Ilya Kostrikov, Juergen Gall
We address the problem of estimating the 3D pose from monocular images.
Ranked #24 on 3D Human Pose Estimation on HumanEva-I
no code implementations • CVPR 2014 • Ilya Kostrikov, Esther Horbert, Bastian Leibe
In this paper, we propose a novel labeling cost for multi-view reconstruction.