Search Results for author: Ilya Kostrikov

Found 28 papers, 20 papers with code

Training Diffusion Models with Reinforcement Learning

2 code implementations • 22 May 2023 • Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, Sergey Levine

However, most use cases of diffusion models are not concerned with likelihoods, but instead with downstream objectives such as human-perceived image quality or drug effectiveness.

Decision Making • Denoising • +2

319

IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies

1 code implementation • 20 Apr 2023 • Philippe Hansen-Estruch, Ilya Kostrikov, Michael Janner, Jakub Grudzien Kuba, Sergey Levine

In this paper, we reinterpret IQL as an actor-critic method by generalizing the critic objective and connecting it to a behavior-regularized implicit actor.

Offline RL • Q-Learning

Efficient Deep Reinforcement Learning Requires Regulating Overfitting

no code implementations • 20 Apr 2023 • Qiyang Li, Aviral Kumar, Ilya Kostrikov, Sergey Levine

Deep reinforcement learning algorithms that learn policies by trial-and-error must learn from limited amounts of data collected by actively interacting with the environment.

Model Selection • reinforcement-learning

FastRLAP: A System for Learning High-Speed Driving via Deep RL and Autonomous Practicing

no code implementations • 19 Apr 2023 • Kyle Stachowicz, Dhruv Shah, Arjun Bhorkar, Ilya Kostrikov, Sergey Levine

We present a system that enables an autonomous small-scale RC car to drive aggressively from visual observations using reinforcement learning (RL).

Reinforcement Learning (RL)

Efficient Online Reinforcement Learning with Offline Data

1 code implementation • 6 Feb 2023 • Philip J. Ball, Laura Smith, Ilya Kostrikov, Sergey Levine

Sample efficiency and exploration remain major challenges in online reinforcement learning (RL).

reinforcement-learning • Reinforcement Learning (RL)

176

Offline Reinforcement Learning for Visual Navigation

1 code implementation • 16 Dec 2022 • Dhruv Shah, Arjun Bhorkar, Hrish Leen, Ilya Kostrikov, Nick Rhinehart, Sergey Levine

Reinforcement learning can enable robots to navigate to distant goals while optimizing user-specified reward functions, including preferences for following lanes, staying on paved paths, or avoiding freshly mowed grass.

Navigate • Offline RL • +3

A Walk in the Park: Learning to Walk in 20 Minutes With Model-Free Reinforcement Learning

1 code implementation • 16 Aug 2022 • Laura Smith, Ilya Kostrikov, Sergey Levine

Deep reinforcement learning is a promising approach to learning policies in uncontrolled environments that do not require domain knowledge.

reinforcement-learning • Reinforcement Learning (RL)

233

Offline RL for Natural Language Generation with Implicit Language Q Learning

1 code implementation • 5 Jun 2022 • Charlie Snell, Ilya Kostrikov, Yi Su, Mengjiao Yang, Sergey Levine

Large language models distill broad knowledge from text corpora.

Language Modelling • Offline RL • +2

189

In Defense of the Unitary Scalarization for Deep Multi-Task Learning

1 code implementation • 11 Jan 2022 • Vitaly Kurin, Alessandro De Palma, Ilya Kostrikov, Shimon Whiteson, M. Pawan Kumar

We show that unitary scalarization, coupled with standard regularization and stabilization techniques from single-task learning, matches or improves upon the performance of complex multi-task optimizers in popular supervised and reinforcement learning settings.

Multi-Task Learning • Reinforcement Learning (RL)

RvS: What is Essential for Offline RL via Supervised Learning?

1 code implementation • 20 Dec 2021 • Scott Emmons, Benjamin Eysenbach, Ilya Kostrikov, Sergey Levine

Recent work has shown that supervised learning alone, without temporal difference (TD) learning, can be remarkably effective for offline RL.

Offline RL

Automatic Data Augmentation for Generalization in Reinforcement Learning

1 code implementation • NeurIPS 2021 • Roberta Raileanu, Maxwell Goldstein, Denis Yarats, Ilya Kostrikov, Rob Fergus

Deep reinforcement learning (RL) agents often fail to generalize beyond their training environments.

Data Augmentation • reinforcement-learning • +1

102

Improving Zero-shot Generalization in Offline Reinforcement Learning using Generalized Similarity Functions

no code implementations • 29 Nov 2021 • Bogdan Mazoure, Ilya Kostrikov, Ofir Nachum, Jonathan Tompson

We show that performance of online algorithms for generalization in RL can be hindered in the offline setting due to poor estimation of similarity between observations.

Contrastive Learning • Decision Making • +5

Offline Reinforcement Learning with Implicit Q-Learning

15 code implementations • 12 Oct 2021 • Ilya Kostrikov, Ashvin Nair, Sergey Levine

The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state conditional upper expectile of this random variable to estimate the value of the best actions in that state.

D4RL • Offline RL • +3

2,505

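The "upper expectile" in the Implicit Q-Learning excerpt above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the sample values, and the fixed-point solver are assumptions made for illustration. It only shows how an asymmetric squared loss with weight tau above 0.5 pulls an estimate toward the larger values in a set of samples, which is the role the state value plays relative to the Q-values of dataset actions.

```python
def expectile_loss(diff, tau):
    """Asymmetric squared loss: weight tau for positive diff, 1 - tau otherwise."""
    weight = tau if diff > 0 else 1.0 - tau
    return weight * diff * diff


def expectile(samples, tau, steps=200):
    """Return the tau-expectile of `samples`, i.e. the value m that minimizes
    sum(expectile_loss(x - m, tau) for x in samples).

    Solved by repeatedly re-weighting the samples and taking a weighted mean.
    """
    m = sum(samples) / len(samples)  # tau = 0.5 reduces to the plain mean
    for _ in range(steps):
        weights = [tau if x > m else 1.0 - tau for x in samples]
        m = sum(w * x for w, x in zip(weights, samples)) / sum(weights)
    return m


# Hypothetical Q-values for the actions observed in one state of a dataset.
q_values = [0.0, 1.0, 2.0, 10.0]

mean_value = expectile(q_values, tau=0.5)    # the ordinary mean
upper_value = expectile(q_values, tau=0.9)   # pulled toward the best action
higher_value = expectile(q_values, tau=0.99) # closer still to the maximum
```

Because the estimate is computed only from values of actions that appear in the samples, raising tau approximates a maximum over those actions without evaluating any unseen action.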

Offline Reinforcement Learning with In-sample Q-Learning

1 code implementation • ICLR 2022 • Ilya Kostrikov, Ashvin Nair, Sergey Levine

D4RL • Offline RL • +3

The Essential Elements of Offline RL via Supervised Learning

no code implementations • ICLR 2022 • Scott Emmons, Benjamin Eysenbach, Ilya Kostrikov, Sergey Levine

These methods, which we collectively refer to as reinforcement learning via supervised learning (RvS), involve a number of design decisions, such as policy architectures and how the conditioning variable is constructed.

Offline RL • reinforcement-learning • +1

Offline Reinforcement Learning with Fisher Divergence Critic Regularization

2 code implementations • 14 Mar 2021 • Ilya Kostrikov, Jonathan Tompson, Rob Fergus, Ofir Nachum

Many modern approaches to offline Reinforcement Learning (RL) utilize behavior regularization, typically augmenting a model-free actor critic algorithm with a penalty measuring divergence of the policy from the offline data.

Offline RL • reinforcement-learning • +1

32,745

Statistical Bootstrapping for Uncertainty Estimation in Off-Policy Evaluation

no code implementations • 27 Jul 2020 • Ilya Kostrikov, Ofir Nachum

In reinforcement learning, it is typical to use the empirically observed transitions and rewards to estimate the value of a policy via either model-based or Q-fitting approaches.

Continuous Control • Off-policy evaluation

Automatic Data Augmentation for Generalization in Deep Reinforcement Learning

1 code implementation • NeurIPS 2021 • Roberta Raileanu, Max Goldstein, Denis Yarats, Ilya Kostrikov, Rob Fergus

Our agent outperforms other baselines specifically designed to improve generalization in RL.

Data Augmentation • reinforcement-learning • +1

102

Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels

4 code implementations • ICLR 2021 • Ilya Kostrikov, Denis Yarats, Rob Fergus

We propose a simple data augmentation technique that can be applied to standard model-free reinforcement learning algorithms, enabling robust learning directly from pixels without the need for auxiliary losses or pre-training.

Ranked #1 on Continuous Control on DeepMind Walker Walk (Images)

Atari Games 100k • Continuous Control • +4

398

Imitation Learning via Off-Policy Distribution Matching

3 code implementations • ICLR 2020 • Ilya Kostrikov, Ofir Nachum, Jonathan Tompson

In this work, we show how the original distribution ratio estimation objective may be transformed in a principled manner to yield a completely off-policy objective.

Imitation Learning • Reinforcement Learning (RL)

32,745

AlgaeDICE: Policy Gradient from Arbitrary Experience

no code implementations • 4 Dec 2019 • Ofir Nachum, Bo Dai, Ilya Kostrikov, Yin-Lam Chow, Lihong Li, Dale Schuurmans

In many real-world applications of reinforcement learning (RL), interactions with the environment are limited due to cost or feasibility.

Reinforcement Learning (RL)

Improving Sample Efficiency in Model-Free Reinforcement Learning from Images

3 code implementations • 2 Oct 2019 • Denis Yarats, Amy Zhang, Ilya Kostrikov, Brandon Amos, Joelle Pineau, Rob Fergus

A promising approach is to learn a latent representation together with the control policy.

Image Reconstruction • reinforcement-learning • +2

207

Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward Bias in Adversarial Imitation Learning

3 code implementations • ICLR 2019 • Ilya Kostrikov, Kumar Krishna Agrawal, Debidatta Dwibedi, Sergey Levine, Jonathan Tompson

We identify two issues with the family of algorithms based on the Adversarial Imitation Learning framework.

Imitation Learning

385

Surface Networks

1 code implementation • CVPR 2018 • Ilya Kostrikov, Zhongshi Jiang, Daniele Panozzo, Denis Zorin, Joan Bruna

We study data-driven representations for three-dimensional triangle meshes, which are one of the prevalent objects used to represent 3D geometry.

Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play

3 code implementations • ICLR 2018 • Sainbayar Sukhbaatar, Zeming Lin, Ilya Kostrikov, Gabriel Synnaeve, Arthur Szlam, Rob Fergus

When Bob is deployed on an RL task within the environment, this unsupervised training reduces the number of supervised episodes needed to learn, and in some cases converges to a higher reward.