Search Results for author: Ilya Kostrikov

Found 28 papers, 20 papers with code

Training Diffusion Models with Reinforcement Learning

2 code implementations 22 May 2023 Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, Sergey Levine

However, most use cases of diffusion models are not concerned with likelihoods, but instead with downstream objectives such as human-perceived image quality or drug effectiveness.

Decision Making Denoising +2

IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies

1 code implementation 20 Apr 2023 Philippe Hansen-Estruch, Ilya Kostrikov, Michael Janner, Jakub Grudzien Kuba, Sergey Levine

In this paper, we reinterpret IQL as an actor-critic method by generalizing the critic objective and connecting it to a behavior-regularized implicit actor.

Offline RL Q-Learning

Efficient Deep Reinforcement Learning Requires Regulating Overfitting

no code implementations 20 Apr 2023 Qiyang Li, Aviral Kumar, Ilya Kostrikov, Sergey Levine

Deep reinforcement learning algorithms that learn policies by trial-and-error must learn from limited amounts of data collected by actively interacting with the environment.

Model Selection reinforcement-learning

FastRLAP: A System for Learning High-Speed Driving via Deep RL and Autonomous Practicing

no code implementations 19 Apr 2023 Kyle Stachowicz, Dhruv Shah, Arjun Bhorkar, Ilya Kostrikov, Sergey Levine

We present a system that enables an autonomous small-scale RC car to drive aggressively from visual observations using reinforcement learning (RL).

Reinforcement Learning (RL)

Offline Reinforcement Learning for Visual Navigation

1 code implementation 16 Dec 2022 Dhruv Shah, Arjun Bhorkar, Hrish Leen, Ilya Kostrikov, Nick Rhinehart, Sergey Levine

Reinforcement learning can enable robots to navigate to distant goals while optimizing user-specified reward functions, including preferences for following lanes, staying on paved paths, or avoiding freshly mowed grass.

Navigate Offline RL +3

A Walk in the Park: Learning to Walk in 20 Minutes With Model-Free Reinforcement Learning

1 code implementation 16 Aug 2022 Laura Smith, Ilya Kostrikov, Sergey Levine

Deep reinforcement learning is a promising approach to learning policies in uncontrolled environments that do not require domain knowledge.

reinforcement-learning Reinforcement Learning (RL)

In Defense of the Unitary Scalarization for Deep Multi-Task Learning

1 code implementation 11 Jan 2022 Vitaly Kurin, Alessandro De Palma, Ilya Kostrikov, Shimon Whiteson, M. Pawan Kumar

We show that unitary scalarization, coupled with standard regularization and stabilization techniques from single-task learning, matches or improves upon the performance of complex multi-task optimizers in popular supervised and reinforcement learning settings.

Multi-Task Learning Reinforcement Learning (RL)

RvS: What is Essential for Offline RL via Supervised Learning?

1 code implementation 20 Dec 2021 Scott Emmons, Benjamin Eysenbach, Ilya Kostrikov, Sergey Levine

Recent work has shown that supervised learning alone, without temporal difference (TD) learning, can be remarkably effective for offline RL.

Offline RL

Improving Zero-shot Generalization in Offline Reinforcement Learning using Generalized Similarity Functions

no code implementations 29 Nov 2021 Bogdan Mazoure, Ilya Kostrikov, Ofir Nachum, Jonathan Tompson

We show that performance of online algorithms for generalization in RL can be hindered in the offline setting due to poor estimation of similarity between observations.

Contrastive Learning Decision Making +5

Offline Reinforcement Learning with Implicit Q-Learning

15 code implementations 12 Oct 2021 Ilya Kostrikov, Ashvin Nair, Sergey Levine

The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state conditional upper expectile of this random variable to estimate the value of the best actions in that state.

D4RL Offline RL +3
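
The "upper expectile" mentioned in the Implicit Q-Learning abstract above can be illustrated with a small sketch. It assumes the standard asymmetric squared loss |tau - 1(u < 0)| * u^2 applied to u = Q(s, a) - V(s), with a plain list of numbers standing in for Q-values of dataset actions at a single state. The function names, the toy values, and the gradient-descent settings are hypothetical and are not taken from the authors' code.

```python
def expectile_loss(diffs, tau=0.7):
    """Mean asymmetric squared loss over differences u = q - v.

    Positive differences are weighted by tau, negative ones by 1 - tau,
    so tau > 0.5 pushes the fitted value toward the larger samples.
    """
    total = 0.0
    for u in diffs:
        weight = tau if u >= 0 else 1.0 - tau
        total += weight * u * u
    return total / len(diffs)


def fit_expectile(samples, tau=0.7, lr=0.1, steps=2000):
    """Find the scalar v minimising expectile_loss by gradient descent.

    `samples` plays the role of Q(s, a) for actions seen in the data;
    the returned v plays the role of V(s). Illustrative only.
    """
    v = sum(samples) / len(samples)  # start from the mean
    for _ in range(steps):
        grad = 0.0
        for q in samples:
            u = q - v
            weight = tau if u >= 0 else 1.0 - tau
            grad += -2.0 * weight * u  # d/dv of weight * (q - v)^2
        v -= lr * grad / len(samples)
    return v


# Toy Q-values for one state (made-up numbers).
q_values = [0.0, 1.0, 2.0, 3.0, 10.0]
v_mean = fit_expectile(q_values, tau=0.5)   # tau = 0.5 recovers the mean
v_upper = fit_expectile(q_values, tau=0.9)  # larger tau moves toward the max
```

With tau = 0.5 the loss is symmetric and the fitted value is the ordinary mean. As tau approaches 1 the fitted value moves toward the largest sample, which is how an upper expectile can approximate the value of the best in-data action without querying unseen actions.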

Offline Reinforcement Learning with In-sample Q-Learning

1 code implementation ICLR 2022 Ilya Kostrikov, Ashvin Nair, Sergey Levine

The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state conditional upper expectile of this random variable to estimate the value of the best actions in that state.

D4RL Offline RL +3

The Essential Elements of Offline RL via Supervised Learning

no code implementations ICLR 2022 Scott Emmons, Benjamin Eysenbach, Ilya Kostrikov, Sergey Levine

These methods, which we collectively refer to as reinforcement learning via supervised learning (RvS), involve a number of design decisions, such as policy architectures and how the conditioning variable is constructed.

Offline RL reinforcement-learning +1

Offline Reinforcement Learning with Fisher Divergence Critic Regularization

2 code implementations 14 Mar 2021 Ilya Kostrikov, Jonathan Tompson, Rob Fergus, Ofir Nachum

Many modern approaches to offline Reinforcement Learning (RL) utilize behavior regularization, typically augmenting a model-free actor critic algorithm with a penalty measuring divergence of the policy from the offline data.

Offline RL reinforcement-learning +1

Statistical Bootstrapping for Uncertainty Estimation in Off-Policy Evaluation

no code implementations 27 Jul 2020 Ilya Kostrikov, Ofir Nachum

In reinforcement learning, it is typical to use the empirically observed transitions and rewards to estimate the value of a policy via either model-based or Q-fitting approaches.

Continuous Control Off-policy evaluation

Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels

4 code implementations ICLR 2021 Ilya Kostrikov, Denis Yarats, Rob Fergus

We propose a simple data augmentation technique that can be applied to standard model-free reinforcement learning algorithms, enabling robust learning directly from pixels without the need for auxiliary losses or pre-training.

Atari Games 100k Continuous Control +4

Imitation Learning via Off-Policy Distribution Matching

3 code implementations ICLR 2020 Ilya Kostrikov, Ofir Nachum, Jonathan Tompson

In this work, we show how the original distribution ratio estimation objective may be transformed in a principled manner to yield a completely off-policy objective.

Imitation Learning Reinforcement Learning (RL)

AlgaeDICE: Policy Gradient from Arbitrary Experience

no code implementations 4 Dec 2019 Ofir Nachum, Bo Dai, Ilya Kostrikov, Yin-Lam Chow, Lihong Li, Dale Schuurmans

In many real-world applications of reinforcement learning (RL), interactions with the environment are limited due to cost or feasibility.

Reinforcement Learning (RL)

Surface Networks

1 code implementation CVPR 2018 Ilya Kostrikov, Zhongshi Jiang, Daniele Panozzo, Denis Zorin, Joan Bruna

We study data-driven representations for three-dimensional triangle meshes, which are one of the prevalent objects used to represent 3D geometry.

Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play

3 code implementations ICLR 2018 Sainbayar Sukhbaatar, Zeming Lin, Ilya Kostrikov, Gabriel Synnaeve, Arthur Szlam, Rob Fergus

When Bob is deployed on an RL task within the environment, this unsupervised training reduces the number of supervised episodes needed to learn, and in some cases converges to a higher reward.

PlaNet - Photo Geolocation with Convolutional Neural Networks

1 code implementation 17 Feb 2016 Tobias Weyand, Ilya Kostrikov, James Philbin

Is it possible to build a system to determine the location where a photo was taken using just its pixels?

Ranked #1 on Photo geolocation estimation on Im2GPS (Reference images metric)

Image Retrieval Photo geolocation estimation +1
