Search Results for author: Ilya Kostrikov

Found 15 papers, 10 papers with code

Offline Reinforcement Learning with Implicit Q-Learning

2 code implementations • 12 Oct 2021 • Ilya Kostrikov, Ashvin Nair, Sergey Levine

The main insight in our work is that, instead of evaluating unseen actions from the latest policy, we can approximate the policy improvement step implicitly by treating the state value function as a random variable, with randomness determined by the action (while still integrating over the dynamics to avoid excessive optimism), and then taking a state conditional upper expectile of this random variable to estimate the value of the best actions in that state.

Offline RL • Q-Learning
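
As a rough illustration of the expectile idea described above, here is a minimal sketch (my own, in PyTorch; the batch shapes and the tau value are assumptions, not taken from the paper's code) of an asymmetric least-squares loss that pushes a state value estimate toward an upper expectile of the Q-values observed in the dataset:

    import torch

    def expectile_loss(diff: torch.Tensor, tau: float = 0.7) -> torch.Tensor:
        """Asymmetric least-squares (expectile) loss.

        diff = Q(s, a) - V(s); tau > 0.5 weights positive errors more heavily,
        so V(s) is pulled toward an upper expectile of Q over dataset actions.
        """
        weight = torch.where(diff > 0,
                             torch.full_like(diff, tau),
                             torch.full_like(diff, 1.0 - tau))
        return (weight * diff.pow(2)).mean()

    # Hypothetical usage: q_target and v are batches of Q(s, a) and V(s) estimates.
    q_target = torch.randn(256)
    v = torch.randn(256)
    loss = expectile_loss(q_target - v, tau=0.7)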

Offline Reinforcement Learning with Fisher Divergence Critic Regularization

1 code implementation • 14 Mar 2021 • Ilya Kostrikov, Jonathan Tompson, Rob Fergus, Ofir Nachum

Many modern approaches to offline Reinforcement Learning (RL) utilize behavior regularization, typically augmenting a model-free actor critic algorithm with a penalty measuring divergence of the policy from the offline data.

Offline RL
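
To make the behavior-regularization setup described above concrete, here is a minimal sketch (my own illustration in PyTorch; it uses a generic KL-style penalty estimated from samples, not the Fisher-divergence critic regularizer the paper actually proposes) of an actor objective penalized by divergence from a learned behavior policy:

    import torch

    def behavior_regularized_actor_loss(q_values: torch.Tensor,
                                        log_pi: torch.Tensor,
                                        log_behavior: torch.Tensor,
                                        alpha: float = 1.0) -> torch.Tensor:
        """Generic behavior-regularized actor objective (sketch only).

        q_values:     Q(s, a) for actions a ~ pi(.|s)
        log_pi:       log pi(a|s) for those actions
        log_behavior: log pi_beta(a|s) under a behavior-cloning model of the data
        alpha:        penalty weight
        """
        # Sample-based KL-style penalty: divergence of the policy from the offline data.
        divergence = (log_pi - log_behavior).mean()
        # Maximize Q while keeping the policy close to the behavior distribution.
        return -(q_values.mean() - alpha * divergence)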

Automatic Data Augmentation for Generalization in Reinforcement Learning

no code implementations • 1 Jan 2021 • Roberta Raileanu, Maxwell Goldstein, Denis Yarats, Ilya Kostrikov, Rob Fergus

These are combined with two novel regularization terms for the policy and value function, required to make the use of data augmentation theoretically sound for actor-critic algorithms.

Data Augmentation
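
A hedged sketch of what such policy and value regularization terms might look like for a discrete-action actor-critic (my reading; the `policy` and `value_fn` callables, the KL/MSE pairing, and the use of the un-augmented observation as a fixed target are assumptions, not the paper's code):

    import torch
    import torch.nn.functional as F

    def augmentation_regularizers(policy, value_fn, obs, aug_obs):
        """Regularize the policy and value function to be invariant to augmentation.

        Policy term: keep pi(.|aug(s)) close to pi(.|s), treated as a fixed target.
        Value term:  keep V(aug(s)) close to V(s), treated as a fixed target.
        """
        with torch.no_grad():
            target_probs = policy(obs)       # categorical action probabilities
            target_value = value_fn(obs)
        aug_log_probs = torch.log(policy(aug_obs) + 1e-8)
        g_pi = F.kl_div(aug_log_probs, target_probs, reduction="batchmean")
        g_v = F.mse_loss(value_fn(aug_obs), target_value)
        return g_pi, g_v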

Statistical Bootstrapping for Uncertainty Estimation in Off-Policy Evaluation

no code implementations • 27 Jul 2020 • Ilya Kostrikov, Ofir Nachum

In reinforcement learning, it is typical to use the empirically observed transitions and rewards to estimate the value of a policy via either model-based or Q-fitting approaches.

Continuous Control
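
A minimal NumPy sketch of the bootstrapping idea, assuming one already has per-trajectory value estimates produced by some off-policy estimator (the estimator itself is not shown; the function name, sample count, and confidence level are assumptions for illustration):

    import numpy as np

    def bootstrap_value_estimate(returns, n_boot=1000, seed=0):
        """Bootstrap uncertainty for a policy value estimate.

        `returns` holds per-trajectory value estimates; resampling them with
        replacement yields an empirical distribution over the point estimate.
        """
        rng = np.random.default_rng(seed)
        returns = np.asarray(returns, dtype=np.float64)
        estimates = np.empty(n_boot)
        for i in range(n_boot):
            sample = rng.choice(returns, size=returns.shape[0], replace=True)
            estimates[i] = sample.mean()
        low, high = np.percentile(estimates, [2.5, 97.5])
        return estimates.mean(), (low, high)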

Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels

1 code implementation • ICLR 2021 • Ilya Kostrikov, Denis Yarats, Rob Fergus

We propose a simple data augmentation technique that can be applied to standard model-free reinforcement learning algorithms, enabling robust learning directly from pixels without the need for auxiliary losses or pre-training.

Continuous Control • Contrastive Learning • +1
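
As an illustration of the kind of image augmentation involved, here is a sketch of a random-shift augmentation applied to a batch of pixel observations (PyTorch; the pad size and the per-sample loop are assumptions chosen for clarity, not the paper's implementation):

    import torch
    import torch.nn.functional as F

    def random_shift(images: torch.Tensor, pad: int = 4) -> torch.Tensor:
        """Randomly shift each image by padding with edge pixels and re-cropping.

        images: (B, C, H, W) batch of pixel observations.
        """
        b, c, h, w = images.shape
        padded = F.pad(images, (pad, pad, pad, pad), mode="replicate")
        out = torch.empty_like(images)
        for i in range(b):
            top = torch.randint(0, 2 * pad + 1, (1,)).item()
            left = torch.randint(0, 2 * pad + 1, (1,)).item()
            out[i] = padded[i, :, top:top + h, left:left + w]
        return out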

Imitation Learning via Off-Policy Distribution Matching

1 code implementation • ICLR 2020 • Ilya Kostrikov, Ofir Nachum, Jonathan Tompson

In this work, we show how the original distribution ratio estimation objective may be transformed in a principled manner to yield a completely off-policy objective.

Imitation Learning
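
For context on the "original distribution ratio estimation objective" mentioned above, a common starting point is a classifier between expert and policy state-action pairs whose logit estimates the log density ratio. The sketch below shows that generic estimator (my own illustration in PyTorch), not the transformed, fully off-policy objective derived in the paper:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def ratio_estimation_loss(discriminator: nn.Module,
                              expert_sa: torch.Tensor,
                              policy_sa: torch.Tensor) -> torch.Tensor:
        """Train a classifier whose logit approximates log d_expert(s, a) / d_policy(s, a)."""
        expert_logits = discriminator(expert_sa)
        policy_logits = discriminator(policy_sa)
        # Logistic loss: expert pairs labeled 1, policy pairs labeled 0.
        loss = (F.binary_cross_entropy_with_logits(expert_logits,
                                                   torch.ones_like(expert_logits))
                + F.binary_cross_entropy_with_logits(policy_logits,
                                                     torch.zeros_like(policy_logits)))
        return loss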

AlgaeDICE: Policy Gradient from Arbitrary Experience

no code implementations • 4 Dec 2019 • Ofir Nachum, Bo Dai, Ilya Kostrikov, Yin-Lam Chow, Lihong Li, Dale Schuurmans

In many real-world applications of reinforcement learning (RL), interactions with the environment are limited due to cost or feasibility.

Surface Networks

1 code implementation • CVPR 2018 • Ilya Kostrikov, Zhongshi Jiang, Daniele Panozzo, Denis Zorin, Joan Bruna

We study data-driven representations for three-dimensional triangle meshes, which are one of the prevalent objects used to represent 3D geometry.

Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play

3 code implementations • ICLR 2018 • Sainbayar Sukhbaatar, Zeming Lin, Ilya Kostrikov, Gabriel Synnaeve, Arthur Szlam, Rob Fergus

When Bob is deployed on an RL task within the environment, this unsupervised training reduces the number of supervised episodes needed to learn, and in some cases converges to a higher reward.

PlaNet - Photo Geolocation with Convolutional Neural Networks

1 code implementation • 17 Feb 2016 • Tobias Weyand, Ilya Kostrikov, James Philbin

Is it possible to build a system to determine the location where a photo was taken using just its pixels?

Ranked #7 on Photo geolocation estimation on Im2GPS (using extra training data)

Image Retrieval • Photo geolocation estimation
