Search Results for author: Georg Ostrovski

Found 17 papers, 9 papers with code

An Analysis of Quantile Temporal-Difference Learning

no code implementations • 11 Jan 2023 Mark Rowland, Rémi Munos, Mohammad Gheshlaghi Azar, Yunhao Tang, Georg Ostrovski, Anna Harutyunyan, Karl Tuyls, Marc G. Bellemare, Will Dabney

We analyse quantile temporal-difference learning (QTD), a distributional reinforcement learning algorithm that has proven to be a key component in several successful large-scale applications of reinforcement learning.

Distributional Reinforcement Learning · reinforcement-learning +1
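For context, the core QTD update for a single transition can be sketched as follows. This is an illustrative sketch only; the function name, step size, and quantile-midpoint parameterisation are choices made here, not taken from the paper:

```python
import numpy as np

def qtd_update(theta, r, gamma, theta_next, alpha=0.05):
    """One quantile temporal-difference update for a single transition.

    theta:      quantile estimates for the current state, shape (m,)
    theta_next: quantile estimates for the next state, shape (m,)
    Returns the updated quantile estimates.
    """
    m = len(theta)
    taus = (np.arange(m) + 0.5) / m               # quantile midpoints
    targets = r + gamma * np.asarray(theta_next)  # distributional Bellman targets
    theta = np.asarray(theta, dtype=float).copy()
    for i in range(m):
        # quantile-regression gradient averaged over the target samples
        grad = np.mean(taus[i] - (targets < theta[i]))
        theta[i] += alpha * grad
    return theta
```

With a terminal transition (gamma = 0) and a fixed reward, repeated updates drive every quantile estimate toward that reward, which is the behaviour the fixed-point analysis in such work characterises.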

An Empirical Study of Implicit Regularization in Deep Offline RL

no code implementations • 5 Jul 2022 Caglar Gulcehre, Srivatsan Srinivasan, Jakub Sygnowski, Georg Ostrovski, Mehrdad Farajtabar, Matt Hoffman, Razvan Pascanu, Arnaud Doucet

Also, we empirically identify three phases of learning that explain the impact of implicit regularization on the learning dynamics, and find that bootstrapping alone is insufficient to explain the collapse of the effective rank.

Offline RL

The Phenomenon of Policy Churn

no code implementations • 1 Jun 2022 Tom Schaul, André Barreto, John Quan, Georg Ostrovski

We identify and study the phenomenon of policy churn, that is, the rapid change of the greedy policy in value-based reinforcement learning.

reinforcement-learning · Reinforcement Learning (RL)

The Difficulty of Passive Learning in Deep Reinforcement Learning

1 code implementation NeurIPS 2021 Georg Ostrovski, Pablo Samuel Castro, Will Dabney

Learning to act from observational data without active environmental interaction is a well-known challenge in Reinforcement Learning (RL).

reinforcement-learning · Reinforcement Learning (RL)

On The Effect of Auxiliary Tasks on Representation Dynamics

no code implementations • 25 Feb 2021 Clare Lyle, Mark Rowland, Georg Ostrovski, Will Dabney

While auxiliary tasks play a key role in shaping the representations learnt by reinforcement learning agents, much is still unknown about the mechanisms through which this is achieved.

reinforcement-learning · Reinforcement Learning (RL)

Temporally-Extended ε-Greedy Exploration

no code implementations ICLR 2021 Will Dabney, Georg Ostrovski, André Barreto

Recent work on exploration in reinforcement learning (RL) has led to a series of increasingly complex solutions to the problem.

Reinforcement Learning (RL)
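The simple alternative this paper proposes is to extend ε-greedy in time: on an exploratory step, a random action is repeated for a random, heavy-tailed duration rather than a single step. A minimal sketch, in which the class name, the zeta-like duration distribution, its truncation, and all defaults are hypothetical choices:

```python
import random

class EZGreedy:
    """Temporally-extended eps-greedy: when exploring, repeat one random
    action for a random duration instead of taking a single random step."""

    def __init__(self, n_actions, eps=0.1, mu=2.0, max_dur=100, seed=0):
        self.n_actions, self.eps, self.mu, self.max_dur = n_actions, eps, mu, max_dur
        self.rng = random.Random(seed)
        self.dur = 0        # remaining steps of the current exploratory option
        self.action = None

    def act(self, q_values):
        if self.dur > 0:                      # continue the persisted action
            self.dur -= 1
            return self.action
        if self.rng.random() < self.eps:      # start a new exploratory option
            # heavy-tailed duration ~ k^(-mu), truncated at max_dur for safety
            weights = [k ** (-self.mu) for k in range(1, self.max_dur + 1)]
            self.dur = self.rng.choices(range(1, self.max_dur + 1), weights)[0] - 1
            self.action = self.rng.randrange(self.n_actions)
            return self.action
        # otherwise act greedily with respect to the current value estimates
        return max(range(self.n_actions), key=lambda a: q_values[a])
```

With eps = 0 the agent is purely greedy; with eps = 1 every decision point starts a persisted random option.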

Adapting Behaviour for Learning Progress

no code implementations • 14 Dec 2019 Tom Schaul, Diana Borsa, David Ding, David Szepesvari, Georg Ostrovski, Will Dabney, Simon Osindero

Determining what experience to generate to best facilitate learning (i.e. exploration) is one of the distinguishing features and open challenges in reinforcement learning.

Atari Games

Recurrent Experience Replay in Distributed Reinforcement Learning

3 code implementations ICLR 2019 Steven Kapturowski, Georg Ostrovski, Will Dabney, John Quan, Remi Munos

Using a single network architecture and fixed set of hyperparameters, the resulting agent, Recurrent Replay Distributed DQN, quadruples the previous state of the art on Atari-57, and surpasses the state of the art on DMLab-30.

Atari Games · reinforcement-learning +1

Implicit Quantile Networks for Distributional Reinforcement Learning

20 code implementations ICML 2018 Will Dabney, Georg Ostrovski, David Silver, Rémi Munos

In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN.

Atari Games · Distributional Reinforcement Learning +3
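The central idea can be sketched with fixed random matrices standing in for learned parameters: sample quantile fractions τ ~ U(0,1), embed them with cosine features and a ReLU, gate the state embedding elementwise, and read off per-action quantile values. All array sizes, weight names, and the random initialisation below are illustrative, not the paper's architecture verbatim:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_cos, n_actions = 32, 64, 4

# hypothetical fixed weights standing in for learned network parameters
W_phi = rng.normal(size=(n_cos, d)) / np.sqrt(n_cos)
W_out = rng.normal(size=(d, n_actions)) / np.sqrt(d)

def quantile_values(state_emb, taus):
    """Z_tau(x, a) for sampled quantile fractions tau ~ U(0, 1)."""
    i = np.arange(1, n_cos + 1)
    cos_feat = np.cos(np.pi * np.outer(taus, i))  # (n_tau, n_cos) cosine basis
    phi = np.maximum(cos_feat @ W_phi, 0.0)       # ReLU embedding of tau
    gated = phi * state_emb[None, :]              # elementwise gate of psi(x)
    return gated @ W_out                          # (n_tau, n_actions)

state = rng.normal(size=d)
taus = rng.uniform(size=8)
z = quantile_values(state, taus)   # sampled return quantiles per action
q = z.mean(axis=0)                 # Q(x, a) approximated as E_tau[Z_tau(x, a)]
```

Averaging the sampled quantile values recovers an estimate of the usual action-value function, which is what the agent acts greedily with respect to.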

Autoregressive Quantile Networks for Generative Modeling

1 code implementation ICML 2018 Georg Ostrovski, Will Dabney, Rémi Munos

We introduce autoregressive implicit quantile networks (AIQN), an approach to generative modeling fundamentally different from those commonly used, which implicitly captures the distribution using quantile regression.


Count-Based Exploration with Neural Density Models

1 code implementation ICML 2017 Georg Ostrovski, Marc G. Bellemare, Aaron van den Oord, Remi Munos

This pseudo-count was used to generate an exploration bonus for a DQN agent and, combined with a mixed Monte Carlo update, was sufficient to achieve state of the art on the Atari 2600 game Montezuma's Revenge.

Montezuma's Revenge
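The pseudo-count is derived from the density model's probability of a state before (ρ) and after (ρ′) observing it. A sketch of the count and bonus computation; the bonus scale and the small stabilising constant follow the commonly cited formulation, but treat the exact constants as assumptions:

```python
import math

def pseudo_count(rho, rho_prime):
    """Pseudo-count recovered from a density model's probability of a state
    before (rho) and after (rho_prime) observing it; requires rho_prime > rho.
    For an empirical count model this recovers the true visit count."""
    return rho * (1.0 - rho_prime) / (rho_prime - rho)

def exploration_bonus(rho, rho_prime, beta=0.05):
    """Count-based exploration bonus added to the reward, shrinking
    as the (pseudo-)visit count of the state grows."""
    n_hat = max(pseudo_count(rho, rho_prime), 0.0)
    return beta / math.sqrt(n_hat + 0.01)  # small constant avoids division by zero
```

Sanity check: for an empirical distribution over 10 observations where a state was seen twice, ρ = 2/10 and ρ′ = 3/11, and the pseudo-count comes out to exactly 2.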

Increasing the Action Gap: New Operators for Reinforcement Learning

2 code implementations • 15 Dec 2015 Marc G. Bellemare, Georg Ostrovski, Arthur Guez, Philip S. Thomas, Rémi Munos

Extending the idea of a locally consistent operator, we then derive sufficient conditions for an operator to preserve optimality, leading to a family of operators which includes our consistent Bellman operator.

Atari Games · Q-Learning +2
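Advantage learning, one member of this family of gap-increasing operators, subtracts a fraction of the local advantage gap V(x) − Q(x, a) from the standard Bellman target, leaving greedy actions untouched while pushing non-greedy values down. A minimal sketch; the function name and gap coefficient are illustrative:

```python
import numpy as np

def advantage_learning_backup(Q, r, gamma, x, a, x_next, alpha_gap=0.5):
    """One advantage-learning backup for a tabular Q, shape (n_states, n_actions):
    the standard Bellman target minus alpha_gap * (V(x) - Q(x, a)).
    The penalty is zero for the greedy action, so the action gap widens."""
    bellman = r + gamma * np.max(Q[x_next])          # usual Bellman target
    return bellman - alpha_gap * (np.max(Q[x]) - Q[x, a])
```

For example, with Q(x, ·) = [1.0, 0.0] and a zero-reward transition to a zero-valued state, the greedy action's backup is unchanged (0.0) while the non-greedy action's backup drops to −0.5, increasing the gap between them.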

Human-level control through deep reinforcement learning

7 code implementations • 25 Feb 2015 Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg & Demis Hassabis

We demonstrate that the deep Q-network agent, receiving only the pixels and the game score as inputs, was able to surpass the performance of all previous algorithms and achieve a level comparable to that of a professional human games tester across a set of 49 games, using the same algorithm, network architecture and hyperparameters.

Atari Games · reinforcement-learning +1
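Two ingredients credited with stabilising DQN are experience replay and a periodically synced target network. A toy tabular stand-in illustrating both; the real agent learns a convolutional network from pixels, and every name and default below is illustrative:

```python
import random
from collections import deque

import numpy as np

class TinyDQN:
    """Tabular toy showing DQN's two stabilizers: sampling updates from an
    experience replay buffer, and bootstrapping from a target table that is
    only synced to the online table every few steps."""

    def __init__(self, n_states, n_actions, gamma=0.99, lr=0.1,
                 sync_every=100, capacity=10_000, seed=0):
        self.Q = np.zeros((n_states, n_actions))
        self.Q_target = self.Q.copy()           # "target network"
        self.replay = deque(maxlen=capacity)    # experience replay buffer
        self.gamma, self.lr, self.sync_every = gamma, lr, sync_every
        self.steps = 0
        self.rng = random.Random(seed)

    def store(self, s, a, r, s2, done):
        self.replay.append((s, a, r, s2, done))

    def train_step(self, batch_size=32):
        batch = self.rng.choices(self.replay, k=batch_size)
        for s, a, r, s2, done in batch:
            # bootstrap from the frozen target table, not the online one
            target = r if done else r + self.gamma * self.Q_target[s2].max()
            self.Q[s, a] += self.lr * (target - self.Q[s, a])
        self.steps += 1
        if self.steps % self.sync_every == 0:
            self.Q_target = self.Q.copy()       # periodic target sync
```

On a single stored terminal transition with reward 1, repeated training steps drive the corresponding Q-value to 1, the fixed point of the backup.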
