Search Results for author: Tom Van de Wiele

Found 4 papers, 1 papers with code

Q-Learning in enormous action spaces via amortized approximate maximization

no code implementations22 Jan 2020 Tom Van de Wiele, David Warde-Farley, andriy mnih, Volodymyr Mnih

Applying Q-learning to high-dimensional or continuous action spaces can be difficult due to the required maximization over the set of possible actions.

Continuous Control Q-Learning

Fast Task Inference with Variational Intrinsic Successor Features

no code implementations ICLR 2020 Steven Hansen, Will Dabney, Andre Barreto, Tom Van de Wiele, David Warde-Farley, Volodymyr Mnih

It has been established that diverse behaviors spanning the controllable subspace of an Markov decision process can be trained by rewarding a policy for being distinguishable from other policies \citep{gregor2016variational, eysenbach2018diversity, warde2018unsupervised}.

Unsupervised Control Through Non-Parametric Discriminative Rewards

no code implementations ICLR 2019 David Warde-Farley, Tom Van de Wiele, tejas kulkarni, Catalin Ionescu, Steven Hansen, Volodymyr Mnih

Learning to control an environment without hand-crafted rewards or expert data remains challenging and is at the frontier of reinforcement learning research.

reinforcement-learning

Cannot find the paper you are looking for? You can Submit a new open access paper.