Search Results for author: Brendan Maginnis

Found 5 papers, 0 papers with code

Diffusing Policies : Towards Wasserstein Policy Gradient Flows

no code implementations · ICLR 2018 · Pierre H. Richemond, Brendan Maginnis

We derive policy gradients where the change in policy is limited to a small Wasserstein distance (or trust region).
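A worked form of the kind of update the snippet describes, written as an illustrative sketch rather than the paper's own notation (the trust-region radius \epsilon, the advantage A^{\pi_k}, and the choice of the 2-Wasserstein distance W_2 are assumptions here):

$$\pi_{k+1} = \arg\max_{\pi}\; \mathbb{E}_{s \sim \rho^{\pi_k},\, a \sim \pi(\cdot \mid s)}\big[ A^{\pi_k}(s, a) \big] \quad \text{subject to} \quad W_2(\pi_k, \pi) \le \epsilon .$$

In the usual trust-region reading, iterating such constrained steps with a shrinking radius is how a sequence of policy updates is connected to a gradient flow in Wasserstein space.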

Representing Entropy : A short proof of the equivalence between soft Q-learning and policy gradients

no code implementations · ICLR 2018 · Pierre H. Richemond, Brendan Maginnis

Two main families of reinforcement learning algorithms, Q-learning and policy gradients, have recently been proven to be equivalent when using a softmax relaxation on one part, and an entropic regularization on the other.

Q-Learning · reinforcement-learning · +1
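As a quick illustration of the softmax relaxation the snippet above refers to (a standard formulation of entropy-regularised RL, not quoted from the paper): with temperature \tau, the optimal regularised policy is the softmax of the soft Q-function,

$$\pi^*(a \mid s) = \frac{\exp\big(Q_{\mathrm{soft}}(s, a)/\tau\big)}{\sum_{a'} \exp\big(Q_{\mathrm{soft}}(s, a')/\tau\big)}, \qquad V_{\mathrm{soft}}(s) = \tau \log \sum_{a} \exp\big(Q_{\mathrm{soft}}(s, a)/\tau\big),$$

so the fixed point of soft Q-learning and the optimum of the entropy-regularised policy-gradient objective describe the same policy.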

A short variational proof of equivalence between policy gradients and soft Q learning

no code implementations · 22 Dec 2017 · Pierre H. Richemond, Brendan Maginnis

Two main families of reinforcement learning algorithms, Q-learning and policy gradients, have recently been proven to be equivalent when using a softmax relaxation on one part, and an entropic regularization on the other.

Q-Learning · reinforcement-learning · +1

On Wasserstein Reinforcement Learning and the Fokker-Planck equation

no code implementations · 19 Dec 2017 · Pierre H. Richemond, Brendan Maginnis

We derive policy gradients where the change in policy is limited to a small Wasserstein distance (or trust region).

reinforcement-learning · Reinforcement Learning (RL)

Efficiently applying attention to sequential data with the Recurrent Discounted Attention unit

no code implementations · ICLR 2018 · Brendan Maginnis, Pierre H. Richemond

On tasks with a single output, the RWA, RDA and GRU units learn much more quickly than the LSTM and with better performance.
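The entry lists no code, so here is a minimal runnable sketch of the general mechanism the title points at: a recurrent unit that maintains an exponentially discounted, attention-weighted running average of per-step features. The gating form and all parameter names below are illustrative assumptions, not the paper's exact RDA equations.

# Minimal sketch (not the paper's exact equations) of a recurrent unit that keeps a
# running, discounted, attention-weighted average of per-step features.
# All parameter names and the discount gate are illustrative assumptions.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def discounted_attention_rnn(xs, d_in, d_hid, rng=np.random.default_rng(0)):
    """xs: array of shape (T, d_in). Returns hidden states of shape (T, d_hid)."""
    # Randomly initialised parameters, purely for illustration.
    W_z = rng.normal(scale=0.1, size=(d_in + d_hid, d_hid))   # per-step features
    W_a = rng.normal(scale=0.1, size=(d_in + d_hid, d_hid))   # attention scores
    W_g = rng.normal(scale=0.1, size=(d_in + d_hid, d_hid))   # discount gate
    h = np.zeros(d_hid)
    num = np.zeros(d_hid)   # running weighted sum of features
    den = np.zeros(d_hid)   # running sum of attention weights
    hs = []
    for x in xs:
        inp = np.concatenate([x, h])
        z = np.tanh(inp @ W_z)            # candidate features for this step
        a = np.exp(inp @ W_a)             # unnormalised attention weight
        g = sigmoid(inp @ W_g)            # per-dimension discount in (0, 1)
        num = g * num + a * z             # discount old evidence, add the new step
        den = g * den + a
        h = np.tanh(num / (den + 1e-8))   # attention-weighted average -> hidden state
        hs.append(h)
    return np.stack(hs)

# Example: a length-20 sequence of 8-dimensional inputs, 16 hidden units.
out = discounted_attention_rnn(np.random.randn(20, 8), d_in=8, d_hid=16)
print(out.shape)  # (20, 16)

The design intent sketched here is that the running average needs only O(1) state per step, unlike attention over the full history, while the discount gate lets the unit forget stale evidence.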
