no code implementations • 27 Aug 2020 • Audrūnas Gruslys, Marc Lanctot, Rémi Munos, Finbarr Timbers, Martin Schmid, Julien Perolat, Dustin Morrill, Vinicius Zambaldi, Jean-Baptiste Lespiau, John Schultz, Mohammad Gheshlaghi Azar, Michael Bowling, Karl Tuyls
In this paper, we describe a general model-free RL method for no-regret learning based on repeated reconsideration of past behavior.
2 code implementations • NeurIPS 2016 • Audrūnas Gruslys, Rémi Munos, Ivo Danihelka, Marc Lanctot, Alex Graves
We propose a novel approach to reduce memory consumption of the backpropagation through time (BPTT) algorithm when training recurrent neural networks (RNNs).