Policy regularization methods such as maximum entropy regularization are widely used in reinforcement learning to improve the robustness of a learned policy.
no code implementations • 19 Feb 2020 • Julien Perolat, Remi Munos, Jean-Baptiste Lespiau, Shayegan Omidshafiei, Mark Rowland, Pedro Ortega, Neil Burch, Thomas Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls
In this paper we investigate the Follow the Regularized Leader dynamics in sequential imperfect information games (IIG).
Discovering and exploiting the causal structure in the environment is a crucial challenge for intelligent agents.
We propose a novel Bayesian approach to solve stochastic optimization problems that involve ﬁnding extrema of noisy, nonlinear functions.