Robust Reinforcement Learning via Adversarial Training with Langevin Dynamics

ICLR 2020 · Parameswaran Kamalaruban, Yu-Ting Huang, Ya-Ping Hsieh, Paul Rolland, Cheng Shi, Volkan Cevher

We introduce a sampling perspective to tackle the challenging task of training robust Reinforcement Learning (RL) agents. Leveraging the powerful Stochastic Gradient Langevin Dynamics, we present a novel, scalable two-player RL algorithm, which is a sampling variant of the two-player policy gradient method...
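To illustrate the core idea, here is a minimal sketch, not the authors' actual RL implementation: two players update a shared objective with Stochastic Gradient Langevin Dynamics, i.e. a gradient step plus Gaussian noise scaled by the step size and an inverse temperature. The toy objective, step size, and temperature below are assumptions chosen purely for illustration.

```python
import numpy as np

# Hypothetical two-player toy game standing in for the RL objective:
# protagonist x minimizes f(x, y) = (x - 1)^2 - (y - 1)^2 + x*y,
# adversary y maximizes it (f is convex in x, concave in y, so a
# saddle point exists at x = 0.4, y = 1.2).

def grad_x(x, y):
    # Partial derivative of f with respect to x.
    return 2.0 * (x - 1.0) + y

def grad_y(x, y):
    # Partial derivative of f with respect to y.
    return -2.0 * (y - 1.0) + x

rng = np.random.default_rng(0)
x, y = 0.0, 0.0
eta, beta = 0.01, 1e4          # step size and inverse temperature (assumed)
noise_scale = np.sqrt(2.0 * eta / beta)  # standard SGLD noise magnitude

for _ in range(5000):
    # Each player takes a gradient step plus injected Gaussian noise,
    # so the pair of iterates samples around the saddle point instead
    # of converging to a single deterministic point.
    x -= eta * grad_x(x, y) + noise_scale * rng.normal()  # descent (protagonist)
    y += eta * grad_y(x, y) + noise_scale * rng.normal()  # ascent (adversary)
```

With this small noise level, the iterates hover near the saddle point (0.4, 1.2); increasing the temperature (lowering `beta`) spreads the samples more widely around it.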



No code implementations yet.

