Stabilizing Transformers for Reinforcement Learning

ICLR 2020 · Emilio Parisotto, H. Francis Song, Jack W. Rae, Razvan Pascanu, Caglar Gulcehre, Siddhant M. Jayakumar, Max Jaderberg, Raphael Lopez Kaufman, Aidan Clark, Seb Noury, Matthew M. Botvinick, Nicolas Heess, Raia Hadsell

Owing to their ability both to integrate information effectively over long time horizons and to scale to massive amounts of data, self-attention architectures have recently shown breakthrough success in natural language processing (NLP), achieving state-of-the-art results in domains such as language modeling and machine translation. Harnessing the transformer's ability to process long time horizons of information could provide a similar performance boost in partially observable reinforcement learning (RL) domains, but the large-scale transformers used in NLP have yet to be successfully applied to the RL setting.
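To make concrete the mechanism the abstract refers to, here is a minimal NumPy sketch of single-head scaled dot-product self-attention over a sequence of timestep embeddings. The projection matrices, dimensions, and function name are illustrative assumptions, not the paper's architecture (which adds gating and layer-norm reordering on top of a Transformer-XL backbone):

```python
import numpy as np

def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention over a sequence.

    x:          (T, d) array of T timestep embeddings
    wq, wk, wv: (d, d) query/key/value projections (hypothetical parameters)
    Returns a (T, d) array where each timestep is a weighted mix of all others.
    """
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(x.shape[-1])          # (T, T) pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over timesteps
    return weights @ v                               # attend over the full horizon

# Toy usage: 8 timesteps, 4-dimensional embeddings.
rng = np.random.default_rng(0)
T, d = 8, 4
x = rng.standard_normal((T, d))
wq, wk, wv = (rng.standard_normal((d, d)) for _ in range(3))
out = self_attention(x, wq, wk, wv)
print(out.shape)  # (8, 4)
```

Because every output timestep is a direct weighted sum over all inputs, information from the distant past reaches the present in a single step, which is the property that motivates using transformers in partially observable RL.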
