The Atari 2600 Games task (and dataset) involves training an agent to achieve high game scores.
( Image credit: Playing Atari with Deep Reinforcement Learning )
|TREND||DATASET||BEST METHOD||PAPER TITLE||PAPER||CODE||COMPARE|
The output of the feedback neural network is converted to a shaping reward that is augmented to the reward provided by the environment.
Sample efficiency has been one of the major challenges for deep reinforcement learning.
From a young age humans learn to use grammatical principles to hierarchically combine words into sentences.
We propose a new perspective on adversarial attacks against deep reinforcement learning agents.
We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting.