Model-free reinforcement learning (RL) can be used to learn effective policies for complex tasks, such as Atari games, even from image observations. However, this typically requires very large amounts of interaction -- substantially more, in fact, than a human would need to learn the same games. How can people learn so quickly? Part of the answer may be that people can learn how the game works and predict which actions will lead to desirable outcomes. In this paper, we explore how video prediction models can similarly enable agents to solve Atari games with fewer interactions than model-free methods. We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models, and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting. Our experiments evaluate SimPLe on a range of Atari games in the low-data regime of 100k interactions between the agent and the environment, which corresponds to two hours of real-time play. In most games SimPLe outperforms state-of-the-art model-free algorithms, in some games by over an order of magnitude.
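
At a high level, the abstract describes an alternating loop: collect a small amount of real experience, fit a video prediction model to it, then train the policy entirely inside the learned model. The sketch below illustrates that loop only; the class names, environment interface, and iteration counts are illustrative assumptions, not the paper's implementation (which trains a PPO policy on short model rollouts branched from real states).

```python
# Hedged sketch of the SimPLe-style loop described in the abstract.
# VideoPredictionModel and Policy are hypothetical stand-ins; `env` is
# assumed to expose reset() -> obs and step(a) -> (obs, reward, done).
import random

class VideoPredictionModel:
    """Placeholder for the learned world model (next frame + reward)."""
    def fit(self, trajectories):
        pass  # supervised training on real (obs, action, next_obs, reward)

    def step(self, obs, action):
        # Predicted (next_obs, reward, done); stubbed here for illustration.
        return obs, random.random(), random.random() < 0.02

class Policy:
    """Placeholder for the policy trained purely in simulation."""
    def act(self, obs):
        return random.randrange(4)  # sample from a small Atari action set

    def update(self, rollout):
        pass  # one policy-gradient step on simulated experience

def collect_real_experience(env, policy, n_steps):
    """Gather (obs, action, next_obs, reward) tuples from the real game."""
    trajectories, obs = [], env.reset()
    for _ in range(n_steps):
        action = policy.act(obs)
        next_obs, reward, done = env.step(action)
        trajectories.append((obs, action, next_obs, reward))
        obs = env.reset() if done else next_obs
    return trajectories

def simple_loop(env, n_iterations=15, real_steps_per_iter=6400):
    model, policy, data = VideoPredictionModel(), Policy(), []
    for _ in range(n_iterations):
        data += collect_real_experience(env, policy, real_steps_per_iter)
        model.fit(data)                   # learn the game dynamics
        for _ in range(1000):             # train the policy in simulation only
            obs = random.choice(data)[0]  # branch a rollout from a real state
            rollout = []
            for _ in range(50):           # short rollouts limit model drift
                action = policy.act(obs)
                obs, reward, done = model.step(obs, action)
                rollout.append((obs, action, reward))
                if done:
                    break
            policy.update(rollout)
    return policy
```

Keeping model rollouts short is the key design choice this sketch tries to convey: prediction errors compound over time, so the policy is trained on many short simulated trajectories rather than long ones.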

Datasets

Introduced in the Paper: Atari 100k
Used in the Paper: Arcade Learning Environment

Results from the Paper

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
| --- | --- | --- | --- | --- | --- |
| Atari Games 100k | Atari 100k | SimPLe | Mean Human-Normalized Score | 0.443 | #12 |
| Atari Games 100k | Atari 100k | SimPLe | Median Human-Normalized Score | 0.144 | #16 |
| Atari Games | Atari games | SimPLe | Mean Human-Normalized Score | 25.3% | #12 |
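
As a reading aid for the table above: a human-normalized score is conventionally computed as (agent − random) / (human − random), so a mean score of 0.443 means SimPLe closes about 44% of the gap between a random policy and a human player, averaged across games. A minimal helper, with illustrative numbers that are not taken from the paper:

```python
def human_normalized_score(agent, random_baseline, human):
    """Conventional human-normalized score: 1.0 means human-level play."""
    return (agent - random_baseline) / (human - random_baseline)

# Illustrative per-game numbers only (not the paper's results):
print(human_normalized_score(agent=300.0, random_baseline=100.0, human=1000.0))
# -> 0.2222..., i.e. about 22% of the random-to-human gap closed
```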

Methods

No methods listed for this paper.