Inspired by progress in large-scale language modeling, we apply a similar approach towards building a single generalist agent beyond the realm of text outputs. The agent, which we refer to as Gato, works as a multi-modal, multi-task, multi-embodiment generalist policy. The same network with the same weights can play Atari, caption images, chat, stack blocks with a real robot arm and much more, deciding based on its context whether to output text, joint torques, button presses, or other tokens. In this report we describe the model and the data, and document the current capabilities of Gato.
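The key mechanism behind this single-network, multi-embodiment design is that every modality is serialized into one shared token space, so the same autoregressive transformer can emit text, button presses, or joint torques depending on context. Below is a minimal sketch of the continuous-value tokenization the paper describes (mu-law companding into [-1, 1], 1024 uniform bins, offset past the 32,000-entry text vocabulary); the constants follow the paper, but the function names and API here are illustrative assumptions, not Gato's actual code.

```python
import numpy as np

# Sketch of Gato-style tokenization of continuous values (e.g. joint
# torques, proprioception). Constants follow the paper's description;
# everything else is an illustrative assumption.
TEXT_VOCAB_SIZE = 32_000   # SentencePiece text tokens occupy [0, 32000)
NUM_BINS = 1024            # continuous-value bins occupy [32000, 33024)
MU = 100.0                 # mu-law companding parameter (paper value)
M = 256.0                  # mu-law scale parameter (paper value)

def mu_law_encode(x: np.ndarray) -> np.ndarray:
    """Compand x into [-1, 1]: sgn(x) * log(|x|*MU + 1) / log(M*MU + 1)."""
    return np.sign(x) * np.log(np.abs(x) * MU + 1.0) / np.log(M * MU + 1.0)

def tokenize_continuous(x: np.ndarray) -> np.ndarray:
    """Map continuous values to integer token ids shared with the text vocab."""
    companded = np.clip(mu_law_encode(x), -1.0, 1.0)
    bins = np.round((companded + 1.0) / 2.0 * (NUM_BINS - 1)).astype(np.int64)
    return bins + TEXT_VOCAB_SIZE

# Example: a robot-arm action vector becomes ordinary tokens in
# [32000, 33024), so the same transformer that predicts text tokens
# can predict joint torques.
torques = np.array([-0.8, 0.05, 1.2, 0.0])
print(tokenize_continuous(torques))
```

Because actions and observations end up as plain integer tokens, training reduces to standard next-token prediction over mixed-modality sequences, which is what lets one set of weights serve every task.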


Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Skill Generalization | RGB-Stacking | Gato | Group 1 | 24.5 | #1 |
| Skill Generalization | RGB-Stacking | Gato | Group 2 | 33 | #2 |
| Skill Generalization | RGB-Stacking | Gato | Group 3 | 50.5 | #1 |
| Skill Generalization | RGB-Stacking | Gato | Group 4 | 76.5 | #2 |
| Skill Generalization | RGB-Stacking | Gato | Group 5 | 66.5 | #1 |
| Skill Generalization | RGB-Stacking | Gato | Average | 50.2 | #1 |
| Skill Mastery | RGB-Stacking | Gato | Group 1 | 58 | #2 |
| Skill Mastery | RGB-Stacking | Gato | Group 2 | 57.6 | #2 |
| Skill Mastery | RGB-Stacking | Gato | Group 3 | 78.5 | #1 |
| Skill Mastery | RGB-Stacking | Gato | Group 4 | 89 | #1 |
| Skill Mastery | RGB-Stacking | Gato | Group 5 | 95.1 | #1 |
| Skill Mastery | RGB-Stacking | Gato | Average | 75.6 | #1 |
