7 dataset results for Systematic Generalization

SCAN (Simplified versions of the CommAI Navigation tasks)

SCAN is a dataset for grounded navigation which consists of a set of simple compositional navigation commands paired with the corresponding action sequences.
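To make the pairing of commands and action sequences concrete, here is a minimal sketch of an interpreter for a small subset of SCAN-style commands. It is not the official grammar or tooling: the function names are hypothetical, the action token names (`I_JUMP`, `I_TURN_LEFT`, and so on) are assumed for illustration, and modifiers such as "opposite" and "around" are deliberately left out.

```python
# Illustrative subset of SCAN-style commands; not the official grammar or tooling.
# Token names below are assumptions made for this sketch.
PRIMITIVES = {"walk": "I_WALK", "look": "I_LOOK", "run": "I_RUN", "jump": "I_JUMP"}
TURNS = {"left": "I_TURN_LEFT", "right": "I_TURN_RIGHT"}
REPEATS = {"twice": 2, "thrice": 3}


def interpret_phrase(words):
    """Translate one phrase, e.g. ['jump', 'left', 'twice'], into a list of actions."""
    words = list(words)
    count = 1
    # A trailing "twice"/"thrice" repeats the whole phrase.
    if words and words[-1] in REPEATS:
        count = REPEATS[words.pop()]
    if not words:
        raise ValueError("empty phrase")

    verb, rest = words[0], words[1:]
    actions = []
    if rest:
        # Only a single direction word is supported in this sketch.
        if len(rest) != 1 or rest[0] not in TURNS:
            raise ValueError(f"unsupported modifier: {' '.join(rest)}")
        actions.append(TURNS[rest[0]])  # turn first, then act

    if verb == "turn":
        if not rest:
            raise ValueError("'turn' needs a direction")
    elif verb in PRIMITIVES:
        actions.append(PRIMITIVES[verb])
    else:
        raise ValueError(f"unknown verb: {verb}")
    return actions * count


def interpret(command):
    """Translate a full command, handling one 'and' or 'after' conjunction."""
    words = command.split()
    for conjunction in ("and", "after"):
        if conjunction in words:
            i = words.index(conjunction)
            first = interpret_phrase(words[:i])
            second = interpret_phrase(words[i + 1:])
            # "x after y" means: do y, then x.
            return first + second if conjunction == "and" else second + first
    return interpret_phrase(words)


if __name__ == "__main__":
    for cmd in ("jump twice", "walk left twice", "run after jump twice"):
        print(cmd, "->", " ".join(interpret(cmd)))
```

The compositional structure is the point: a model that has learned "jump" and "walk twice" separately is expected to handle "jump twice" without having seen it during training.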

136 PAPERS • NO BENCHMARKS YET

GSCAN (Grounded SCAN)

Grounded SCAN poses a simple task, where an agent must execute action sequences based on a synthetic language instruction.

20 PAPERS • NO BENCHMARKS YET

Mathematics Dataset

This dataset's code generates mathematical question-and-answer pairs across a range of question types at roughly school-level difficulty. It is designed to test the mathematical learning and algebraic reasoning skills of learning models.

18 PAPERS • 1 BENCHMARK

ZEST

A new English-language dataset structured for task-oriented evaluation on unseen tasks.

3 PAPERS • NO BENCHMARKS YET

PCFG SET (Probabilistic Context Free Grammar String Edit Task)

The Probabilistic Context Free Grammar String Edit Task (PCFG SET) dataset consists of sequence-to-sequence problems specifically designed to test different aspects of compositional generalisation. In particular, it contains splits that test for systematicity, productivity, substitutivity, localism and overgeneralisation.

2 PAPERS • NO BENCHMARKS YET

Cryptics

Official dataset for the paper "Decrypting Cryptic Crosswords: Semantically Complex Wordplay Puzzles as a Target for NLP".

1 PAPER • NO BENCHMARKS YET

S2B (Symbolic Behaviour Benchmark)

A suite of OpenAI Gym-compatible multi-agent reinforcement learning environments centered on meta-referential games, designed to benchmark behavioural traits pertaining to symbolic behaviours, as described in Santoro et al., 2021, "Symbolic Behaviours in Artificial Intelligence", with a primary focus on the following behavioural traits:

1 PAPER • NO BENCHMARKS YET
