Phy-Q

Introduced by Xue et al. in "Phy-Q as a measure for physical reasoning intelligence"

Phy-Q is a benchmark that requires an agent to reason about physical scenarios and act accordingly. Inspired by the physical knowledge acquired in infancy and the capabilities robots need to operate in real-world environments, the authors identify 15 essential physical scenarios. For each scenario, they create a wide variety of distinct task templates, and all task templates within the same scenario can be solved using one specific physical rule.

This design allows two distinct levels of generalization to be evaluated: local generalization (generalizing within a given task template) and broad generalization (generalizing across different task templates within a scenario). The benchmark produces a Phy-Q (physical reasoning quotient) score that reflects an agent's physical reasoning ability.
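The scenario/template hierarchy suggests two ways to split tasks for evaluation, sketched below. This is a minimal illustration, not the benchmark's own tooling. It assumes local generalization is tested on unseen variants of templates seen in training, and broad generalization on templates held out entirely within a scenario. The task dictionaries, the field names `scenario`, `template`, and `variant`, and the functions `local_split` and `broad_split` are all hypothetical, and the sketch does not compute the Phy-Q score itself.

```python
import random
from collections import defaultdict


def local_split(tasks, train_fraction=0.8, seed=0):
    """Split within each template: train and test share the same templates."""
    rng = random.Random(seed)
    by_template = defaultdict(list)
    for task in tasks:
        by_template[(task["scenario"], task["template"])].append(task)

    train, test = [], []
    for group in by_template.values():
        shuffled = group[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        train.extend(shuffled[:cut])
        test.extend(shuffled[cut:])
    return train, test


def broad_split(tasks, held_out_templates):
    """Hold out whole templates: test templates never appear in training."""
    train, test = [], []
    for task in tasks:
        key = (task["scenario"], task["template"])
        (test if key in held_out_templates else train).append(task)
    return train, test


def template_keys(tasks):
    """Return the set of (scenario, template) pairs present in a task list."""
    return {(t["scenario"], t["template"]) for t in tasks}


# Hypothetical toy data: 2 scenarios x 3 templates x 5 task variants.
toy_tasks = [
    {"scenario": s, "template": t, "variant": v}
    for s in (1, 2)
    for t in (1, 2, 3)
    for v in range(5)
]

local_train, local_test = local_split(toy_tasks)
broad_train, broad_test = broad_split(toy_tasks, held_out_templates={(1, 3), (2, 3)})
```

In the local split, every template in the test set also appears in the training set. In the broad split, the held-out templates are absent from training, while each scenario (and so its governing physical rule) is still represented there.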

Similar Datasets

TRANCE

CRIPP-VQA

NovelCraft

PHYRE
