Spatial Reasoning
118 papers with code • 0 benchmarks • 1 dataset
Most implemented papers
Spatial Memory for Context Reasoning in Object Detection
On the other hand, modeling object-object relationships requires spatial reasoning -- not only do we need a memory to store the spatial layout, but also an effective reasoning module to extract spatial patterns.
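To make the memory-plus-reasoning idea concrete, here is a minimal sketch, not the paper's actual implementation: object features are written into a coarse 2D memory aligned with the image, and a small convolutional module reads spatial patterns back out. The class name, grid size, and feature dimension are all illustrative assumptions.

```python
# Hypothetical sketch of a spatial memory with a convolutional reasoning module.
# This is NOT the paper's code; shapes and names are assumptions.
import torch
import torch.nn as nn

class SpatialMemory(nn.Module):
    """Keeps a coarse 2D memory aligned with the image and reasons over it with convs."""

    def __init__(self, feat_dim=256, grid=20):
        super().__init__()
        self.grid = grid
        # Memory: one feature vector per spatial cell (feat_dim x grid x grid).
        self.register_buffer("memory", torch.zeros(feat_dim, grid, grid))
        # Small convolutional "reasoning" module that extracts spatial patterns.
        self.reason = nn.Sequential(
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1),
        )

    def write(self, box, feat):
        """Write an object's feature into the cells covered by its box (x1, y1, x2, y2 in [0, 1])."""
        x1, y1, x2, y2 = (int(c * self.grid) for c in box)
        self.memory[:, y1:y2 + 1, x1:x2 + 1] = feat.view(-1, 1, 1)

    def read(self):
        """Return a context map computed from the spatial layout stored so far."""
        return self.reason(self.memory.unsqueeze(0))

# Usage: write two detected objects, then read a (1, 256, 20, 20) context map
# that later detections could be re-scored against.
mem = SpatialMemory()
mem.write((0.1, 0.2, 0.4, 0.6), torch.randn(256))
mem.write((0.5, 0.5, 0.9, 0.8), torch.randn(256))
context = mem.read()
```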
Long Range Arena: A Benchmark for Efficient Transformers
In recent months, a wide spectrum of efficient, fast Transformers has been proposed to tackle this problem, more often than not claiming model quality superior or comparable to that of vanilla Transformer models.
GuessWhat?! Visual object discovery through multi-modal dialogue
Our key contribution is the collection of a large-scale dataset consisting of 150K human-played games with a total of 800K visual question-answer pairs on 66K images.
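The excerpt gives only the dataset's scale; dividing 800K question-answer pairs by 150K games works out to roughly five QA pairs per dialogue. A minimal, hypothetical record layout for one game could look like the sketch below; the field names and the QAPair/Game classes are assumptions for illustration, not the dataset's actual schema.

```python
# Hypothetical layout for a single GuessWhat?! game record; field names are
# illustrative assumptions, not the dataset's actual schema.
from dataclasses import dataclass, field
from typing import List

@dataclass
class QAPair:
    question: str  # a yes/no question about the hidden object
    answer: str    # the recorded human answer

@dataclass
class Game:
    image_id: int                                    # one of the ~66K images
    target_object_id: int                            # the object to be discovered
    qas: List[QAPair] = field(default_factory=list)  # ~5 QA pairs per game on average (800K / 150K)
```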
Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments
We study the problem of jointly reasoning about language and vision through a navigation and spatial reasoning task.
No Blind Spots: Full-Surround Multi-Object Tracking for Autonomous Vehicles using Cameras & LiDARs
In this paper, we present a modular framework for tracking multiple objects (vehicles), capable of accepting object proposals from different sensor modalities (vision and range) and a variable number of sensors, to produce continuous object tracks.
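As a rough illustration of what accepting proposals from different sensor modalities could mean, the sketch below gates nearest-neighbour association in a common vehicle frame: any sensor that can hand over a position estimate extends an existing track or starts a new one. The class name, the 2D state, and the gating threshold are simplifying assumptions, not the paper's actual tracker.

```python
# Illustrative sketch of modality-agnostic track association, not the paper's pipeline.
import numpy as np

class MultiSensorTracker:
    """Maintains object tracks and accepts position proposals from any sensor."""

    def __init__(self, gate=2.0):
        self.gate = gate   # max association distance (units assumed to be metres)
        self.tracks = {}   # track_id -> last known (x, y) in a common vehicle frame
        self._next_id = 0

    def update(self, proposals):
        """proposals: list of (x, y) detections already transformed into the common
        frame, regardless of whether they came from a camera or a LiDAR."""
        for x, y in proposals:
            p = np.array([x, y], dtype=float)
            if self.tracks:
                tid, prev = min(self.tracks.items(),
                                key=lambda kv: np.linalg.norm(kv[1] - p))
                if np.linalg.norm(prev - p) < self.gate:
                    self.tracks[tid] = p       # extend the closest existing track
                    continue
            self.tracks[self._next_id] = p     # otherwise start a new track
            self._next_id += 1
        return self.tracks
```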
Visual Spatial Reasoning
Spatial relations are a basic part of human cognition.
MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models
To bridge this gap, we introduce MapEval, a benchmark designed to assess diverse and complex map-based user queries with geo-spatial reasoning.
FloorNet: A Unified Framework for Floorplan Reconstruction from 3D Scans
The ultimate goal of this indoor mapping research is to automatically reconstruct a floorplan simply by walking through a house with a smartphone in one's pocket.
VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions
Comprehensive visual understanding requires detection frameworks that can effectively learn and utilize object interactions while analyzing objects individually.
Learning and Reasoning with the Graph Structure Representation in Robotic Surgery
Learning to infer graph representations and performing spatial reasoning in a complex surgical environment can play a vital role in surgical scene understanding in robotic surgery.