Spatial Reasoning

118 papers with code • 0 benchmarks • 1 dataset

Most implemented papers

Spatial Memory for Context Reasoning in Object Detection

endernewton/tf-faster-rcnn ICCV 2017

On the other hand, modeling object-object relationships requires spatial reasoning -- not only do we need a memory to store the spatial layout, but also an effective reasoning module to extract spatial patterns.

Long Range Arena: A Benchmark for Efficient Transformers

google-research/long-range-arena 8 Nov 2020

In recent months, a wide spectrum of efficient, fast Transformers has been proposed to tackle this problem, more often than not claiming superior or comparable model quality to vanilla Transformer models.

GuessWhat?! Visual object discovery through multi-modal dialogue

zhanyang-nwpu/rsvg-pytorch CVPR 2017

Our key contribution is the collection of a large-scale dataset consisting of 150K human-played games with a total of 800K visual question-answer pairs on 66K images.

Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments

lil-lab/touchdown CVPR 2019

We study the problem of jointly reasoning about language and vision through a navigation and spatial reasoning task.

No Blind Spots: Full-Surround Multi-Object Tracking for Autonomous Vehicles using Cameras & LiDARs

ken-power/SensorFusionND-3D-Object-Tracking 23 Feb 2018

In this paper, we present a modular framework for tracking multiple objects (vehicles), capable of accepting object proposals from different sensor modalities (vision and range) and a variable number of sensors, to produce continuous object tracks.

Visual Spatial Reasoning

cambridgeltl/visual-spatial-reasoning 30 Apr 2022

Spatial relations are a basic part of human cognition.

MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models

MapEval/MapEval-Visual 31 Dec 2024

To bridge this gap, we introduce MapEval, a benchmark designed to assess diverse and complex map-based user queries with geo-spatial reasoning.

FloorNet: A Unified Framework for Floorplan Reconstruction from 3D Scans

art-programmer/FloorNet ECCV 2018

The ultimate goal of this indoor mapping research is to automatically reconstruct a floorplan simply by walking through a house with a smartphone in a pocket.

VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions

ASMIftekhar/VSGNet CVPR 2020

Comprehensive visual understanding requires detection frameworks that can effectively learn and utilize object interactions while analyzing objects individually.

Learning and Reasoning with the Graph Structure Representation in Robotic Surgery

mobarakol/Surgical_SceneGraph_Generation 7 Jul 2020

Learning to infer graph representations and performing spatial reasoning in a complex surgical environment can play a vital role in surgical scene understanding in robotic surgery.