Natural Language Visual Grounding
16 papers with code • 0 benchmarks • 6 datasets
Latest papers
Localizing Moments in Long Video Via Multimodal Guidance
In this paper, we propose a method for improving the performance of natural language grounding in long videos by identifying and pruning out non-describable windows.
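The core idea can be sketched as follows: score each candidate window with a guidance model, prune windows unlikely to contain describable content, and run the expensive grounding model only on the survivors. This is a minimal sketch; `describability_score` and `ground_in_window` are hypothetical stand-ins for learned models, not the paper's components.

```python
# Minimal sketch of guidance-based pruning for long-video moment
# localization. `describability_score` and `ground_in_window` are
# hypothetical stand-ins for learned models (assumptions, not the paper).

def localize(query, windows, describability_score, ground_in_window,
             threshold=0.5):
    """Prune windows whose describability score falls below the threshold,
    then ground the query only in the surviving windows."""
    kept = [w for w in windows if describability_score(w) >= threshold]
    if not kept:
        return None
    # ground_in_window is assumed to return (confidence, (start_s, end_s)).
    return max((ground_in_window(query, w) for w in kept),
               key=lambda result: result[0])

# Toy usage with constant stand-in models:
if __name__ == "__main__":
    windows = [(0, 30), (30, 60), (60, 90)]
    print(localize("person opens the door", windows,
                   describability_score=lambda w: 0.9 if w[0] >= 30 else 0.1,
                   ground_in_window=lambda q, w: (0.8, w)))
```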
Belief Revision based Caption Re-ranker with Visual Semantic Information
In this work, we focus on improving the captions generated by image-caption generation systems.
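Re-ranking of this kind can be illustrated by combining the caption generator's own score with a visual-semantic similarity score from an image-text model. The linear mixture below is an illustrative assumption, not the paper's belief-revision formula, and `visual_similarity` is a hypothetical scoring function.

```python
# Illustrative caption re-ranking: mix the generator's score with a
# visual-semantic similarity term. Both scores are assumed to be
# normalized to [0, 1]. This mixture is an assumption for illustration,
# not the paper's belief-revision rule.

def rerank(candidates, visual_similarity, alpha=0.5):
    """candidates: list of (caption, generator_score) pairs.
    visual_similarity(caption) -> score in [0, 1] from an image-text model."""
    rescored = [
        (caption, alpha * gen_score + (1 - alpha) * visual_similarity(caption))
        for caption, gen_score in candidates
    ]
    # Best caption first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)
```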
TubeDETR: Spatio-Temporal Video Grounding with Transformers
We consider the problem of localizing a spatio-temporal tube in a video corresponding to a given text query.
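A spatio-temporal tube is a temporal span plus one bounding box per frame inside that span; temporal IoU over the spans is a standard primitive when evaluating such predictions. The sketch below is a simplified data structure and metric, not TubeDETR's code.

```python
# A "tube" = a temporal span plus per-frame boxes. Temporal IoU over the
# spans is a standard evaluation primitive (simplified sketch, not
# TubeDETR's implementation).

from dataclasses import dataclass
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

@dataclass
class Tube:
    start: int             # first frame index of the span
    end: int               # last frame index (inclusive)
    boxes: Dict[int, Box]  # frame index -> bounding box

def temporal_iou(a: Tube, b: Tube) -> float:
    """Intersection-over-union of the two temporal spans, in frames."""
    inter = max(0, min(a.end, b.end) - max(a.start, b.start) + 1)
    union = (a.end - a.start + 1) + (b.end - b.start + 1) - inter
    return inter / union if union else 0.0
```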
CALVIN: A Benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
We show that a baseline model based on multi-context imitation learning performs poorly on CALVIN, suggesting that the benchmark leaves significant room for innovative agents that learn to relate human language to their world models.
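In its simplest form, a language-conditioned policy maps the current observation together with an instruction embedding to an action. The schematic interface below is an illustrative assumption throughout, not the CALVIN baseline.

```python
# Schematic language-conditioned policy: the action depends on both the
# observation and an embedding of the instruction. Everything here is an
# illustrative assumption, not the CALVIN baseline model.

import numpy as np

class LanguageConditionedPolicy:
    def __init__(self, obs_dim, text_dim, act_dim, seed=0):
        rng = np.random.default_rng(seed)
        # A single linear layer over [observation; instruction] for brevity.
        self.w = rng.normal(size=(obs_dim + text_dim, act_dim)) * 0.01

    def act(self, obs, instruction_embedding):
        features = np.concatenate([obs, instruction_embedding])
        return np.tanh(features @ self.w)  # bounded continuous action
```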
Panoptic Narrative Grounding
This paper proposes Panoptic Narrative Grounding, a spatially fine and general formulation of the natural language visual grounding problem.
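Since the formulation grounds noun phrases of a narrative to panoptic segments, a natural toy baseline is nearest-neighbor matching between phrase and segment embeddings. The cosine-similarity scheme below is a hypothetical illustration, not the paper's method.

```python
# Hypothetical baseline for panoptic narrative grounding: assign each
# noun phrase to the panoptic segment with the most similar embedding.
# Illustrative matching scheme only, not the paper's model.

import numpy as np

def match_phrases_to_segments(phrase_embs, segment_embs):
    """phrase_embs: (P, D), segment_embs: (S, D), all rows nonzero.
    Returns, for each phrase, the index of the best-matching segment."""
    p = phrase_embs / np.linalg.norm(phrase_embs, axis=1, keepdims=True)
    s = segment_embs / np.linalg.norm(segment_embs, axis=1, keepdims=True)
    return (p @ s.T).argmax(axis=1)  # cosine similarity, argmax per phrase
```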
Composing Pick-and-Place Tasks By Grounding Language
Controlling robots to perform tasks via natural language is one of the most challenging topics in human-robot interaction.
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning
ALFWorld enables the creation of a new BUTLER agent whose abstract knowledge, learned in TextWorld, corresponds directly to concrete, visually grounded actions.
A Linguistic Analysis of Visually Grounded Dialogues Based on Spatial Expressions
Recent models achieve promising results in visually grounded dialogues.
Learning Cross-modal Context Graph for Visual Grounding
To address the limitations of prior methods, this paper proposes a language-guided graph representation that captures the global context of grounding entities and their relations, and develops a cross-modal graph matching strategy for the multiple-phrase visual grounding task.
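The matching step can be illustrated with a toy formulation: score each phrase-to-region assignment by node similarity plus a bonus when related phrases map to related regions. The exhaustive search below is purely illustrative and only feasible for tiny inputs; it is not the paper's model.

```python
# Toy cross-modal graph matching for multi-phrase grounding: node
# similarity plus a pairwise relation-consistency bonus, solved by brute
# force. Purely illustrative; not the paper's matching strategy.

import itertools
import numpy as np

def match_graph(node_sim, phrase_edges, region_edges, lam=0.5):
    """node_sim: (P, R) phrase-region similarity matrix.
    phrase_edges: set of (i, j) related phrase pairs.
    region_edges: set of (a, b) related region pairs.
    Exhaustive search over assignments (only viable for tiny P and R)."""
    P, R = node_sim.shape
    best, best_score = None, -np.inf
    for assign in itertools.product(range(R), repeat=P):
        score = sum(node_sim[i, assign[i]] for i in range(P))
        # Reward assignments that preserve phrase relations between regions.
        score += lam * sum((assign[i], assign[j]) in region_edges
                           for i, j in phrase_edges)
        if score > best_score:
            best, best_score = assign, score
    return best, best_score
```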