Referring Expression Comprehension

67 papers with code • 8 benchmarks • 8 datasets

Latest papers with no code

CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding

no code yet • 6 Nov 2023

A communication token is generated by the LLM following a visual entity or a relation, to inform the detection network to propose regions that are relevant to the sentence generated so far.

Video Referring Expression Comprehension via Transformer with Content-conditioned Query

no code yet • 25 Oct 2023

Video Referring Expression Comprehension (REC) aims to localize a target object in videos based on the queried natural language.

Switching Head-Tail Funnel UNITER for Dual Referring Expression Comprehension with Fetch-and-Carry Tasks

no code yet • 14 Jul 2023

The results show that our method outperforms the baseline method in terms of language comprehension accuracy.

Switch-BERT: Learning to Model Multimodal Interactions by Switching Attention and Input

no code yet • 25 Jun 2023

They can achieve exceptional performance on specific tasks, but face a particularly challenging problem of modality mismatch because of the diversity of input modalities and their fixed structures.

Language-Guided 3D Object Detection in Point Cloud for Autonomous Driving

no code yet • 25 May 2023

In this work, we propose a new multi-modal visual grounding task, termed LiDAR Grounding.

NS3D: Neuro-Symbolic Grounding of 3D Objects and Relations

no code yet • CVPR 2023

Different functional modules in the programs are implemented as neural networks.

Dynamic Inference With Grounding Based Vision and Language Models

no code yet • CVPR 2023

For example, recent image and language models with more than 200M parameters have been proposed to learn visual grounding in the pre-training step and show impressive results on downstream vision and language tasks.

RefCLIP: A Universal Teacher for Weakly Supervised Referring Expression Comprehension

no code yet • CVPR 2023

Based on RefCLIP, we further propose the first model-agnostic weakly supervised training scheme for existing REC models, where RefCLIP acts as a mature teacher to generate pseudo-labels for teaching common REC models.

RefTeacher: A Strong Baseline for Semi-Supervised Referring Expression Comprehension

no code yet • CVPR 2023

In this paper, we present the first attempt at semi-supervised learning for REC and propose a strong baseline method called RefTeacher.

One for All: One-stage Referring Expression Comprehension with Dynamic Reasoning

no code yet • 31 Jul 2022

However, one unsolved issue of these models is that the number of reasoning steps needs to be pre-defined and fixed before inference, ignoring the varying complexity of expressions.