Text Matching

133 papers with code • 0 benchmarks • 7 datasets

Matching a target text to a source text based on their meaning.

Most implemented papers

Visual Semantic Reasoning for Image-Text Matching

KunpengLi1994/VSRN ICCV 2019

It outperforms the current best method by 6.8% relatively for image retrieval and 4.8% relatively for caption retrieval on MS-COCO (Recall@1 using 1K test set).
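For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how Recall@K is commonly computed for retrieval. It is not taken from the VSRN code base. The function name `recall_at_k`, the similarity-matrix layout, and the toy scores are all assumptions made for illustration, and the sketch assumes exactly one correct candidate per query (MS-COCO caption retrieval has several correct captions per image, which this simplification ignores).

```python
def recall_at_k(similarity, ground_truth, k=1):
    """Fraction of queries whose correct candidate appears in the top-k results.

    similarity[i][j]: score between query i and candidate j (assumed layout).
    ground_truth[i]:  index of the single correct candidate for query i.
    """
    hits = 0
    for scores, target in zip(similarity, ground_truth):
        # Rank candidate indices from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return hits / len(ground_truth)


# Toy example: 3 queries, 3 candidates, made-up scores.
sim = [
    [0.9, 0.1, 0.3],  # query 0: correct candidate 0 is ranked first
    [0.2, 0.4, 0.8],  # query 1: correct candidate 1 is ranked second
    [0.5, 0.6, 0.1],  # query 2: correct candidate 2 is ranked last
]
truth = [0, 1, 2]

r1 = recall_at_k(sim, truth, k=1)
r2 = recall_at_k(sim, truth, k=2)
print(f"Recall@1 = {r1:.3f}, Recall@2 = {r2:.3f}")
```

A relative improvement of 6.8% in Recall@1 means the new score is 1.068 times the previous best score, not 6.8 percentage points higher.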

Extractive Summarization as Text Matching

maszhongming/MatchSum ACL 2020

This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems.

Where Are the Facts? Searching for Fact-checked Information to Alleviate the Spread of Fake News

nguyenvo09/EMNLP2020 EMNLP 2020

The search can directly warn fake news posters and online users (e.g., the posters' followers) about misinformation, discourage them from spreading fake news, and scale up verified content on social media.

Identifying Machine-Paraphrased Plagiarism

jpwahle/iconf22-paraphrase 22 Mar 2021

Employing paraphrasing tools to conceal plagiarized text is a severe threat to academic integrity.

ActionCLIP: A New Paradigm for Video Action Recognition

sallymmx/actionclip 17 Sep 2021

Moreover, to handle the deficiency of label texts and make use of tremendous web data, we propose a new paradigm based on this multimodal learning framework for action recognition, which we dub "pre-train, prompt and fine-tune".

ECCV Caption: Correcting False Negatives by Collecting Machine-and-Human-verified Image-Caption Associations for MS-COCO

naver-ai/eccv-caption 7 Apr 2022

Image-Text matching (ITM) is a common task for evaluating the quality of Vision and Language (VL) models.

Dissecting Deep Metric Learning Losses for Image-Text Retrieval

littleredxh/vse-gradient 21 Oct 2022

In the event that the gradients are not integrable to a valid loss function, we implement our proposed objectives such that they would directly operate in the gradient space instead of on the losses in the embedding space.

DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting

vitae-transformer/deepsolo CVPR 2023

In this paper, we present DeepSolo, a simple DETR-like baseline that lets a single Decoder with Explicit Points Solo for text detection and recognition simultaneously.

Self-supervised vision-language pretraining for Medical visual question answering

pengfeiliheu/m2i2 24 Nov 2022

Medical image visual question answering (VQA) is a task to answer clinical questions, given a radiographic image, which is a challenging problem that requires a model to integrate both vision and language information.

Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-modal Structured Representations

zjukg/structure-clip 6 May 2023

In this paper, we present an end-to-end framework Structure-CLIP, which integrates Scene Graph Knowledge (SGK) to enhance multi-modal structured representations.