
Video Retrieval

15 papers with code · Computer Vision
Subtask of Video

Given a text query and a pool of candidate videos, the goal of video retrieval is to select the video that matches the query. Systems typically return the candidates as a ranked list, which is then scored with document retrieval metrics such as Recall@K and median rank.
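As a rough illustration of the task definition, the sketch below ranks candidate videos by cosine similarity to a query and scores the rankings with Recall@K and median rank. It assumes that queries and videos have already been embedded into a shared vector space; the toy vectors and the helper names (`rank_videos`, `recall_at_k`, `median_rank`) are hypothetical and do not come from any of the listed papers.

```python
import math
from statistics import median


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_videos(query_vec, video_vecs):
    """Return candidate indices ordered from most to least similar."""
    scores = [cosine(query_vec, v) for v in video_vecs]
    return sorted(range(len(video_vecs)), key=lambda i: scores[i], reverse=True)


def recall_at_k(rankings, ground_truth, k):
    """Fraction of queries whose correct video appears in the top k."""
    hits = sum(1 for ranked, gt in zip(rankings, ground_truth) if gt in ranked[:k])
    return hits / len(rankings)


def median_rank(rankings, ground_truth):
    """Median 1-based position of the correct video across queries."""
    return median(ranked.index(gt) + 1 for ranked, gt in zip(rankings, ground_truth))


# Toy embeddings (assumed, for illustration only).
videos = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[0.9, 0.1], [0.1, 0.9], [1.0, 0.2]]
ground_truth = [0, 1, 2]  # index of the correct video for each query

rankings = [rank_videos(q, videos) for q in queries]
print(rankings)
print(recall_at_k(rankings, ground_truth, 1))
print(median_rank(rankings, ground_truth))
```

In this toy setup the third query ranks its correct video second, so Recall@1 is 2/3 while Recall@2 is 1.0. Real systems differ mainly in how the embeddings are learned, not in this ranking and scoring step.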


Greatest papers with code

Deep Hashing with Category Mask for Fast Video Retrieval

22 Dec 2017 willard-yuan/hashing-baseline-for-image-retrieval

This paper proposes an end-to-end deep hashing framework with category mask for fast video retrieval.

CODE GENERATION · VIDEO RETRIEVAL

Talking Face Generation by Adversarially Disentangled Audio-Visual Representation

20 Jul 2018 Hangz-nju-cuhk/Talking-Face-Generation-DAVS

Talking face generation aims to synthesize a sequence of face images that correspond to a clip of speech.

TALKING FACE GENERATION · VIDEO RETRIEVAL

ECO: Efficient Convolutional Network for Online Video Understanding

ECCV 2018 mzolfaghari/ECO-efficient-video-understanding

In this paper, we introduce a network architecture that takes long-term content into account and enables fast per-video processing at the same time.

#11 best model for Action Recognition In Videos on Something-Something V1 (using extra training data)

ACTION CLASSIFICATION · ACTION RECOGNITION IN VIDEOS · VIDEO CAPTIONING · VIDEO RETRIEVAL · VIDEO UNDERSTANDING

Dual Encoding for Zero-Example Video Retrieval

CVPR 2019 danieljf24/dual_encoding

This paper attacks the challenging problem of zero-example video retrieval.

VIDEO RETRIEVAL

Learning a Text-Video Embedding from Incomplete and Heterogeneous Data

7 Apr 2018 antoine77340/Mixture-of-Embedding-Experts

We evaluate our method on the task of video retrieval and report results for the MPII Movie Description and MSR-VTT datasets.

#2 best model for Video Retrieval on LSMDC (using extra training data)

VIDEO RETRIEVAL

A Joint Sequence Fusion Model for Video Question Answering and Retrieval

ECCV 2018 antoine77340/howto100m

We present an approach named JSFusion (Joint Sequence Fusion) that can measure semantic similarity between any pairs of multimodal sequence data (e.g., a video clip and a language sentence).

QUESTION ANSWERING · SEMANTIC TEXTUAL SIMILARITY · VIDEO QUESTION ANSWERING · VIDEO RETRIEVAL · VISUAL QUESTION ANSWERING

Use What You Have: Video Retrieval Using Representations From Collaborative Experts

31 Jul 2019 albanie/collaborative-experts

Our goal is to condense the multi-modal, extremely high dimensional information from videos into a single, compact video representation for the task of video retrieval using free-form text queries, where the degree of specificity is open-ended.

VIDEO RETRIEVAL

Central Similarity Hashing via Hadamard matrix

1 Aug 2019 yuanli2333/Hadamard-Matrix-for-hashing

The target of central similarity learning is to encourage hash codes for similar data pairs to be close to a common center and those for dissimilar pairs to converge to different centers in the Hamming space, which substantially improves retrieval accuracy.
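To make the idea of well-separated centers in Hamming space concrete, the sketch below builds a Hadamard matrix with the Sylvester construction, maps its rows to binary codes, and assigns a hash code to its nearest center. This is only an assumed illustration of how Hadamard rows could serve as hash centers, not the linked repository's implementation; the function names are hypothetical, and no learning step is shown.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    H = [[1]]
    while len(H) < n:
        # Block form [[H, H], [H, -H]] doubles the size at each step.
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H


def hash_centers(n_bits, n_classes):
    """Use the first n_classes Hadamard rows as binary centers (+1 -> 1, -1 -> 0)."""
    if n_classes > n_bits:
        raise ValueError("this sketch supports at most n_bits centers")
    return [[1 if x > 0 else 0 for x in row] for row in hadamard(n_bits)[:n_classes]]


def hamming(a, b):
    """Number of positions at which two binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_center(code, centers):
    """Index of the center closest to `code` in Hamming distance."""
    return min(range(len(centers)), key=lambda i: hamming(code, centers[i]))


centers = hash_centers(n_bits=8, n_classes=4)

# Distinct Hadamard rows are orthogonal, so any two centers differ in n_bits / 2 bits.
pairwise = [hamming(centers[i], centers[j]) for i in range(4) for j in range(i + 1, 4)]
print(pairwise)

# A code one bit away from center 2 is still assigned to center 2.
noisy = centers[2][:]
noisy[0] ^= 1
print(nearest_center(noisy, centers))
```

Because every pair of centers sits exactly half the code length apart, codes pulled toward different centers stay mutually distant, which is the separation property the abstract describes.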

VIDEO RETRIEVAL

Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval

ICMR 2018 niluthpol/multimodal_vtt

Constructing a joint representation invariant across different modalities (e.g., video, language) is of significant importance in many multimedia applications.

VIDEO RETRIEVAL