The objective of video retrieval is as follows: given a text query and a pool of candidate videos, select the video which corresponds to the text query. Typically, the videos are returned as a ranked list of candidates and scored via document retrieval metrics.
Talking face generation aims to synthesize a sequence of face images that correspond to a clip of speech.
This paper proposes an end-to-end deep hashing framework with category mask for fast video retrieval.
In this paper, we introduce a network architecture that takes long-term content into account and enables fast per-video processing at the same time.
Ranked #28 on
Action Recognition
on Something-Something V1
(using extra training data)
ACTION CLASSIFICATION ACTION CLASSIFICATION ACTION RECOGNITION VIDEO CAPTIONING VIDEO RETRIEVAL VIDEO UNDERSTANDING
The objective of this paper is visual-only self-supervised video representation learning.
ACTION RECOGNITION OPTICAL FLOW ESTIMATION REPRESENTATION LEARNING VIDEO RETRIEVAL
In real-world applications, e. g. law enforcement and video retrieval, one often needs to search a certain person in long videos with just one portrait.
This report summarizes the results of the first edition of the challenge together with the findings of the participants.
The rapid growth of video on the internet has made searching for video content using natural language queries a significant challenge.
Ranked #1 on
Video Retrieval
on DiDeMo
This paper attacks the challenging problem of zero-example video retrieval.
The objective of this paper is self-supervised learning from video, in particular for representations for action recognition.
ACTION CLASSIFICATION ACTION CLASSIFICATION ACTION RECOGNITION OPTICAL FLOW ESTIMATION REPRESENTATION LEARNING SELF-SUPERVISED LEARNING VIDEO RETRIEVAL
In this work, we propose a new \emph{global} similarity metric, termed as \emph{central similarity}, with which the hash codes of similar data pairs are encouraged to approach a common center and those for dissimilar pairs to converge to different centers, to improve hash learning efficiency and retrieval accuracy.