Video Summarization

68 papers with code • 5 benchmarks • 13 datasets

Video Summarization aims to generate a short synopsis of a video by selecting its most informative and important parts. The produced summary is usually composed of either a set of representative video frames (a.k.a. video key-frames) or video fragments (a.k.a. video key-fragments) that have been stitched in chronological order to form a shorter video. The former type of summary is known as a video storyboard, and the latter as a video skim.

Source: Video Summarization Using Deep Neural Networks: A Survey
Image credit: iJRASET
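
To make the storyboard/skim distinction concrete, here is a minimal Python sketch. It assumes per-frame importance scores are already available (in practice they would come from a trained model) and that the video has already been split into fragments. The function names `storyboard` and `skim`, the greedy selection rule, and the frame budget are illustrative assumptions, not a method taken from the survey or from any paper listed here.

```python
from typing import List, Tuple


def storyboard(scores: List[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring frames in chronological order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def skim(
    scores: List[float],
    fragments: List[Tuple[int, int]],
    budget: int,
) -> List[Tuple[int, int]]:
    """Greedily pick fragments by mean frame score within a frame budget.

    Each fragment is a (start, end) pair with `end` exclusive. The chosen
    fragments are returned in chronological order, ready to be stitched.
    """
    def mean_score(frag: Tuple[int, int]) -> float:
        start, end = frag
        return sum(scores[start:end]) / (end - start)

    chosen: List[Tuple[int, int]] = []
    used = 0
    for frag in sorted(fragments, key=mean_score, reverse=True):
        length = frag[1] - frag[0]
        if used + length <= budget:  # skip fragments that would exceed the budget
            chosen.append(frag)
            used += length
    return sorted(chosen)


# Hypothetical scores for an 8-frame video split into four 2-frame fragments.
scores = [0.1, 0.2, 0.9, 0.8, 0.3, 0.1, 0.7, 0.6]
fragments = [(0, 2), (2, 4), (4, 6), (6, 8)]

print(storyboard(scores, k=3))              # [2, 3, 6]
print(skim(scores, fragments, budget=4))    # [(2, 4), (6, 8)]
```

The storyboard keeps isolated frames, while the skim keeps contiguous fragments so that the result still plays as a video. Greedy selection is only one simple choice for fitting fragments into a length budget; other selection strategies are possible.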

Latest papers with no code

Unsupervised Video Summarization

no code yet • 7 Nov 2023

This paper introduces a new, unsupervised method for automatic video summarization using ideas from generative adversarial networks but eliminating the discriminator, having a simple loss function, and separating training of different parts of the model.

Dynamic Non-monotone Submodular Maximization

no code yet • NeurIPS 2023

Through this reduction, we obtain the first dynamic algorithms to solve the non-monotone submodular maximization problem under the cardinality constraint $k$.

Video-CSR: Complex Video Digest Creation for Visual-Language Models

no code yet • 8 Oct 2023

We present a novel task and human-annotated dataset for evaluating the ability of visual-language models to generate captions and summaries for real-world video clips, which we call Video-CSR (Captioning, Summarization and Retrieval).

Video-Teller: Enhancing Cross-Modal Generation with Fusion and Decoupling

no code yet • 8 Oct 2023

This paper proposes Video-Teller, a video-language foundation model that leverages multi-modal fusion and fine-grained modality alignment to significantly enhance the video-to-text generation task.

Does Video Summarization Require Videos? Quantifying the Effectiveness of Language in Video Summarization

no code yet • 18 Sep 2023

Video summarization remains a huge challenge in computer vision due to the size of the input videos to be summarized.

Saliency-based Video Summarization for Face Anti-spoofing

no code yet • 23 Aug 2023

Inspired by the visual saliency theory, we present a video summarization method for face anti-spoofing detection that aims to enhance the performance and efficiency of deep learning models by leveraging visual saliency.

Self-Attention Based Generative Adversarial Networks For Unsupervised Video Summarization

no code yet • 16 Jul 2023

Experimental results indicate that using a self-attention mechanism as the frame selection mechanism outperforms the state of the art on SumMe and achieves performance comparable to the state of the art on TVSum and COGNIMUSE.

Causal Video Summarizer for Video Exploration

no code yet • 4 Jul 2023

Multi-modal video summarization has a video input and a text-based query input.

Query-based Video Summarization with Pseudo Label Supervision

no code yet • 4 Jul 2023

Existing datasets for manually labelled query-based video summarization are costly and thus small, limiting the performance of supervised deep video summarization models.

Key Frame Extraction with Attention Based Deep Neural Networks

no code yet • 21 Jun 2023

Automatic keyframe detection from videos is an exercise in selecting the scenes that best summarize the content of long videos.