Video Summarization

68 papers with code • 5 benchmarks • 13 datasets

Video Summarization aims to generate a short synopsis that summarizes the video content by selecting its most informative and important parts. The produced summary is usually composed of a set of representative video frames (a.k.a. video key-frames), or video fragments (a.k.a. video key-fragments) that have been stitched in chronological order to form a shorter video. The former type of summary is known as a video storyboard, and the latter as a video skim.

Source: Video Summarization Using Deep Neural Networks: A Survey
Image credit: iJRASET
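
The distinction between a storyboard and a skim can be illustrated with a minimal sketch. It assumes that per-frame importance scores and fragment boundaries are already available (in practice these would come from a trained model and a shot or segment detector). The function names `storyboard` and `skim`, the sample scores, and the greedy selection rule are illustrative assumptions, not taken from the survey; a knapsack formulation is a common alternative to the greedy budget fill shown here.

```python
from typing import List, Tuple


def storyboard(scores: List[float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring frames in chronological order."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


def skim(scores: List[float],
         fragments: List[Tuple[int, int]],
         budget: int) -> List[Tuple[int, int]]:
    """Greedily pick (start, end) fragments, end exclusive, by mean frame score
    until the frame budget is used, then return them in chronological order."""
    def mean_score(frag: Tuple[int, int]) -> float:
        start, end = frag
        return sum(scores[start:end]) / (end - start)

    chosen: List[Tuple[int, int]] = []
    used = 0
    for frag in sorted(fragments, key=mean_score, reverse=True):
        length = frag[1] - frag[0]
        if used + length <= budget:  # skip fragments that would exceed the budget
            chosen.append(frag)
            used += length
    return sorted(chosen)


# Hypothetical importance scores for an 8-frame video split into 3 fragments.
scores = [0.1, 0.9, 0.2, 0.8, 0.7, 0.1, 0.3, 0.95]
fragments = [(0, 2), (2, 5), (5, 8)]

print(storyboard(scores, 3))        # key-frame indices
print(skim(scores, fragments, 5))   # key-fragments within a 5-frame budget
```

In both cases the selected items are re-sorted by position before being returned, which mirrors the chronological stitching described above.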

Latest papers with no code

V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning

no code yet • 18 Apr 2024

Recent efforts have been made to expand from unimodal to multimodal video summarization, categorizing the task into three sub-tasks based on the summary's modality: video-to-video (V2V), video-to-text (V2T), and a combination of video and text summarization (V2VT).

Scaling Up Video Summarization Pretraining with Large Language Models

no code yet • 4 Apr 2024

Long-form video content constitutes a significant portion of internet traffic, making automated video summarization an essential research problem.

FastPerson: Enhancing Video Learning through Effective Video Summarization that Preserves Linguistic and Visual Contexts

no code yet • 26 Mar 2024

Therefore, there is a risk of missing important information when both the teacher's speech and visual information on the blackboard or slides are important, such as in a lecture video.

Large Model based Sequential Keyframe Extraction for Video Summarization

no code yet • 10 Jan 2024

Keyframe extraction aims to sum up a video's semantics with the minimum number of its frames.

Beyond the Frame: Single and multiple video summarization method with user-defined length

no code yet • 23 Dec 2023

One or more videos can be summarized into a relatively short video using a variety of techniques, from multimodal audio-visual methods to natural language processing approaches.

Facilitating the Production of Well-tailored Video Summaries for Sharing on Social Media

no code yet • 5 Dec 2023

This paper presents a web-based tool that facilitates the production of tailored summaries for online sharing on social media.

Video Summarization: Towards Entity-Aware Captions

no code yet • 1 Dec 2023

We also release a large-scale dataset, VIEWS (VIdeo NEWS), to support research on this task.

Scene Summarization: Clustering Scene Videos into Spatially Diverse Frames

no code yet • 28 Nov 2023

It aims to summarize a long video walkthrough of a scene into a small set of frames that are spatially diverse in the scene, which has many important applications, such as in surveillance, real estate, and robotics.

Conditional Modeling Based Automatic Video Summarization

no code yet • 20 Nov 2023

The aim of video summarization is to shorten videos automatically while retaining the key information necessary to convey the overall story.

Unsupervised Video Summarization

no code yet • 7 Nov 2023

This paper introduces a new, unsupervised method for automatic video summarization using ideas from generative adversarial networks but eliminating the discriminator, having a simple loss function, and separating training of different parts of the model.