Story Visualization

20 papers with code • 3 benchmarks • 1 dataset

Story Visualization is the task of generating a coherent sequence of images, each aligned with its caption, given a sequence of textual captions that together describe a story. It mainly consists of two tasks: story generation and story continuation. Story continuation additionally provides ground-truth information in the form of the first frame.
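The difference between the two tasks can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `visualize_story`, `dummy_generator`, and the string-based `Frame` type are hypothetical names, not taken from any of the papers listed here, and the autoregressive loop (each frame conditioned on earlier frames) is only one possible design.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical stand-in for an image; a real system would use a tensor or array.
Frame = str


def visualize_story(
    captions: Sequence[str],
    generate_frame: Callable[[str, List[Frame]], Frame],
    first_frame: Optional[Frame] = None,
) -> List[Frame]:
    """Return one generated frame per caption.

    first_frame is None  -> story generation (captions only).
    first_frame is given -> story continuation (ground-truth first frame
                            is supplied as extra conditioning).
    """
    # Assumption: the model conditions on all frames seen so far.
    context: List[Frame] = [] if first_frame is None else [first_frame]
    frames: List[Frame] = []
    for caption in captions:
        frames.append(generate_frame(caption, context + frames))
    return frames


def dummy_generator(caption: str, context: List[Frame]) -> Frame:
    # Placeholder "model": records the caption and how much context it saw.
    return f"<frame for '{caption}' given {len(context)} prior frame(s)>"


if __name__ == "__main__":
    story = ["Pororo wakes up.", "Pororo goes outside."]
    print(visualize_story(story, dummy_generator))                   # generation
    print(visualize_story(story, dummy_generator, "gt_frame.png"))  # continuation
```

In this sketch the only difference between the two modes is whether the context starts empty or with the supplied first frame; the generated sequence always has one frame per caption.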


Most implemented papers

Character-Centric Story Visualization via Visual Planning and Token Alignment

sairin1202/vp-csv 16 Oct 2022

This task requires machines to 1) understand long text inputs and 2) produce a globally consistent image sequence that illustrates the contents of the story.

Show Me a Story: Towards Coherent Neural Story Illustration

Hareesh-Ravi/Show-Me-A-Story CVPR 2018

We propose an end-to-end network for the visual illustration of a sequence of sentences forming a story.

StoryGAN: A Sequential Conditional GAN for Story Visualization

yitong91/StoryGAN CVPR 2019

We therefore propose a new story-to-image-sequence generation model, StoryGAN, based on the sequential conditional GAN framework.

Improving Generation and Evaluation of Visual Stories via Semantic Consistency

adymaharana/StoryViz NAACL 2021

Therefore, we also provide an exploration of evaluation metrics for the model, focused on aspects of the generated frames such as the presence/quality of generated characters, the relevance to captions, and the diversity of the generated images.

Integrating Visuospatial, Linguistic and Commonsense Structure into Story Visualization

adymaharana/vlcstorygan 21 Oct 2021

Prior work in this domain has shown that there is ample room for improvement in the generated image sequence in terms of visual quality, consistency and relevance.

Word-Level Fine-Grained Story Visualization

mrlibw/word-level-story-visualization 3 Aug 2022

Story visualization aims to generate a sequence of images to narrate each sentence in a multi-sentence story with a global consistency across dynamic scenes and characters.

StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation

adymaharana/storydalle 13 Sep 2022

Hence, we first propose the task of story continuation, where the generated visual story is conditioned on a source image, allowing for better generalization to narratives with new characters.

Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models

xichenpan/ARLDM 20 Nov 2022

Conditioned diffusion models have demonstrated state-of-the-art text-to-image synthesis capacity.

Make-A-Story: Visual Memory Conditioned Consistent Story Generation

ubc-vision/make-a-story CVPR 2023

Our experiments for story generation on the MUGEN, PororoSV and FlintstonesSV datasets show that our method not only outperforms the prior state of the art in generating frames with high visual quality that are consistent with the story, but also models appropriate correspondences between the characters and the background.