Story Visualization

19 papers with code • 3 benchmarks • 1 dataset

Story Visualization is the task of generating a coherent sequence of images aligned with a sequence of textual captions that describe a story. It mainly consists of two tasks: story generation and story continuation, where story continuation additionally receives the ground-truth first frame as input.
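
The difference between the two settings can be sketched as a small data structure. This is only an illustrative sketch, not the interface of any listed paper: `StoryTask`, `visualize`, and `generate_frame` are hypothetical names, images are represented as plain strings, and it assumes that in story continuation the supplied first frame stands in for the first caption's image, so only the remaining captions are rendered.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical stand-in for an image; a real system would use a tensor or image object.
Image = str


@dataclass
class StoryTask:
    captions: List[str]                   # one caption per frame of the story
    first_frame: Optional[Image] = None   # provided only for story continuation

    @property
    def setting(self) -> str:
        """Name the setting based on whether a ground-truth first frame is given."""
        return "continuation" if self.first_frame is not None else "generation"


def visualize(
    task: StoryTask,
    generate_frame: Callable[[str, List[Image]], Image],
) -> List[Image]:
    """Produce one frame per caption.

    `generate_frame(caption, previous_frames)` is a placeholder for any model
    that conditions on the current caption and the frames produced so far.
    """
    frames: List[Image] = []
    start = 0
    if task.first_frame is not None:
        # Assumption: the given frame replaces the image for the first caption.
        frames.append(task.first_frame)
        start = 1
    for caption in task.captions[start:]:
        # Pass a copy so the model cannot mutate the running frame list.
        frames.append(generate_frame(caption, list(frames)))
    return frames
```

Passing the previously produced frames to `generate_frame` is one simple way to expose cross-frame context; whether and how a model uses that context differs between methods.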

Masked Generative Story Transformer with Character Guidance and Caption Augmentation

chrispapa2000/maskgst 13 Mar 2024

Story Visualization (SV) is a challenging generative vision task that requires both visual quality and consistency between different frames in generated image sequences.

1
13 Mar 2024

Training-Free Consistent Text-to-Image Generation

kousw/experimental-consistory 5 Feb 2024

Text-to-image models offer a new level of creative flexibility by allowing users to guide the image generation process through natural language.

22
05 Feb 2024

StoryGPT-V: Large Language Models as Consistent Story Visualizers

xiaoqian-shen/StoryGPT-V 4 Dec 2023

Therefore, we introduce StoryGPT-V, which leverages the merits of the latent diffusion model (LDM) and LLM to produce images with consistent and high-quality characters grounded on given story descriptions.

31
04 Dec 2023

The Chosen One: Consistent Characters in Text-to-Image Diffusion Models

ZichengDuan/TheChosenOne 16 Nov 2023

Our quantitative analysis demonstrates that our method strikes a better balance between prompt alignment and identity consistency compared to the baseline methods, and these findings are reinforced by a user study.

186
16 Nov 2023

Story Visualization by Online Text Augmentation with Context Memory

yonseivnl/cmota ICCV 2023

Story visualization (SV) is a challenging text-to-image generation task for the difficulty of not only rendering visual details from the text descriptions but also encoding a long-term context across multiple sentences.

7
15 Aug 2023

Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models

haoningwu3639/StoryGen 1 Jun 2023

Generative models have recently exhibited exceptional capabilities in text-to-image generation, but still struggle to generate image sequences coherently.

131
01 Jun 2023

TaleCrafter: Interactive Story Visualization with Multiple Characters

videocrafter/talecrafter 29 May 2023

Accurate story visualization requires several necessary elements, such as identity consistency across frames, the alignment between plain text and visual content, and a reasonable layout of objects in images.

231
29 May 2023

Make-A-Story: Visual Memory Conditioned Consistent Story Generation

ubc-vision/make-a-story CVPR 2023

Our experiments for story generation on the MUGEN, PororoSV, and FlintstonesSV datasets show that our method not only outperforms the prior state of the art in generating frames with high visual quality, which are consistent with the story, but also models appropriate correspondences between the characters and the background.

33
23 Nov 2022

Synthesizing Coherent Story with Auto-Regressive Latent Diffusion Models

xichenpan/ARLDM 20 Nov 2022

Conditioned diffusion models have demonstrated state-of-the-art text-to-image synthesis capacity.

171
20 Nov 2022

Character-Centric Story Visualization via Visual Planning and Token Alignment

sairin1202/vp-csv 16 Oct 2022

This task requires machines to 1) understand long text inputs and 2) produce a globally consistent image sequence that illustrates the contents of the story.

5
16 Oct 2022