Visual Storytelling

25 papers with code • 1 benchmark • 4 datasets

(Image credit: No Metrics Are Perfect)

Most implemented papers

Knowledge-Enriched Visual Storytelling

zychen423/KE-VIST 3 Dec 2019

This paper introduces KG-Story, a three-stage framework that allows the story generation model to take advantage of external Knowledge Graphs to produce interesting stories.

RoViST: Learning Robust Metrics for Visual Storytelling

usydnlp/rovist 8 May 2022

We measure the reliability of our metric sets by analysing their correlation with human judgement scores on a sample of machine stories obtained from 4 state-of-the-art models trained on the Visual Storytelling Dataset (VIST).

Expressive Scene Graph Generation Using Commonsense Knowledge Infusion for Visual Understanding and Reasoning

jaleedkhan/neusire European Semantic Web Conference (ESWC) 2022

These results demonstrate the effectiveness of commonsense knowledge infusion in improving the performance and expressiveness of scene graph generation for visual understanding and reasoning tasks.

Positional Diffusion: Ordering Unordered Sets with Diffusion Probabilistic Models

IIT-PAVIS/Positional_Diffusion 20 Mar 2023

We present Positional Diffusion, a plug-and-play graph formulation with Diffusion Probabilistic Models to address positional reasoning.

Detecting and Grounding Important Characters in Visual Stories

iz2late/VIST-Character 30 Mar 2023

Characters are essential to the plot of any story.

Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models

haoningwu3639/StoryGen 1 Jun 2023

Generative models have recently exhibited exceptional capabilities in text-to-image generation, but still struggle to generate image sequences coherently.

Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation

videocrafter/animate-a-story 13 Jul 2023

For the first module, we leverage an off-the-shelf video retrieval system and extract video depths as motion structure.

TouchStone: Evaluating Vision-Language Models by Language Models

ofa-sys/touchstone 31 Aug 2023

Large vision-language models (LVLMs) have recently witnessed rapid advancements, exhibiting a remarkable capacity for perceiving, understanding, and processing visual information by connecting a visual receptor with large language models (LLMs).

Envisioning Narrative Intelligence: A Creative Visual Storytelling Anthology

USArmyResearchLab/ARL-Creative-Visual-Storytelling 6 Oct 2023

In this paper, we collect an anthology of 100 visual stories from authors who participated in our systematic creative process of improvised story-building based on image sequences.