22 papers with code • 1 benchmark • 3 datasets
Though impressive results have been achieved in visual captioning, the task of generating abstract stories from photo streams remains largely untapped.
The task of multi-image-cued story generation, exemplified by the Visual Storytelling dataset (VIST) challenge, is to compose multiple coherent sentences from a given sequence of images.
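As a concrete illustration of the task's input/output structure, a VIST-style example pairs an ordered photo stream (five images) with one sentence per image. The sketch below is a hypothetical example; the file names, story text, and dictionary layout are assumptions for illustration, not the dataset's official format.

```python
# Minimal illustration of the VIST task's input/output structure.
# File names and story text are hypothetical, not from the dataset.
story_example = {
    "image_sequence": [            # ordered photo stream (VIST uses 5 images)
        "album_01/img_001.jpg",
        "album_01/img_002.jpg",
        "album_01/img_003.jpg",
        "album_01/img_004.jpg",
        "album_01/img_005.jpg",
    ],
    "story": [                     # one coherent sentence per image
        "We arrived at the lake early in the morning.",
        "The kids could not wait to get in the water.",
        "Dad fired up the grill for lunch.",
        "Everyone gathered to watch the sunset.",
        "It was the perfect end to a great day.",
    ],
}

# A storytelling model maps the image sequence to the sentence sequence:
# f(image_sequence) -> story
```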
We present a neural model for generating short stories from image sequences, which extends the image description model of Vinyals et al. (2015).
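To make the extension concrete, here is a minimal PyTorch sketch of this family of models: a CNN encodes each image, a recurrent layer summarizes the ordered photo stream, and an LSTM decodes the story token by token. The architecture choices and dimensions below are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class StorySeq2Seq(nn.Module):
    """Sketch of an image-sequence-to-story model, in the spirit of
    extending a CNN+RNN captioner (Vinyals et al., 2015) to photo
    streams. Details here are assumptions, not the authors' model."""

    def __init__(self, vocab_size, embed_dim=256, hidden_dim=512):
        super().__init__()
        cnn = resnet18(weights=None)   # swap in pretrained weights in practice
        self.encoder = nn.Sequential(*list(cnn.children())[:-1])  # drop classifier head
        self.img_proj = nn.Linear(512, hidden_dim)
        # Sequence encoder: summarizes the ordered photo stream.
        self.seq_rnn = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        # Decoder: generates the story token by token.
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.decoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, images, captions):
        # images: (B, T, 3, 224, 224); captions: (B, L) token ids
        B, T = images.shape[:2]
        feats = self.encoder(images.flatten(0, 1)).flatten(1)  # (B*T, 512)
        feats = self.img_proj(feats).view(B, T, -1)            # (B, T, H)
        _, h = self.seq_rnn(feats)                             # h: (1, B, H)
        emb = self.embed(captions)                             # (B, L, E)
        dec_out, _ = self.decoder(emb, (h, torch.zeros_like(h)))
        return self.out(dec_out)                               # (B, L, V) logits
```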
Visual storytelling and story comprehension are uniquely human skills that play a central role in how we learn about and experience the world.
The visual storytelling (VST) task aims to generate a reasonable and coherent paragraph-level story from an input image stream.
To solve this problem, we propose a method that mines cross-modal rules to help the model infer informative concepts from a given visual input.
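The paper's exact mining procedure isn't reproduced here, but the general idea can be sketched as association-rule mining between detected visual concepts and story-side concepts. Everything below, including the toy data and the confidence threshold, is a hypothetical illustration.

```python
from collections import Counter, defaultdict

# Toy training pairs: visual concepts detected in an image vs. concepts
# mentioned in its ground-truth story sentence (hypothetical data).
pairs = [
    ({"cake", "candles"}, {"birthday", "party"}),
    ({"cake", "balloons"}, {"birthday", "celebrate"}),
    ({"snow", "skis"}, {"winter", "mountain"}),
]

# Mine rules "visual concept -> story concept" whose conditional
# probability (confidence) exceeds a threshold.
pair_counts, visual_counts = Counter(), Counter()
for visual, textual in pairs:
    for v in visual:
        visual_counts[v] += 1
        for t in textual:
            pair_counts[(v, t)] += 1

rules = defaultdict(list)
MIN_CONFIDENCE = 0.6   # assumed threshold, for illustration only
for (v, t), c in pair_counts.items():
    if c / visual_counts[v] >= MIN_CONFIDENCE:
        rules[v].append(t)

print(rules["cake"])   # ['birthday'] -- concept inferred for "cake"
```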
Previous storytelling approaches mostly focused on optimizing traditional metrics such as BLEU, ROUGE, and CIDEr.
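For reference, BLEU, one of the metrics those approaches optimize, can be computed with NLTK as shown below; ROUGE and CIDEr have analogous reference-based implementations in other packages. The sentences are made up for illustration.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = ["we watched the sunset over the lake".split()]
candidate = "we watched a sunset by the lake".split()

# Smoothing avoids zero scores when a higher-order n-gram has no match.
smooth = SmoothingFunction().method1
score = sentence_bleu(reference, candidate, smoothing_function=smooth)
print(f"BLEU: {score:.3f}")
```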