Image Paragraph Captioning

5 papers with code • 1 benchmark • 1 dataset

Image paragraph captioning involves generating a detailed, multi-sentence description of the content of an image.

Latest papers with no code

Enhancing image captioning with depth information using a Transformer-based framework

no code yet • 24 Jul 2023

As a result, we propose a cleaned version of the NYU-v2 dataset that is more consistent and informative.

Bypass Network for Semantics Driven Image Paragraph Captioning

no code yet • 21 Jun 2022

Most existing methods model the coherence through the topic transition that dynamically infers a topic vector from preceding sentences.

Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning

no code yet • 3 Jun 2022

Thanks to the strong zero-shot capability of foundation models, we start by constructing a rich semantic representation of the image (e.g., image tags, object attributes/locations, captions) as a structured textual prompt, called visual clues, using a vision foundation model.
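As a rough illustration of what such a structured textual prompt might look like, the sketch below assembles image tags, object attributes/locations, and captions into one string. It is not the paper's implementation: the names `DetectedObject` and `build_visual_clues`, the prompt layout, and the closing instruction line are all assumptions made for this example, and the inputs are presumed to come from some vision model that is not shown.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DetectedObject:
    """Hypothetical container for one detected object (not from the paper)."""
    name: str
    attributes: List[str]
    box: Tuple[int, int, int, int]  # assumed (x1, y1, x2, y2) pixel coordinates


def build_visual_clues(tags: List[str],
                       objects: List[DetectedObject],
                       captions: List[str]) -> str:
    """Assemble tags, objects, and captions into a single textual prompt."""
    lines = ["Image tags: " + ", ".join(tags), "Objects:"]
    for obj in objects:
        # Put attributes in front of the object name, e.g. "brown dog".
        label = " ".join(obj.attributes + [obj.name])
        lines.append(f"- {label} at {obj.box}")
    lines.append("Captions:")
    lines.extend(f"- {caption}" for caption in captions)
    # Assumed instruction line for a downstream language model.
    lines.append("Describe the image in a detailed paragraph:")
    return "\n".join(lines)


prompt = build_visual_clues(
    tags=["dog", "park"],
    objects=[DetectedObject("dog", ["brown"], (10, 20, 110, 220))],
    captions=["a dog runs on grass"],
)
print(prompt)
```

The resulting string would then be passed to a language model; how the clues are filtered or ranked before prompting is left out of this sketch.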

Interactive Key-Value Memory-augmented Attention for Image Paragraph Captioning

no code yet • COLING 2020

In this paper, we propose an Interactive key-value Memory-augmented Attention model for image Paragraph captioning (IMAP) to keep track of the attention history (salient objects coverage information) along with the update-chain of the decoder state and therefore avoid generating repetitive or incomplete image descriptions.
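The idea of tracking attention history can be illustrated with a generic coverage mechanism. The sketch below is not the IMAP key-value memory described in the paper; it is a simplified stand-in, assuming that each region's accumulated attention is subtracted from its raw score before the softmax. The function names and the `penalty` parameter are hypothetical.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_with_coverage(scores: List[float],
                         coverage: List[float],
                         penalty: float = 1.0) -> Tuple[List[float], List[float]]:
    """Attend over regions while discounting those already attended to.

    `coverage` holds the attention mass each region has received so far.
    """
    adjusted = [s - penalty * c for s, c in zip(scores, coverage)]
    weights = softmax(adjusted)
    # Accumulate this step's attention into the running history.
    new_coverage = [c + w for c, w in zip(coverage, weights)]
    return weights, new_coverage


region_scores = [2.0, 1.0, 0.5]          # toy relevance scores for three regions
history = [0.0, 0.0, 0.0]                # no region has been attended to yet
step1, history = attend_with_coverage(region_scores, history)
step2, history = attend_with_coverage(region_scores, history)
print(step1, step2)
```

With identical raw scores at both steps, the most salient region receives less weight the second time, which is the behavior a coverage signal is meant to encourage.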

Hierarchical Scene Graph Encoder-Decoder for Image Paragraph Captioning

no code yet • ACM International Conference on Multimedia 2020

We propose irredundant attention in SSG-RNN to improve the possibility of abstracting topics from rarely described sub-graphs and inheriting attention in WSG-RNN to generate more grounded sentences with the abstracted topics, both of which give rise to more distinctive paragraphs.

Improving Diversity and Reducing Redundancy in Paragraph Captions

no code yet • International Joint Conference on Neural Networks (IJCNN) 2020

The paragraphs generated from standard image captioning models lack language diversity and contain redundant information.

Dual-CNN: A Convolutional language decoder for paragraph image captioning

no code yet • Neurocomputing 2020

The task of paragraph image captioning aims to generate a coherent paragraph describing a given image.

Convolutional Auto-encoding of Sentence Topics for Image Paragraph Generation

no code yet • 1 Aug 2019

A valid question is how to encapsulate such gists/topics that are worthy of mention from an image, and then describe the image from one topic to another but holistically with a coherent structure.

Look Deeper See Richer: Depth-aware Image Paragraph Captioning

no code yet • ACM International Conference on Multimedia 2018

Existing image paragraph captioning methods give a series of sentences to represent the objects and regions of interest, where the descriptions are essentially generated by feeding the image fragments containing objects and regions into conventional image single-sentence captioning models.

Diverse and Coherent Paragraph Generation from Images

no code yet • ECCV 2018

Paragraph generation from images, which has gained popularity recently, is an important task for video summarization, editing, and support of the disabled.