Image Paragraph Captioning

5 papers with code • 1 benchmark • 1 dataset

Image paragraph captioning involves generating a detailed, multi-sentence description of the content of an image.

Benchmarks

This leaderboard tracks progress in Image Paragraph Captioning.

Dataset	Best Model
Image Paragraph Captioning	HSGED(SLL)

Datasets

Image Paragraph Captioning

Most implemented papers

A Hierarchical Approach for Generating Descriptive Image Paragraphs

chenxinpeng/im2p • CVPR 2017

Recent progress on image captioning has made it possible to generate novel sentences describing images in natural language, but compressing an image into a single sentence can describe visual content in only coarse detail.

Training for Diversity in Image Paragraph Captioning

lukemelas/image-paragraph-captioning • EMNLP 2018

Image paragraph captioning models aim to produce detailed descriptions of a source image.

Context-Aware Visual Policy Network for Fine-Grained Image Captioning

daqingliu/CAVP • 6 Jun 2019

With the maturity of visual detection techniques, we are more ambitious in describing visual content with open-vocabulary, fine-grained and free-form language, i.e., the task of image captioning.

Matching Visual Features to Hierarchical Semantic Topics for Image Paragraph Captioning

dandanguo1993/vtcm-based-image-paragraph-caption • 10 May 2021

Inspired by recent successes in integrating semantic topics into this task, this paper develops a plug-and-play hierarchical-topic-guided image paragraph generation framework, which couples a visual extractor with a deep topic model to guide the learning of a language model.

VLIS: Unimodal Language Models Guide Multimodal Language Generation

jiwanchung/vlis • 15 Oct 2023

Multimodal language generation, which leverages the synergy of language and vision, is a rapidly expanding field.
