Image Captioning

152 papers with code · Computer Vision

Leaderboards

No evaluation results yet. Help compare methods by submitting evaluation metrics.

Greatest papers with code

Can Active Memory Replace Attention?

NeurIPS 2016 tensorflow/models

Several mechanisms to focus attention of a neural network on selected parts of its input or memory have been used successfully in deep learning models in recent years.

IMAGE CAPTIONING · MACHINE TRANSLATION

Show and Tell: Lessons learned from the 2015 MSCOCO Image Captioning Challenge

21 Sep 2016 tensorflow/models

Automatically describing the content of an image is a fundamental problem in artificial intelligence that connects computer vision and natural language processing.

IMAGE CAPTIONING

One Model To Learn Them All

16 Jun 2017 tensorflow/tensor2tensor

We present a single model that yields good results on a number of problems spanning multiple domains.

IMAGE CAPTIONING · IMAGE CLASSIFICATION · MULTI-TASK LEARNING

Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models

7 Oct 2016 facebookresearch/fairseq-py

We observe that our method consistently outperforms beam search (BS) and previously proposed techniques for diverse decoding from neural sequence models.

IMAGE CAPTIONING · MACHINE TRANSLATION · QUESTION GENERATION · TEXT GENERATION · TIME SERIES

MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition

27 Jul 2016 deepinsight/insightface

In this paper, we design a benchmark task and provide the associated datasets for recognizing face images and link them to corresponding entity keys in a knowledge base.

FACE RECOGNITION · IMAGE CAPTIONING

Show and Tell: A Neural Image Caption Generator

CVPR 2015 karpathy/neuraltalk

Experiments on several datasets show the accuracy of the model and the fluency of the language it learns solely from image descriptions.

IMAGE CAPTIONING · TEXT GENERATION

Deep Visual-Semantic Alignments for Generating Image Descriptions

CVPR 2015 karpathy/neuraltalk2

Our approach leverages datasets of images and their sentence descriptions to learn about the inter-modal correspondences between language and visual data.

IMAGE CAPTIONING

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

CVPR 2018 facebookresearch/pythia

Top-down visual attention mechanisms have been used extensively in image captioning and visual question answering (VQA) to enable deeper image understanding through fine-grained analysis and even multiple steps of reasoning.

IMAGE CAPTIONING · VISUAL QUESTION ANSWERING

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

10 Feb 2015 sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning

Inspired by recent work in machine translation and object detection, we introduce an attention based model that automatically learns to describe the content of images.

IMAGE CAPTIONING

Grad-CAM: Why did you say that?

22 Nov 2016 ramprs/grad-cam

We propose a technique for making Convolutional Neural Network (CNN)-based models more transparent by visualizing input regions that are 'important' for predictions -- or visual explanations.

IMAGE CAPTIONING · VISUAL QUESTION ANSWERING