Caption Generation

116 papers with code • 1 benchmark • 2 datasets

Most implemented papers

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

sgrvinod/a-PyTorch-Tutorial-to-Image-Captioning 10 Feb 2015

Inspired by recent work in machine translation and object detection, we introduce an attention-based model that automatically learns to describe the content of images.

Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks

adityac94/Grad_CAM_plus_plus 30 Oct 2017

Over the last decade, Convolutional Neural Network (CNN) models have been highly successful in solving complex vision problems.

Recurrent Neural Network Regularization

wojzaremba/lstm 8 Sep 2014

We present a simple regularization technique for Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units.

Microsoft COCO Captions: Data Collection and Evaluation Server

tylin/coco-caption 1 Apr 2015

In this paper we describe the Microsoft COCO Caption dataset and evaluation server.

Where to put the Image in an Image Caption Generator

mtanti/where-image2 27 Mar 2017

When a recurrent neural network language model is used for caption generation, the image information can be fed to the neural network either by directly incorporating it in the RNN (conditioning the language model by 'injecting' image features) or in a layer following the RNN (conditioning the language model by 'merging' image features).

Scalable Bayesian Optimization Using Deep Neural Networks

automl/pybnn 19 Feb 2015

Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations.

Sequence to Sequence -- Video to Text

nasib-ullah/video-captioning-models-in-Pytorch 3 May 2015

Our LSTM model is trained on video-sentence pairs and learns to associate a sequence of video frames to a sequence of words in order to generate a description of the event in the video clip.

An Actor-Critic Algorithm for Sequence Prediction

rizar/actor-critic-public 24 Jul 2016

We present an approach to training neural networks to generate sequences using actor-critic methods from reinforcement learning (RL).

Deep Reinforcement Learning For Sequence to Sequence Models

yaserkl/RLSeq2Seq 24 May 2018

In this survey, we consider seq2seq problems from the RL point of view and provide a formulation combining the power of RL methods in decision-making with sequence-to-sequence models that enable remembering long-term memories.

Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts

google-research-datasets/conceptual-12m CVPR 2021

The availability of large-scale image captioning and visual question answering datasets has contributed significantly to recent successes in vision-and-language pre-training.