Self-critical Sequence Training for Image Captioning

CVPR 2017 Steven J. RennieEtienne MarcheretYoussef MrouehJarret RossVaibhava Goel

Recently it has been shown that policy-gradient methods for reinforcement learning can be utilized to train deep end-to-end systems directly on non-differentiable metrics for the task at hand. In this paper we consider the problem of optimizing image captioning systems using reinforcement learning, and show that by carefully optimizing our systems using the test metrics of the MSCOCO task, significant gains in performance can be realized... (read more)

PDF Abstract

Evaluation results from the paper

  Submit results from this paper to get state-of-the-art GitHub badges and help community compare results to other papers.