simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions

EMNLP 2018 Fenglin LiuXuancheng RenYuanxin LiuHoufeng WangXu Sun

The encode-decoder framework has shown recent success in image captioning. Visual attention, which is good at detailedness, and semantic attention, which is good at comprehensiveness, have been separately proposed to ground the caption on the image... (read more)

PDF Abstract EMNLP 2018 PDF EMNLP 2018 Abstract

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods used in the Paper


METHOD TYPE
🤖 No Methods Found Help the community by adding them if they're not listed; e.g. Deep Residual Learning for Image Recognition uses ResNet