A Visual Attention Grounding Neural Model for Multimodal Machine Translation

EMNLP 2018 · Mingyang Zhou, Runxiang Cheng, Yong Jae Lee, Zhou Yu

We introduce a novel multimodal machine translation model that utilizes parallel visual and textual information. Our model jointly optimizes the learning of a shared visual-language embedding and a translator...
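
The abstract's core idea, jointly optimizing a translator together with a shared visual-language embedding, can be illustrated with a minimal sketch. The code below is not the authors' released implementation: all module names, dimensions, the margin value, and the 0.5 loss weight are illustrative assumptions. It shows one plausible form of the joint objective, a translation cross-entropy loss combined with a max-margin image-sentence alignment loss over a shared embedding space:

```python
# Minimal sketch (assumptions throughout, not the paper's code): joint training
# of a translator and a shared visual-language embedding via a weighted loss sum.
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointModel(nn.Module):
    def __init__(self, vocab_src=1000, vocab_tgt=1000, d=256, img_feat=2048):
        super().__init__()
        self.src_emb = nn.Embedding(vocab_src, d)
        self.encoder = nn.GRU(d, d, batch_first=True)
        self.tgt_emb = nn.Embedding(vocab_tgt, d)
        self.decoder = nn.GRU(d, d, batch_first=True)
        self.out = nn.Linear(d, vocab_tgt)
        # Projects image features into the same space as the sentence embedding.
        self.img_proj = nn.Linear(img_feat, d)

    def forward(self, src, tgt_in, img):
        enc, h = self.encoder(self.src_emb(src))        # encode source sentence
        dec, _ = self.decoder(self.tgt_emb(tgt_in), h)  # decoder conditioned on source
        logits = self.out(dec)                          # per-token translation logits
        txt_vec = F.normalize(enc.mean(dim=1), dim=-1)  # pooled sentence embedding
        img_vec = F.normalize(self.img_proj(img), dim=-1)
        return logits, txt_vec, img_vec

def margin_alignment_loss(txt, img, margin=0.1):
    """Max-margin loss pulling matched image/sentence pairs together
    and pushing mismatched in-batch pairs at least `margin` apart."""
    sim = txt @ img.t()            # cosine similarities, shape (B, B)
    pos = sim.diag().unsqueeze(1)  # matched pairs sit on the diagonal
    cost = (margin + sim - pos).clamp(min=0)
    cost.fill_diagonal_(0)         # no penalty for the positive pair itself
    return cost.mean()

model = JointModel()
src = torch.randint(0, 1000, (4, 7))     # toy batch of source token ids
tgt_in = torch.randint(0, 1000, (4, 6))  # decoder inputs (shifted targets)
tgt_out = torch.randint(0, 1000, (4, 6)) # gold target tokens
img = torch.randn(4, 2048)               # precomputed image features

logits, txt_vec, img_vec = model(src, tgt_in, img)
mt_loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)), tgt_out.reshape(-1))
loss = mt_loss + 0.5 * margin_alignment_loss(txt_vec, img_vec)  # 0.5 is an assumed weight
loss.backward()
```

Because both losses backpropagate into the shared encoder, the sentence representation used for translation is also pulled toward the image representation, which is the general motivation for grounding translation in visual context.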

