Multimodal machine translation is the task of performing machine translation with multiple data sources, for example translating the sentence "a bird is flying over water" together with an image of a bird over water into German text.
(Image credit: Findings of the Third Shared Task on Multimodal Machine Translation)
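To make the task definition above concrete, here is a minimal sketch of a multimodal translation model in PyTorch: a source sentence and a pooled image feature vector are fused and decoded into the target language. All layer sizes, the additive fusion, and the class name TinyMultimodalNMT are illustrative assumptions, not the architecture of any specific submitted system.

```python
# Hedged sketch: text encoder + projected image feature -> target decoder.
import torch
import torch.nn as nn

class TinyMultimodalNMT(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, d_model=256, d_img=2048):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, d_model)
        self.encoder = nn.GRU(d_model, d_model, batch_first=True)
        self.img_proj = nn.Linear(d_img, d_model)   # project a pooled CNN image feature
        self.tgt_emb = nn.Embedding(tgt_vocab, d_model)
        self.decoder = nn.GRU(d_model, d_model, batch_first=True)
        self.out = nn.Linear(d_model, tgt_vocab)

    def forward(self, src_ids, img_feat, tgt_ids):
        # Encode the English sentence.
        _, h_text = self.encoder(self.src_emb(src_ids))           # (1, B, d)
        # Fuse the textual summary with the projected image feature.
        h0 = h_text + self.img_proj(img_feat).unsqueeze(0)        # (1, B, d)
        # Decode the target sentence conditioned on the fused state.
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), h0)
        return self.out(dec_out)                                  # (B, T, V)

# Example: a batch of 2 sentences with 2048-d pooled image vectors.
model = TinyMultimodalNMT(src_vocab=1000, tgt_vocab=1200)
logits = model(torch.randint(0, 1000, (2, 7)),
               torch.randn(2, 2048),
               torch.randint(0, 1200, (2, 9)))
print(logits.shape)  # torch.Size([2, 9, 1200])
```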
nmtpy has been used for LIUM's top-ranked submissions to the WMT Multimodal Machine Translation and News Translation tasks in 2016 and 2017.
This paper presents the systems developed by LIUM and CVC for the WMT16 Multimodal Machine Translation challenge.
In this task, a source sentence in English is supplemented by an image, and participating systems are required to translate the sentence into German, French or Czech.
In particular, we represent the input image with global and regional visual features and introduce two parallel DCCNs to model multimodal context vectors with visual features at different granularities.
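The following hedged sketch illustrates the idea of combining global and regional image features through two parallel context modules. Plain dot-product attention and a gating layer stand in for the paper's DCCNs; the module names RegionalContext and GlobalContext and all dimensions are assumptions for illustration only.

```python
# Hedged sketch: two parallel context modules over regional and global image features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RegionalContext(nn.Module):
    """Attend from a query state over region-level features (e.g. 36 x 2048)."""
    def __init__(self, d_model=256, d_img=2048):
        super().__init__()
        self.key = nn.Linear(d_img, d_model)
        self.val = nn.Linear(d_img, d_model)

    def forward(self, query, regions):             # query: (B, d), regions: (B, R, d_img)
        scores = torch.einsum('bd,brd->br', query, self.key(regions))
        attn = F.softmax(scores / query.size(-1) ** 0.5, dim=-1)
        return torch.einsum('br,brd->bd', attn, self.val(regions))

class GlobalContext(nn.Module):
    """Gate a single pooled image vector by the query state."""
    def __init__(self, d_model=256, d_img=2048):
        super().__init__()
        self.proj = nn.Linear(d_img, d_model)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, query, global_feat):          # global_feat: (B, d_img)
        g = self.proj(global_feat)
        return torch.sigmoid(self.gate(torch.cat([query, g], -1))) * g

# The two context vectors, computed in parallel, are summed into one
# multimodal context that a decoder could consume at each step.
B, d = 2, 256
query = torch.randn(B, d)
ctx = RegionalContext()(query, torch.randn(B, 36, 2048)) + \
      GlobalContext()(query, torch.randn(B, 2048))
print(ctx.shape)  # torch.Size([2, 256])
```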
We introduce the Multi30K dataset to stimulate multilingual multimodal research.
In this work, we propose to model the interaction between visual and textual features for multi-modal neural machine translation (MMT) through a latent variable model.
Ranked #4 on Multimodal Machine Translation on Multi30K
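To make the latent-variable idea described above more concrete, here is a hedged sketch in which a Gaussian latent code z is inferred from fused text and image representations and can then condition the decoder. The Gaussian choice, layer sizes, and the concatenation-based inference network are illustrative assumptions rather than the published model.

```python
# Hedged sketch: a latent variable mediating the visual-textual interaction.
import torch
import torch.nn as nn

class LatentFusion(nn.Module):
    def __init__(self, d_text=256, d_img=2048, d_z=64):
        super().__init__()
        self.infer = nn.Sequential(nn.Linear(d_text + d_img, 256), nn.Tanh())
        self.mu = nn.Linear(256, d_z)
        self.logvar = nn.Linear(256, d_z)

    def forward(self, h_text, v_img):
        h = self.infer(torch.cat([h_text, v_img], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterisation trick: sample z while keeping gradients.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        # KL term against a standard normal prior, added to the translation loss.
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1)
        return z, kl.mean()

z, kl = LatentFusion()(torch.randn(2, 256), torch.randn(2, 2048))
print(z.shape, kl.item())  # torch.Size([2, 64]) and a scalar KL value
```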
The model leverages a visual attention grounding mechanism that links the visual semantics with the corresponding textual semantics.
Ranked #7 on Multimodal Machine Translation on Multi30K
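A hedged sketch of the grounding idea described above: sentence and image representations are projected into a shared space and pulled together with a max-margin ranking loss, linking textual semantics to visual semantics. The margin value, projection sizes, and the class name GroundingLoss are assumptions for illustration.

```python
# Hedged sketch: a visual-textual grounding objective in a shared embedding space.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroundingLoss(nn.Module):
    def __init__(self, d_text=256, d_img=2048, d_shared=128, margin=0.1):
        super().__init__()
        self.txt = nn.Linear(d_text, d_shared)
        self.img = nn.Linear(d_img, d_shared)
        self.margin = margin

    def forward(self, sent_repr, img_feat):
        t = F.normalize(self.txt(sent_repr), dim=-1)   # (B, d_shared)
        v = F.normalize(self.img(img_feat), dim=-1)    # (B, d_shared)
        sim = t @ v.t()                                # cosine similarities
        pos = sim.diag().unsqueeze(1)                  # matching sentence-image pairs
        # Push matched pairs above mismatched ones by a margin; ignore the diagonal.
        cost = F.relu(self.margin + sim - pos)
        cost = cost * (1 - torch.eye(sim.size(0), device=sim.device))
        return cost.mean()

loss = GroundingLoss()(torch.randn(4, 256), torch.randn(4, 2048))
print(loss.item())
```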
Previous work on multimodal machine translation has shown that visual information is only needed in very specific cases, for example in the presence of ambiguous words where the textual context is not sufficient.
Ranked #1 on Multimodal Machine Translation on Multi30K (BLEU EN-FR metric)
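One simple way to picture the observation above, that visual information is only needed in specific cases, is a learned gate that decides per example how much visual context to mix in, so the image can be ignored when the text alone is unambiguous. The gate formulation and sizes below are illustrative assumptions, not a specific published model.

```python
# Hedged sketch: a scalar gate modulating the visual contribution per example.
import torch
import torch.nn as nn

class GatedVisualContext(nn.Module):
    def __init__(self, d_model=256, d_img=2048):
        super().__init__()
        self.proj = nn.Linear(d_img, d_model)
        self.gate = nn.Linear(2 * d_model, 1)

    def forward(self, h_text, img_feat):
        v = self.proj(img_feat)                                   # (B, d)
        g = torch.sigmoid(self.gate(torch.cat([h_text, v], -1)))  # (B, 1), in [0, 1]
        return h_text + g * v                                     # g near 0 => text-only

fused = GatedVisualContext()(torch.randn(2, 256), torch.randn(2, 2048))
print(fused.shape)  # torch.Size([2, 256])
```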
This paper describes the UMONS solution for the Multimodal Machine Translation task presented at the Third Conference on Machine Translation (WMT18).
Multimodal machine translation is an attractive application of neural machine translation (NMT).