2 code implementations • 18 Jul 2024 • Matthieu Futeral, Cordelia Schmid, Benoît Sagot, Rachel Bawden
Current multimodal machine translation (MMT) systems rely on fully supervised data (i.e., models are trained on sentences with their translations and accompanying images).
no code implementations • 13 Jun 2024 • Matthieu Futeral, Armel Zebaze, Pedro Ortiz Suarez, Julien Abadji, Rémi Lacroix, Cordelia Schmid, Rachel Bawden, Benoît Sagot
We additionally train two types of multilingual model to prove the benefits of mOSCAR: (1) a model trained on a subset of mOSCAR and captioning data and (2) a model trained on captioning data only.
no code implementations • 16 Apr 2024 • Matthieu Futeral, Andrea Agostinelli, Marco Tagliasacchi, Neil Zeghidour, Eugene Kharitonov
Using these datasets, we demonstrate that our proposed metrics achieve a stronger agreement with the ground-truth diversity than baselines.
2 code implementations • 20 Dec 2022 • Matthieu Futeral, Cordelia Schmid, Ivan Laptev, Benoît Sagot, Rachel Bawden
One of the major challenges of machine translation (MT) is ambiguity, which can in some cases be resolved by accompanying context such as images.
no code implementations • ACL 2020 • Djamé Seddah, Farah Essaidi, Amal Fethi, Matthieu Futeral, Benjamin Muller, Pedro Javier Ortiz Suárez, Benoît Sagot, Abhishek Srivastava
We introduce the first treebank for a romanized user-generated content variety of Algerian, a North-African Arabic dialect known for its frequent usage of code-switching.