Search Results for author: Guillaume Wenzek

Found 10 papers, 5 papers with code

Generating Fact Checking Briefs

no code implementations EMNLP 2020 Angela Fan, Aleksandra Piktus, Fabio Petroni, Guillaume Wenzek, Marzieh Saeidi, Andreas Vlachos, Antoine Bordes, Sebastian Riedel

Fact checking at scale is difficult -- while the number of active fact checking websites is growing, it remains too small for the needs of the contemporary media ecosystem.

Fact Checking Question Answering

Beyond English-Centric Multilingual Machine Translation

4 code implementations21 Oct 2020 Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin

Existing work in translation demonstrated the potential of massively multilingual machine translation by training a single model able to translate between any pair of languages.

Machine Translation Translation

CCMatrix: Mining Billions of High-Quality Parallel Sentences on the WEB

3 code implementations ACL 2021 Holger Schwenk, Guillaume Wenzek, Sergey Edunov, Edouard Grave, Armand Joulin

To evaluate the quality of the mined bitexts, we train NMT systems for most of the language pairs and evaluate them on TED, WMT and WAT test sets.

Translation

Unsupervised Cross-lingual Representation Learning at Scale

24 code implementations ACL 2020 Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov

We also present a detailed empirical analysis of the key factors that are required to achieve these gains, including the trade-offs between (1) positive transfer and capacity dilution and (2) the performance of high and low resource languages at scale.

Cross-Lingual Transfer Language Modelling +2

Trans-gram, Fast Cross-lingual Word-embeddings

no code implementations EMNLP 2015 Jocelyn Coulmance, Jean-Marc Marty, Guillaume Wenzek, Amine Benhalloum

We introduce Trans-gram, a simple and computationally-efficient method to simultaneously learn and align wordembeddings for a variety of languages, using only monolingual data and a smaller set of sentence-aligned data.

Cross-Lingual Word Embeddings General Classification +4

Cannot find the paper you are looking for? You can Submit a new open access paper.