Search Results for author: Alexis Conneau

Found 31 papers, 18 papers with code

XTREME-S: Evaluating Cross-lingual Speech Representations

no code implementations • 21 Mar 2022 • Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson

Covering 102 languages from 10+ language families, 3 different domains and 4 task families, XTREME-S aims to simplify multilingual speech representation evaluation, as well as catalyze research in "universal" speech representation learning.

Representation Learning · Speech Recognition +2
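
The benchmark's tasks are distributed through the Hugging Face hub, so a single task can be loaded in a few lines. In this sketch the dataset id "google/xtreme_s" and the config "fleurs.af_za" are assumptions based on the public release; check the hub for the exact names.

from datasets import load_dataset

# Load one XTREME-S task (FLEURS speech recognition for Afrikaans).
# Dataset id and config name are assumed, not taken from the paper.
fleurs = load_dataset("google/xtreme_s", "fleurs.af_za", split="train")
sample = fleurs[0]
print(sample["audio"]["sampling_rate"])  # raw audio arrays plus labels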

mSLAM: Massively multilingual joint pre-training for speech and text

no code implementations • 3 Feb 2022 • Ankur Bapna, Colin Cherry, Yu Zhang, Ye Jia, Melvin Johnson, Yong Cheng, Simran Khanuja, Jason Riesa, Alexis Conneau

We present mSLAM, a multilingual Speech and LAnguage Model that learns cross-lingual cross-modal representations of speech and text by pre-training jointly on large amounts of unlabeled speech and text in multiple languages.

Intent Classification · Language Modelling +1

Multilingual Speech Translation from Efficient Finetuning of Pretrained Models

no code implementations • ACL 2021 • Xian Li, Changhan Wang, Yun Tang, Chau Tran, Yuqing Tang, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli

We present a simple yet effective approach to build multilingual speech-to-text (ST) translation through efficient transfer learning from a pretrained speech encoder and text decoder.

Text Generation · Transfer Learning +1
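
For readers who want to try this recipe, a minimal sketch using Hugging Face's SpeechEncoderDecoderModel follows. The checkpoint names are illustrative stand-ins, not the authors' exact setup, which is implemented in fairseq.

from transformers import SpeechEncoderDecoderModel

# Couple a pretrained speech encoder with a pretrained multilingual text
# decoder; only the cross-attention bridging the two is newly initialized,
# which is what keeps the finetuning step cheap.
model = SpeechEncoderDecoderModel.from_encoder_decoder_pretrained(
    "facebook/wav2vec2-large-xlsr-53",  # pretrained speech encoder
    "facebook/mbart-large-50",          # pretrained text decoder
)
# The combined model can now be finetuned on speech-translation pairs.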

Unsupervised Speech Recognition

2 code implementations • NeurIPS 2021 • Alexei Baevski, Wei-Ning Hsu, Alexis Conneau, Michael Auli

Despite rapid progress in the recent past, current speech recognition systems still require labeled training data which limits this technology to a small fraction of the languages spoken around the globe.

Speech Recognition · Unsupervised Speech Recognition

Larger-Scale Transformers for Multilingual Masked Language Modeling

no code implementations • ACL (RepL4NLP) 2021 • Naman Goyal, Jingfei Du, Myle Ott, Giri Anantharaman, Alexis Conneau

Our model also outperforms the RoBERTa-Large model on several English tasks of the GLUE benchmark by 0.3% on average while handling 99 more languages.

Language Modelling · Masked Language Modeling

Large-Scale Self- and Semi-Supervised Learning for Speech Translation

no code implementations • 14 Apr 2021 • Changhan Wang, Anne Wu, Juan Pino, Alexei Baevski, Michael Auli, Alexis Conneau

In this paper, we improve speech translation (ST) through effectively leveraging large quantities of unlabeled speech and text data in different and complementary ways.

Language Modelling · Translation

Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning

1 code implementation • ICLR 2021 • Beliz Gunel, Jingfei Du, Alexis Conneau, Ves Stoyanov

Our proposed fine-tuning objective leads to models that are more robust to different levels of noise in the fine-tuning training data, and can generalize better to related tasks with limited labeled data.

Contrastive Learning · Data Augmentation +3
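
A minimal PyTorch sketch of a supervised contrastive term of the kind the paper adds to cross-entropy follows; the temperature value and the use of pooled sentence vectors are assumptions rather than the authors' exact code.

import torch
import torch.nn.functional as F

def supervised_contrastive_loss(embeddings, labels, temperature=0.3):
    """embeddings: (batch, dim) pooled sentence vectors; labels: (batch,)."""
    z = F.normalize(embeddings, dim=1)
    sim = z @ z.t() / temperature                    # pairwise similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, float("-inf"))  # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # Positives are the other in-batch examples sharing the anchor's label.
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)    # avoid divide-by-zero
    loss = -log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / pos_counts
    return loss.mean()

In the paper this term is combined with the usual cross-entropy loss through a weighting hyperparameter.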

Multilingual Speech Translation with Efficient Finetuning of Pretrained Models

no code implementations • 24 Oct 2020 • Xian Li, Changhan Wang, Yun Tang, Chau Tran, Yuqing Tang, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli

We present a simple yet effective approach to build multilingual speech-to-text (ST) translation by efficient transfer learning from a pretrained speech encoder and text decoder.

Cross-Lingual Transfer · Text Generation +2

Unsupervised Cross-lingual Representation Learning for Speech Recognition

4 code implementations • 24 Jun 2020 • Alexis Conneau, Alexei Baevski, Ronan Collobert, Abdel-rahman Mohamed, Michael Auli

This paper presents XLSR, which learns cross-lingual speech representations by pretraining a single model from the raw waveform of speech in multiple languages.

Quantization · Representation Learning +1
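
The released multilingual checkpoint can be used directly as a feature extractor; a minimal sketch via Hugging Face transformers, using the public mirror of the fairseq checkpoint:

import numpy as np
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

name = "facebook/wav2vec2-large-xlsr-53"
extractor = Wav2Vec2FeatureExtractor.from_pretrained(name)
model = Wav2Vec2Model.from_pretrained(name)

waveform = np.random.randn(16000)  # placeholder for 1 s of 16 kHz audio
inputs = extractor(waveform, sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    features = model(**inputs).last_hidden_state  # (1, frames, 1024)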

Unsupervised Cross-lingual Representation Learning at Scale

24 code implementations • ACL 2020 • Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov

We also present a detailed empirical analysis of the key factors that are required to achieve these gains, including the trade-offs between (1) positive transfer and capacity dilution and (2) the performance of high and low resource languages at scale.

Cross-Lingual Transfer · Language Modelling +2
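
The XLM-R checkpoints described here are available through Hugging Face transformers; a minimal usage sketch with the public "xlm-roberta-base" checkpoint:

from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModel.from_pretrained("xlm-roberta-base")

# One shared encoder and vocabulary covers all 100 pretraining languages.
batch = tokenizer(["Hello world", "Bonjour le monde"],
                  padding=True, return_tensors="pt")
embeddings = model(**batch).last_hidden_state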

Emerging Cross-lingual Structure in Pretrained Language Models

no code implementations • ACL 2020 • Shijie Wu, Alexis Conneau, Haoran Li, Luke Zettlemoyer, Veselin Stoyanov

We study the problem of multilingual masked language modeling, i.e., the training of a single model on concatenated text from multiple languages, and present a detailed study of several factors that influence why these models are so effective for cross-lingual transfer.

Cross-Lingual Transfer · Language Modelling +3

Cross-lingual Language Model Pretraining

13 code implementations • NeurIPS 2019 • Guillaume Lample, Alexis Conneau

On unsupervised machine translation, we obtain 34.3 BLEU on WMT'16 German-English, improving the previous state of the art by more than 9 BLEU.

Language Modelling · Natural Language Understanding +2
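
The paper's translation language modeling (TLM) objective extends masked language modeling to parallel data: a translation pair is concatenated into one sequence and tokens are masked on both sides, so the model can attend to either language to fill the gaps. A toy sketch, ignoring the paper's language and position embedding details:

import random

MASK, MASK_PROB = "<mask>", 0.15

def make_tlm_example(src_tokens, tgt_tokens):
    tokens = src_tokens + ["</s>"] + tgt_tokens  # concatenated pair
    inputs, targets = [], []
    for tok in tokens:
        if tok != "</s>" and random.random() < MASK_PROB:
            inputs.append(MASK)
            targets.append(tok)    # predicted at this position
        else:
            inputs.append(tok)
            targets.append(None)   # ignored by the loss
    return inputs, targets

x, y = make_tlm_example("the cat sleeps".split(), "le chat dort".split())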

What you can cram into a single vector: Probing sentence embeddings for linguistic properties

5 code implementations • 3 May 2018 • Alexis Conneau, German Kruszewski, Guillaume Lample, Loïc Barrault, Marco Baroni

Although much effort has recently been devoted to training high-quality sentence embeddings, we still have a poor understanding of what they are capturing.

General Classification · Sentence Classification +1
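
A probing task in this sense is just a simple classifier trained to read one linguistic property off fixed sentence embeddings. A self-contained toy sketch, with random placeholder data standing in for a real encoder:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 256))    # placeholder sentence vectors
length_bins = rng.integers(0, 6, size=1000)  # placeholder length labels

X_tr, X_te, y_tr, y_te = train_test_split(embeddings, length_bins, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("probe accuracy:", probe.score(X_te, y_te))  # chance is about 1/6 here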

Phrase-Based & Neural Unsupervised Machine Translation

15 code implementations • EMNLP 2018 • Guillaume Lample, Myle Ott, Alexis Conneau, Ludovic Denoyer, Marc'Aurelio Ranzato

Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of language pairs.

Translation · Unsupervised Machine Translation

Word Translation Without Parallel Data

17 code implementations • ICLR 2018 • Alexis Conneau, Guillaume Lample, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou

We finally describe experiments on the English-Esperanto low-resource language pair, for which only a limited amount of parallel data exists, to show the potential impact of our method in fully unsupervised machine translation.

Cross-Lingual Word Embeddings · Translation +4
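
The method alternates adversarial alignment with a closed-form Procrustes refinement, and the refinement step is compact enough to sketch: given source and target embeddings for a seed dictionary (as columns of X and Y), the orthogonal map W minimizing ||WX - Y|| is W = UV^T, where U S V^T is the SVD of Y X^T. Dimensions and data below are placeholders.

import numpy as np

def procrustes(X, Y):
    """X, Y: (n_pairs, dim) source and target embeddings, row-aligned."""
    U, _, Vt = np.linalg.svd(Y.T @ X)
    return U @ Vt  # orthogonal map: apply as W @ x to a source vector

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 300))
Y = rng.normal(size=(5000, 300))
W = procrustes(X, Y)
assert np.allclose(W @ W.T, np.eye(300), atol=1e-6)  # W is orthogonal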

Learning Visually Grounded Sentence Representations

no code implementations • NAACL 2018 • Douwe Kiela, Alexis Conneau, Allan Jabri, Maximilian Nickel

We introduce a variety of models, trained on a supervised image captioning corpus to predict the image features for a given caption, to perform sentence representation grounding.

Language Modelling
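
A hedged sketch of this grounding setup: a sentence encoder is trained to regress the image features paired with each caption. The LSTM encoder and all sizes below are placeholders, not the paper's exact architecture.

import torch
import torch.nn as nn

class GroundedSentenceEncoder(nn.Module):
    def __init__(self, vocab_size=30000, dim=512, image_feat_dim=2048):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.LSTM(dim, dim, batch_first=True)
        self.to_image = nn.Linear(dim, image_feat_dim)  # predicts image features

    def forward(self, token_ids):
        _, (h, _) = self.rnn(self.embed(token_ids))
        return self.to_image(h[-1])

model = GroundedSentenceEncoder()
captions = torch.randint(0, 30000, (8, 20))  # a batch of token ids
image_feats = torch.randn(8, 2048)           # e.g. CNN features per image
loss = nn.functional.mse_loss(model(captions), image_feats)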

Meta-Prod2Vec - Product Embeddings Using Side-Information for Recommendation

2 code implementations • 25 Jul 2016 • Flavian Vasile, Elena Smirnova, Alexis Conneau

We propose Meta-Prod2vec, a novel method to compute item similarities for recommendation that leverages existing item metadata.
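
One simple way to approximate the idea: interleave metadata tokens (e.g. category ids) with item ids in the training sequences, so the skip-gram model also learns item-metadata co-occurrence. This is a simplification of the paper's weighted side-information objective; gensim stands in for the authors' implementation, and the session data is made up.

from gensim.models import Word2Vec

sessions = [
    ["cat:shoes", "item:101", "cat:shoes", "item:205", "cat:socks", "item:310"],
    ["cat:socks", "item:310", "cat:shoes", "item:101"],
]
model = Word2Vec(sentences=sessions, vector_size=64, window=3, min_count=1, sg=1)
print(model.wv.most_similar("item:101", topn=3))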
