Search Results for author: Michael Auli

Found 75 papers, 39 papers with code

On-demand compute reduction with stochastic wav2vec 2.0

no code implementations 25 Apr 2022 Apoorv Vyas, Wei-Ning Hsu, Michael Auli, Alexei Baevski

Our results for models pre-trained on 960h Librispeech dataset and fine-tuned on 10h of transcribed data show that using the same stochastic model, we get a smooth trade-off between word error rate (WER) and inference time with only marginal WER degradation compared to the W2V2 and SEW models trained for a specific setting.

Towards End-to-end Unsupervised Speech Recognition

1 code implementation 5 Apr 2022 Alexander H. Liu, Wei-Ning Hsu, Michael Auli, Alexei Baevski

Unsupervised speech recognition has shown great potential to make Automatic Speech Recognition (ASR) systems accessible to every language.

Automatic Speech Recognition Unsupervised Speech Recognition

XTREME-S: Evaluating Cross-lingual Speech Representations

no code implementations 21 Mar 2022 Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson

Covering 102 languages from 10+ language families, 3 different domains and 4 task families, XTREME-S aims to simplify multilingual speech representation evaluation, as well as catalyze research in "universal" speech representation learning.

Representation Learning Speech Recognition +2

Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training

no code implementations 1 Mar 2022 Ramon Sanabria, Wei-Ning Hsu, Alexei Baevski, Michael Auli

In this paper, we present a controlled study to better understand the effect of such factors on the performance of pre-trained representations.

Automatic Speech Recognition

data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language

4 code implementations Preprint 2022 Alexei Baevski, Wei-Ning Hsu, Qiantong Xu, Arun Babu, Jiatao Gu, Michael Auli

While the general idea of self-supervised learning is identical across modalities, the actual algorithms and objectives differ widely because they were developed with a single modality in mind.

Image Classification Linguistic Acceptability +5

Simple and Effective Zero-shot Cross-lingual Phoneme Recognition

2 code implementations 23 Sep 2021 Qiantong Xu, Alexei Baevski, Michael Auli

Recent progress in self-training, self-supervised pretraining and unsupervised learning enabled well-performing speech recognition systems without any labeled data.

Speech Recognition Transfer Learning +1

Discriminative Reranking for Neural Machine Translation

no code implementations ACL 2021 Ann Lee, Michael Auli, Marc'Aurelio Ranzato

Reranking models enable the integration of rich features to select a better output hypothesis within an n-best list or lattice.

Data Augmentation Machine Translation +1

Multilingual Speech Translation from Efficient Finetuning of Pretrained Models

no code implementations ACL 2021 Xian Li, Changhan Wang, Yun Tang, Chau Tran, Yuqing Tang, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli

We present a simple yet effective approach to build multilingual speech-to-text (ST) translation through efficient transfer learning from a pretrained speech encoder and text decoder.

Text Generation Transfer Learning +1

Unsupervised Speech Recognition

3 code implementations NeurIPS 2021 Alexei Baevski, Wei-Ning Hsu, Alexis Conneau, Michael Auli

Despite rapid progress in the recent past, current speech recognition systems still require labeled training data which limits this technology to a small fraction of the languages spoken around the globe.

Speech Recognition Unsupervised Speech Recognition

Large-Scale Self- and Semi-Supervised Learning for Speech Translation

no code implementations 14 Apr 2021 Changhan Wang, Anne Wu, Juan Pino, Alexei Baevski, Michael Auli, Alexis Conneau

In this paper, we improve speech translation (ST) through effectively leveraging large quantities of unlabeled speech and text data in different and complementary ways.

Language Modelling Translation

Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training

2 code implementations 2 Apr 2021 Wei-Ning Hsu, Anuroop Sriram, Alexei Baevski, Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Jacob Kahn, Ann Lee, Ronan Collobert, Gabriel Synnaeve, Michael Auli

On a large-scale competitive setup, we show that pre-training on unlabeled in-domain data reduces the gap between models trained on in-domain and out-of-domain labeled data by 66%-73%.

Self-Supervised Learning

Reservoir Transformers

no code implementations ACL 2021 Sheng Shen, Alexei Baevski, Ari S. Morcos, Kurt Keutzer, Michael Auli, Douwe Kiela

We demonstrate that transformers obtain impressive performance even when some of the layers are randomly initialized and never updated.

Language Modelling Machine Translation +1

Language Models not just for Pre-training: Fast Online Neural Noisy Channel Modeling

1 code implementation WMT (EMNLP) 2020 Shruti Bhosale, Kyra Yee, Sergey Edunov, Michael Auli

Pre-training models on vast quantities of unlabeled data has emerged as an effective approach to improving accuracy on many NLP tasks.

Ranked #1 on Machine Translation on WMT2016 Romanian-English (using extra training data)

Machine Translation Translation

Multilingual Speech Translation with Efficient Finetuning of Pretrained Models

no code implementations 24 Oct 2020 Xian Li, Changhan Wang, Yun Tang, Chau Tran, Yuqing Tang, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli

We present a simple yet effective approach to build multilingual speech-to-text (ST) translation by efficient transfer learning from pretrained speech encoder and text decoder.

Cross-Lingual Transfer Text Generation +2

A Comparison of Discrete Latent Variable Models for Speech Representation Learning

no code implementations 24 Oct 2020 Henry Zhou, Alexei Baevski, Michael Auli

Neural latent variable models enable the discovery of interesting structure in speech audio data.

Representation Learning

Beyond English-Centric Multilingual Machine Translation

4 code implementations 21 Oct 2020 Angela Fan, Shruti Bhosale, Holger Schwenk, Zhiyi Ma, Ahmed El-Kishky, Siddharth Goyal, Mandeep Baines, Onur Celebi, Guillaume Wenzek, Vishrav Chaudhary, Naman Goyal, Tom Birch, Vitaliy Liptchinsky, Sergey Edunov, Edouard Grave, Michael Auli, Armand Joulin

Existing work in translation demonstrated the potential of massively multilingual machine translation by training a single model able to translate between any pair of languages.

Machine Translation Translation

Unsupervised Cross-lingual Representation Learning for Speech Recognition

4 code implementations 24 Jun 2020 Alexis Conneau, Alexei Baevski, Ronan Collobert, Abdel-rahman Mohamed, Michael Auli

This paper presents XLSR which learns cross-lingual speech representations by pretraining a single model from the raw waveform of speech in multiple languages.

Quantization Representation Learning +1

wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations

16 code implementations NeurIPS 2020 Alexei Baevski, Henry Zhou, Abdel-rahman Mohamed, Michael Auli

We show for the first time that learning powerful representations from speech audio alone followed by fine-tuning on transcribed speech can outperform the best semi-supervised methods while being conceptually simpler.

Ranked #1 on Speech Recognition on TIMIT (using extra training data)

Quantization Self-Supervised Learning +1

Robust and On-the-fly Dataset Denoising for Image Classification

no code implementations ECCV 2020 Jiaming Song, Lunjia Hu, Michael Auli, Yann Dauphin, Tengyu Ma

We address this problem by reasoning counterfactually about the loss distribution of examples with uniform random labels had they been trained with the real examples, and use this information to remove noisy examples from the training set.

Classification Denoising +2

Improving Conditioning in Context-Aware Sequence to Sequence Models

no code implementations 21 Nov 2019 Xinyi Wang, Jason Weston, Michael Auli, Yacine Jernite

Neural sequence to sequence models are well established for applications which can be cast as mapping a single input sequence into a single output sequence.

abstractive question answering Data Augmentation +2

Effectiveness of self-supervised pre-training for speech recognition

2 code implementations 10 Nov 2019 Alexei Baevski, Michael Auli, Abdel-rahman Mohamed

We compare self-supervised representation learning algorithms which either explicitly quantize the audio data or learn representations without quantization.

Quantization Representation Learning +1

Depth-Adaptive Transformer

no code implementations ICLR 2020 Maha Elbayad, Jiatao Gu, Edouard Grave, Michael Auli

State of the art sequence-to-sequence models for large scale tasks perform a fixed number of computations for each input sequence regardless of whether it is easy or hard to process.

Machine Translation Translation

vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations

3 code implementations ICLR 2020 Alexei Baevski, Steffen Schneider, Michael Auli

We propose vq-wav2vec to learn discrete representations of audio segments through a wav2vec-style self-supervised context prediction task.

Ranked #2 on Speech Recognition on TIMIT (using extra training data)

General Classification Self-Supervised Learning +1

The Source-Target Domain Mismatch Problem in Machine Translation

no code implementations EACL 2021 Jiajun Shen, Peng-Jen Chen, Matt Le, Junxian He, Jiatao Gu, Myle Ott, Michael Auli, Marc'Aurelio Ranzato

While we live in an increasingly interconnected world, different places still exhibit strikingly different cultures and many events we experience in our everyday life pertain only to the specific place we live in.

Machine Translation Translation

Simple and Effective Noisy Channel Modeling for Neural Machine Translation

1 code implementation IJCNLP 2019 Kyra Yee, Nathan Ng, Yann N. Dauphin, Michael Auli

Previous work on neural noisy channel modeling relied on latent variable models that incrementally process the source and target sentence.

Machine Translation Translation

ELI5: Long Form Question Answering

2 code implementations ACL 2019 Angela Fan, Yacine Jernite, Ethan Perez, David Grangier, Jason Weston, Michael Auli

We introduce the first large-scale corpus for long-form question answering, a task requiring elaborate and in-depth answers to open-ended questions.

Language Modelling Question Answering

GLOSS: Generative Latent Optimization of Sentence Representations

1 code implementation 15 Jul 2019 Sidak Pal Singh, Angela Fan, Michael Auli

Both are trained to reconstruct the sentence based on a latent code and our model can be used to generate text.

Sentence Embedding

Better Generalization with On-the-fly Dataset Denoising

no code implementations ICLR 2019 Jiaming Song, Tengyu Ma, Michael Auli, Yann Dauphin

Memorization in over-parameterized neural networks can severely hurt generalization in the presence of mislabeled examples.


wav2vec: Unsupervised Pre-training for Speech Recognition

5 code implementations 11 Apr 2019 Steffen Schneider, Alexei Baevski, Ronan Collobert, Michael Auli

Our experiments on WSJ reduce WER of a strong character-based log-mel filterbank baseline by up to 36% when only a few hours of transcribed data is available.

Ranked #5 on Speech Recognition on TIMIT (using extra training data)

General Classification Speech Recognition +1

fairseq: A Fast, Extensible Toolkit for Sequence Modeling

5 code implementations NAACL 2019 Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, Michael Auli

fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks.

Language Modelling Text Generation +1

Pre-trained Language Model Representations for Language Generation

1 code implementation NAACL 2019 Sergey Edunov, Alexei Baevski, Michael Auli

Pre-trained language model representations have been successful in a wide range of language understanding tasks.

Abstractive Text Summarization +4

Cloze-driven Pretraining of Self-attention Networks

no code implementations IJCNLP 2019 Alexei Baevski, Sergey Edunov, Yinhan Liu, Luke Zettlemoyer, Michael Auli

We present a new approach for pretraining a bi-directional transformer model that provides significant performance gains across a variety of language understanding problems.

Constituency Parsing Named Entity Recognition +3

Modeling Human Motion with Quaternion-based Neural Networks

1 code implementation 21 Jan 2019 Dario Pavllo, Christoph Feichtenhofer, Michael Auli, David Grangier

Previous work on predicting or generating 3D human pose sequences regresses either joint rotations or joint positions.

Wizard of Wikipedia: Knowledge-Powered Conversational agents

2 code implementations ICLR 2019 Emily Dinan, Stephen Roller, Kurt Shuster, Angela Fan, Michael Auli, Jason Weston

In open-domain dialogue, intelligent agents should exhibit the use of knowledge; however, there are few convincing demonstrations of this to date.

Dialogue Generation

Adaptive Input Representations for Neural Language Modeling

2 code implementations ICLR 2019 Alexei Baevski, Michael Auli

We introduce adaptive input representations for neural language modeling which extend the adaptive softmax of Grave et al. (2017) to input representations of variable capacity.

Language Modelling

Understanding Back-Translation at Scale

3 code implementations EMNLP 2018 Sergey Edunov, Myle Ott, Michael Auli, David Grangier

An effective method to improve neural machine translation with monolingual data is to augment the parallel training corpus with back-translations of target language sentences.

Ranked #2 on Machine Translation on WMT2014 English-German (using extra training data)

Machine Translation +1

Scaling Neural Machine Translation

5 code implementations WS 2018 Myle Ott, Sergey Edunov, David Grangier, Michael Auli

Sequence to sequence learning models still require several days to reach state of the art performance on large benchmark datasets using a single machine.

Machine Translation +2

QuaterNet: A Quaternion-based Recurrent Model for Human Motion

1 code implementation 16 May 2018 Dario Pavllo, David Grangier, Michael Auli

Deep learning for predicting or generating 3D human pose sequences is an active research area.

3D Human Pose Estimation Motion Estimation

Analyzing Uncertainty in Neural Machine Translation

1 code implementation ICML 2018 Myle Ott, Michael Auli, David Grangier, Marc'Aurelio Ranzato

We propose tools and metrics to assess how uncertainty in the data is captured by the model distribution and how it affects search strategies that generate translations.

Machine Translation Translation

Controllable Abstractive Summarization

no code implementations WS 2018 Angela Fan, David Grangier, Michael Auli

Current models for document summarization disregard user preferences such as the desired length, style, the entities that the user might be interested in, or how much of the document the user has already read.

Abstractive Text Summarization Document Summarization

Classical Structured Prediction Losses for Sequence to Sequence Learning

1 code implementation NAACL 2018 Sergey Edunov, Myle Ott, Michael Auli, David Grangier, Marc'Aurelio Ranzato

There has been much recent work on training neural attention models at the sequence-level using either reinforcement learning-style methods or by optimizing the beam.

Abstractive Text Summarization +4

Convolutional Sequence to Sequence Learning

33 code implementations ICML 2017 Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, Yann N. Dauphin

The prevalent approach to sequence to sequence learning maps an input sequence to a variable length output sequence via recurrent neural networks.

Machine Translation +1

Iterative Refinement for Machine Translation

no code implementations 20 Oct 2016 Roman Novak, Michael Auli, David Grangier

Existing machine translation decoding algorithms generate translations in a strictly monotonic fashion and never revisit previous decisions.

Machine Translation Translation

Vocabulary Selection Strategies for Neural Machine Translation

no code implementations 1 Oct 2016 Gurvan L'Hostis, David Grangier, Michael Auli

Classical translation models constrain the space of possible outputs by selecting a subset of translation rules based on the input sentence.

Machine Translation Translation

Neural Network-based Word Alignment through Score Aggregation

no code implementations WS 2016 Joel Legrand, Michael Auli, Ronan Collobert

We present a simple neural network for word alignment that builds source and target word window representations to compute alignment scores for sentence pairs.

Word Alignment

Strategies for Training Large Vocabulary Neural Language Models

2 code implementations ACL 2016 Wenlin Chen, David Grangier, Michael Auli

Training neural network language models over large vocabularies is still computationally very costly compared to count-based models such as Kneser-Ney.

Machine Translation Speech Recognition +1

deltaBLEU: A Discriminative Metric for Generation Tasks with Intrinsically Diverse Targets

no code implementations IJCNLP 2015 Michel Galley, Chris Brockett, Alessandro Sordoni, Yangfeng Ji, Michael Auli, Chris Quirk, Margaret Mitchell, Jianfeng Gao, Bill Dolan

We introduce Discriminative BLEU (deltaBLEU), a novel metric for intrinsic evaluation of generated text in tasks that admit a diverse range of possible outputs.
