Search Results for author: Ann Lee

Found 19 papers, 8 papers with code

Facebook AI’s WMT20 News Translation Task Submission

no code implementations WMT (EMNLP) 2020 Peng-Jen Chen, Ann Lee, Changhan Wang, Naman Goyal, Angela Fan, Mary Williamson, Jiatao Gu

We approach the low resource problem using two main strategies, leveraging all available data and adapting the system to the target news domain.

Data Augmentation Translation

Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation

no code implementations6 Apr 2022 Sravya Popuri, Peng-Jen Chen, Changhan Wang, Juan Pino, Yossi Adi, Jiatao Gu, Wei-Ning Hsu, Ann Lee

Direct speech-to-speech translation (S2ST) models suffer from data scarcity issues as there exists little parallel S2ST data, compared to the amount of data available for conventional cascaded systems that consist of automatic speech recognition (ASR), machine translation (MT), and text-to-speech (TTS) synthesis.

Automatic Speech Recognition Data Augmentation +4

textless-lib: a Library for Textless Spoken Language Processing

1 code implementation15 Feb 2022 Eugene Kharitonov, Jade Copet, Kushal Lakhotia, Tu Anh Nguyen, Paden Tomasello, Ann Lee, Ali Elkahky, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux, Yossi Adi

Textless spoken language processing research aims to extend the applicability of standard NLP toolset onto spoken language and languages with few or no textual resources.

Resynthesis

Flashlight: Enabling Innovation in Tools for Machine Learning

1 code implementation29 Jan 2022 Jacob Kahn, Vineel Pratap, Tatiana Likhomanenko, Qiantong Xu, Awni Hannun, Jeff Cai, Paden Tomasello, Ann Lee, Edouard Grave, Gilad Avidov, Benoit Steiner, Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

As the computational requirements for machine learning systems and the size and complexity of machine learning frameworks increases, essential framework innovation has become challenging.

Textless Speech-to-Speech Translation on Real Data

no code implementations15 Dec 2021 Ann Lee, Hongyu Gong, Paul-Ambroise Duquenne, Holger Schwenk, Peng-Jen Chen, Changhan Wang, Sravya Popuri, Yossi Adi, Juan Pino, Jiatao Gu, Wei-Ning Hsu

To our knowledge, we are the first to establish a textless S2ST technique that can be trained with real-world data and works for multiple language pairs.

Speech-to-Speech Translation Translation

Direct Simultaneous Speech-to-Speech Translation with Variational Monotonic Multihead Attention

no code implementations15 Oct 2021 Xutai Ma, Hongyu Gong, Danni Liu, Ann Lee, Yun Tang, Peng-Jen Chen, Wei-Ning Hsu, Phillip Koehn, Juan Pino

We present a direct simultaneous speech-to-speech translation (Simul-S2ST) model, Furthermore, the generation of translation is independent from intermediate text representations.

Speech Synthesis Speech-to-Speech Translation +1

Text-Free Prosody-Aware Generative Spoken Language Modeling

1 code implementation ACL 2022 Eugene Kharitonov, Ann Lee, Adam Polyak, Yossi Adi, Jade Copet, Kushal Lakhotia, Tu-Anh Nguyen, Morgane Rivière, Abdelrahman Mohamed, Emmanuel Dupoux, Wei-Ning Hsu

Generative Spoken Language Modeling (GSLM) \cite{Lakhotia2021} is the only prior work addressing the generative aspects of speech pre-training, which replaces text with discovered phone-like units for language modeling and shows the ability to generate meaningful novel sentences.

Language Modelling

Discriminative Reranking for Neural Machine Translation

no code implementations ACL 2021 Ann Lee, Michael Auli, Marc{'}Aurelio Ranzato

Reranking models enable the integration of rich features to select a better output hypothesis within an n-best list or lattice.

Data Augmentation Machine Translation +1

Direct speech-to-speech translation with discrete units

no code implementations ACL 2022 Ann Lee, Peng-Jen Chen, Changhan Wang, Jiatao Gu, Sravya Popuri, Xutai Ma, Adam Polyak, Yossi Adi, Qing He, Yun Tang, Juan Pino, Wei-Ning Hsu

When target text transcripts are available, we design a joint speech and text training framework that enables the model to generate dual modality output (speech and text) simultaneously in the same inference pass.

Speech-to-Speech Translation Text Generation +1

Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training

3 code implementations2 Apr 2021 Wei-Ning Hsu, Anuroop Sriram, Alexei Baevski, Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Jacob Kahn, Ann Lee, Ronan Collobert, Gabriel Synnaeve, Michael Auli

On a large-scale competitive setup, we show that pre-training on unlabeled in-domain data reduces the gap between models trained on in-domain and out-of-domain labeled data by 66%-73%.

Self-Supervised Learning

Few-shot Sequence Learning with Transformers

no code implementations17 Dec 2020 Lajanugen Logeswaran, Ann Lee, Myle Ott, Honglak Lee, Marc'Aurelio Ranzato, Arthur Szlam

In the simplest setting, we append a token to an input sequence which represents the particular task to be undertaken, and show that the embedding of this token can be optimized on the fly given few labeled examples.

Few-Shot Learning

Facebook AI's WMT20 News Translation Task Submission

no code implementations16 Nov 2020 Peng-Jen Chen, Ann Lee, Changhan Wang, Naman Goyal, Angela Fan, Mary Williamson, Jiatao Gu

We approach the low resource problem using two main strategies, leveraging all available data and adapting the system to the target news domain.

Data Augmentation Translation

Semi-Supervised Speech Recognition via Local Prior Matching

1 code implementation24 Feb 2020 Wei-Ning Hsu, Ann Lee, Gabriel Synnaeve, Awni Hannun

For sequence transduction tasks like speech recognition, a strong structured prior model encodes rich information about the target space, implicitly ruling out invalid sequences by assigning them low probability.

Knowledge Distillation Speech Recognition

Self-Supervised Speech Recognition via Local Prior Matching

no code implementations25 Sep 2019 Wei-Ning Hsu, Ann Lee, Gabriel Synnaeve, Awni Hannun

We propose local prior matching (LPM), a self-supervised objective for speech recognition.

Speech Recognition

Self-Training for End-to-End Speech Recognition

no code implementations19 Sep 2019 Jacob Kahn, Ann Lee, Awni Hannun

We revisit self-training in the context of end-to-end speech recognition.

Speech Recognition

Sequence-to-Sequence Speech Recognition with Time-Depth Separable Convolutions

no code implementations4 Apr 2019 Awni Hannun, Ann Lee, Qiantong Xu, Ronan Collobert

Coupled with a convolutional language model, our time-depth separable convolution architecture improves by more than 22% relative WER over the best previously reported sequence-to-sequence results on the noisy LibriSpeech test set.

Sequence-To-Sequence Speech Recognition

Cannot find the paper you are looking for? You can Submit a new open access paper.