Search Results for author: Juan Pino

Found 54 papers, 22 papers with code

Findings of the IWSLT 2022 Evaluation Campaign

no code implementations IWSLT (ACL) 2022 Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe

The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.

Speech-to-Speech Translation, Translation

FINDINGS OF THE IWSLT 2021 EVALUATION CAMPAIGN

no code implementations ACL (IWSLT) 2021 Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured four shared tasks this year: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Multilingual speech translation, (iv) Low-resource speech translation.

Translation

Multilingual Speech-to-Speech Translation into Multiple Target Languages

no code implementations 17 Jul 2023 Hongyu Gong, Ning Dong, Sravya Popuri, Vedanuj Goswami, Ann Lee, Juan Pino

Despite a few studies on multilingual S2ST, their focus is multilinguality on the source side, i.e., translation from multiple source languages to one target language.

Language Identification, Speech-to-Speech Translation +1

Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks

no code implementations 4 May 2023 Yun Tang, Anna Y. Sun, Hirofumi Inaguma, Xinyue Chen, Ning Dong, Xutai Ma, Paden D. Tomasello, Juan Pino

To leverage the strengths of both modeling methods, we propose a solution that combines the Transducer and Attention-based Encoder-Decoder (TAED) for speech-to-text tasks.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) +4

Enhancing Speech-to-Speech Translation with Multiple TTS Targets

no code implementations 10 Apr 2023 Jiatong Shi, Yun Tang, Ann Lee, Hirofumi Inaguma, Changhan Wang, Juan Pino, Shinji Watanabe

Direct speech-to-speech translation (S2ST) models are known to suffer from data scarcity, because existing parallel materials covering both source and target speech are limited.

Speech-to-Speech Translation, Speech-to-Text Translation +1

MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation

1 code implementation 1 Mar 2023 Mohamed Anwar, Bowen Shi, Vedanuj Goswami, Wei-Ning Hsu, Juan Pino, Changhan Wang

We introduce MuAViC, a multilingual audio-visual corpus for robust speech recognition and robust speech-to-text translation, providing 1200 hours of audio-visual speech in 9 languages.

Audio-Visual Speech Recognition, Robust Speech Recognition +4

Pre-training for Speech Translation: CTC Meets Optimal Transport

1 code implementation 27 Jan 2023 Phuong-Hang Le, Hongyu Gong, Changhan Wang, Juan Pino, Benjamin Lecouteux, Didier Schwab

Nevertheless, CTC is only a partial solution and thus, in our second contribution, we propose a novel pre-training method combining CTC and optimal transport to further reduce this gap.
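For intuition only, here is a minimal NumPy sketch of the optimal-transport component: an entropic-regularized (Sinkhorn) transport cost between a speech-feature sequence and a text-embedding sequence. The uniform mass assumption and how such a cost would be weighted against the CTC objective are illustrative choices, not the paper's exact recipe.

```python
# Hedged sketch: Sinkhorn optimal transport between a speech-encoder output
# sequence and a text-embedding sequence. The resulting cost could serve as
# an auxiliary pre-training loss; the details below are illustrative only.
import numpy as np

def sinkhorn_ot_cost(speech_feats, text_feats, eps=0.1, n_iters=100):
    """speech_feats: (T, d) array, text_feats: (N, d) array."""
    # Pairwise squared Euclidean cost between the two sequences.
    cost = ((speech_feats[:, None, :] - text_feats[None, :, :]) ** 2).sum(-1)
    T, N = cost.shape
    a = np.full(T, 1.0 / T)          # uniform mass on speech frames
    b = np.full(N, 1.0 / N)          # uniform mass on text tokens
    K = np.exp(-cost / eps)          # Gibbs kernel
    u, v = np.ones(T), np.ones(N)
    for _ in range(n_iters):         # Sinkhorn fixed-point updates
        u = a / (K @ v)
        v = b / (K.T @ u)
    plan = u[:, None] * K * v[None, :]   # transport plan (sums to ~1)
    return (plan * cost).sum()           # OT cost between the two sequences

# Toy usage: 50 speech frames vs. 12 text tokens in a 16-dim feature space.
rng = np.random.default_rng(0)
print(sinkhorn_ot_cost(rng.normal(size=(50, 16)), rng.normal(size=(12, 16))))
```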

Multi-Task Learning, Speech-to-Text Translation +1

UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units

1 code implementation 15 Dec 2022 Hirofumi Inaguma, Sravya Popuri, Ilia Kulikov, Peng-Jen Chen, Changhan Wang, Yu-An Chung, Yun Tang, Ann Lee, Shinji Watanabe, Juan Pino

We enhance the model performance by subword prediction in the first-pass decoder, advanced two-pass decoder architecture design and search strategy, and better training regularization.

Denoising, Speech-to-Speech Translation +3

Simple and Effective Unsupervised Speech Translation

no code implementations 18 Oct 2022 Changhan Wang, Hirofumi Inaguma, Peng-Jen Chen, Ilia Kulikov, Yun Tang, Wei-Ning Hsu, Michael Auli, Juan Pino

The amount of labeled data available to train models for speech tasks is limited for most languages; however, data scarcity is exacerbated for speech translation, which requires labeled data covering two different languages.

Machine Translation, speech-recognition +6

Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation

no code implementations 6 Apr 2022 Sravya Popuri, Peng-Jen Chen, Changhan Wang, Juan Pino, Yossi Adi, Jiatao Gu, Wei-Ning Hsu, Ann Lee

Direct speech-to-speech translation (S2ST) models suffer from data scarcity issues as there exists little parallel S2ST data, compared to the amount of data available for conventional cascaded systems that consist of automatic speech recognition (ASR), machine translation (MT), and text-to-speech (TTS) synthesis.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) +6

Direct Simultaneous Speech-to-Speech Translation with Variational Monotonic Multihead Attention

no code implementations 15 Oct 2021 Xutai Ma, Hongyu Gong, Danni Liu, Ann Lee, Yun Tang, Peng-Jen Chen, Wei-Ning Hsu, Phillip Koehn, Juan Pino

We present a direct simultaneous speech-to-speech translation (Simul-S2ST) model; furthermore, translation generation is independent of intermediate text representations.

Speech Synthesis, Speech-to-Speech Translation +1

Multilingual Speech Translation from Efficient Finetuning of Pretrained Models

no code implementations ACL 2021 Xian Li, Changhan Wang, Yun Tang, Chau Tran, Yuqing Tang, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli

We present a simple yet effective approach to build multilingual speech-to-text (ST) translation through efficient transfer learning from a pretrained speech encoder and text decoder.
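A hedged sketch of the efficient-finetuning idea: freeze a pretrained encoder-decoder and re-enable gradients only for a small set of parameters (layer norms and attention here). The parameter-name patterns and the stand-in nn.Transformer are assumptions for illustration, not the paper's actual pretrained checkpoints or tuned parameter groups.

```python
# Hedged sketch: unfreeze only layer-norm and attention parameters of a
# pretrained encoder-decoder, approximating the "efficient finetuning" idea.
# The name-matching patterns below are assumptions; real checkpoints may use
# different parameter names.
import torch.nn as nn

def apply_selective_finetuning(model: nn.Module,
                               trainable_patterns=("norm", "attn")):
    for name, param in model.named_parameters():
        param.requires_grad = any(p in name for p in trainable_patterns)
    n_trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    n_total = sum(p.numel() for p in model.parameters())
    print(f"finetuning {n_trainable}/{n_total} parameters")

# Toy usage with a stand-in Transformer (a real setup would load a pretrained
# speech encoder and text decoder instead). In this toy model, layer norms
# are named "norm1"/"norm2", so the "norm" pattern matches them.
model = nn.Transformer(d_model=64, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2)
apply_selective_finetuning(model)
```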

Text Generation, Transfer Learning +1

FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task

no code implementations ACL (IWSLT) 2021 Yun Tang, Hongyu Gong, Xian Li, Changhan Wang, Juan Pino, Holger Schwenk, Naman Goyal

In this paper, we describe our end-to-end multilingual speech translation system submitted to the IWSLT 2021 evaluation campaign on the Multilingual Speech Translation shared task.

Transfer Learning, Translation

Direct speech-to-speech translation with discrete units

1 code implementation ACL 2022 Ann Lee, Peng-Jen Chen, Changhan Wang, Jiatao Gu, Sravya Popuri, Xutai Ma, Adam Polyak, Yossi Adi, Qing He, Yun Tang, Juan Pino, Wei-Ning Hsu

When target text transcripts are available, we design a joint speech and text training framework that enables the model to generate dual modality output (speech and text) simultaneously in the same inference pass.

Speech-to-Speech Translation, Text Generation +1

Pay Better Attention to Attention: Head Selection in Multilingual and Multi-Domain Sequence Modeling

no code implementations NeurIPS 2021 Hongyu Gong, Yun Tang, Juan Pino, Xian Li

We further propose attention sharing strategies to facilitate parameter sharing and specialization in multilingual and multi-domain sequence modeling.

speech-recognition, Speech Recognition +2

Large-Scale Self- and Semi-Supervised Learning for Speech Translation

no code implementations 14 Apr 2021 Changhan Wang, Anne Wu, Juan Pino, Alexei Baevski, Michael Auli, Alexis Conneau

In this paper, we improve speech translation (ST) through effectively leveraging large quantities of unlabeled speech and text data in different and complementary ways.

Language Modelling, Translation

Dual-decoder Transformer for Joint Automatic Speech Recognition and Multilingual Speech Translation

1 code implementation COLING 2020 Hang Le, Juan Pino, Changhan Wang, Jiatao Gu, Didier Schwab, Laurent Besacier

We propose two variants of these architectures corresponding to two different levels of dependencies between the decoders, called the parallel and cross dual-decoder Transformers, respectively.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) +3

Multilingual Speech Translation with Efficient Finetuning of Pretrained Models

no code implementations 24 Oct 2020 Xian Li, Changhan Wang, Yun Tang, Chau Tran, Yuqing Tang, Juan Pino, Alexei Baevski, Alexis Conneau, Michael Auli

We present a simple yet effective approach to build multilingual speech-to-text (ST) translation through efficient transfer learning from a pretrained speech encoder and text decoder.

Cross-Lingual Transfer, Text Generation +2

A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks

no code implementations 21 Oct 2020 Yun Tang, Juan Pino, Changhan Wang, Xutai Ma, Dmitriy Genzel

We demonstrate that representing text input as phoneme sequences can reduce the difference between speech and text inputs, and enhance the knowledge transfer from text corpora to the speech to text tasks.
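A toy illustration of the phoneme-as-input idea: the tiny lexicon and fallback rule below are hypothetical, standing in for a real grapheme-to-phoneme system and pronunciation dictionary.

```python
# Hypothetical, minimal grapheme-to-phoneme mapping used only to show how
# text input can be rendered as a phoneme sequence, bringing it closer to
# the units a speech encoder sees. A real system would use a trained G2P
# model and a full pronunciation lexicon.
TOY_LEXICON = {
    "speech": ["S", "P", "IY", "CH"],
    "to": ["T", "UW"],
    "text": ["T", "EH", "K", "S", "T"],
}

def phonemize(sentence: str) -> list[str]:
    phonemes = []
    for word in sentence.lower().split():
        # Fall back to spelled-out letters for out-of-lexicon words.
        phonemes.extend(TOY_LEXICON.get(word, list(word.upper())))
    return phonemes

print(phonemize("speech to text"))
# ['S', 'P', 'IY', 'CH', 'T', 'UW', 'T', 'EH', 'K', 'S', 'T']
```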

Automatic Speech Recognition, Automatic Speech Recognition (ASR) +5

CoVoST 2 and Massively Multilingual Speech-to-Text Translation

2 code implementations 20 Jul 2020 Changhan Wang, Anne Wu, Juan Pino

Speech translation has recently become an increasingly popular topic of research, partly due to the development of benchmark datasets.

Machine Translation, speech-recognition +3

FINDINGS OF THE IWSLT 2020 EVALUATION CAMPAIGN

no code implementations WS 2020 Ebrahim Ansari, Amittai Axelrod, Nguyen Bach, Ondřej Bojar, Roldano Cattoni, Fahim Dalvi, Nadir Durrani, Marcello Federico, Christian Federmann, Jiatao Gu, Fei Huang, Kevin Knight, Xutai Ma, Ajay Nagesh, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Xing Shi, Sebastian Stüker, Marco Turchi, Alexander Waibel, Changhan Wang

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2020) featured six challenge tracks this year: (i) Simultaneous speech translation, (ii) Video speech translation, (iii) Offline speech translation, (iv) Conversational speech translation, (v) Open domain translation, and (vi) Non-native speech translation.

Translation

Self-Supervised Representations Improve End-to-End Speech Translation

no code implementations 22 Jun 2020 Anne Wu, Changhan Wang, Juan Pino, Jiatao Gu

End-to-end speech-to-text translation can provide a simpler and smaller system but faces the challenge of data scarcity.

Cross-Lingual Transfer, speech-recognition +3

SkinAugment: Auto-Encoding Speaker Conversions for Automatic Speech Translation

1 code implementation 27 Feb 2020 Arya D. McCarthy, Liezl Puzon, Juan Pino

Our method compares favorably to SpecAugment on English→French and English→Romanian automatic speech translation (AST) tasks as well as on a low-resource English automatic speech recognition (ASR) task.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) +3

CoVoST: A Diverse Multilingual Speech-To-Text Translation Corpus

1 code implementation LREC 2020 Changhan Wang, Juan Pino, Anne Wu, Jiatao Gu

Spoken language translation has recently witnessed a resurgence in popularity, thanks to the development of end-to-end models and the creation of new corpora, such as Augmented LibriSpeech and MuST-C.

Speech-to-Text Translation, Translation

Monotonic Multihead Attention

3 code implementations ICLR 2020 Xutai Ma, Juan Pino, James Cross, Liezl Puzon, Jiatao Gu

Simultaneous machine translation models start generating a target sequence before they have encoded or read the source sequence.
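To make the "generate before the source is fully read" behavior concrete, here is a sketch of a fixed wait-k read/write schedule with a dummy copy "decoder". The paper's monotonic multihead attention instead learns the read/write decisions inside the attention heads, so this is only a simplified stand-in.

```python
# Illustrative wait-k read/write loop for simultaneous translation. The
# "decoder" here simply copies source tokens; a real system would call an
# incremental NMT decoder. Monotonic multihead attention, the method of this
# paper, learns when to read/write rather than using a fixed k.
def wait_k_translate(source_stream, k=3):
    read, written = [], []
    for token in source_stream:                 # READ: consume one source token
        read.append(token)
        if len(read) >= k:                      # after the initial wait of k tokens,
            written.append(read[len(written)])  # WRITE: emit one (dummy) target token
            yield written[-1]
    while len(written) < len(read):             # source exhausted: flush the rest
        written.append(read[len(written)])
        yield written[-1]

source = "simultaneous translation starts before the source ends".split()
for out in wait_k_translate(source, k=3):
    print("emit:", out)
```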

Machine Translation, Translation

Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low-Resource Conditions

no code implementations WS 2019 Philipp Koehn, Francisco Guzmán, Vishrav Chaudhary, Juan Pino

Following the WMT 2018 Shared Task on Parallel Corpus Filtering, we posed the challenge of assigning sentence-level quality scores for very noisy corpora of sentence pairs crawled from the web, with the goal of sub-selecting 2% and 10% of the highest-quality data to be used to train machine translation systems.
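As a rough sketch of the sub-selection step only (scoring, the actual subject of the task, is assumed to be already done), the snippet below keeps the top 2% and 10% of sentence pairs by a precomputed quality score:

```python
# Minimal sketch of sub-selecting the top-scoring sentence pairs, assuming
# each item is (score, source, target). The scoring itself (the hard part of
# the shared task) is taken as given; the example corpus is made up.
def select_top(pairs_with_scores, fraction):
    ranked = sorted(pairs_with_scores, key=lambda x: x[0], reverse=True)
    cutoff = max(1, int(len(ranked) * fraction))
    return ranked[:cutoff]

corpus = [
    (0.91, "a clean sentence", "une phrase propre"),
    (0.07, "noisy crawl junk !!", "???"),
    (0.64, "mostly fine", "plutot correct"),
    (0.15, "misaligned pair", "unrelated text"),
]
for frac in (0.02, 0.10):
    subset = select_top(corpus, frac)
    print(f"top {frac:.0%}: {len(subset)} pair(s)")
```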

Machine Translation, Sentence +1
