Search Results for author: Xutai Ma

Found 22 papers, 7 papers with code

FINDINGS OF THE IWSLT 2021 EVALUATION CAMPAIGN

no code implementations • ACL (IWSLT) 2021 • Antonios Anastasopoulos, Ondřej Bojar, Jacob Bremerman, Roldano Cattoni, Maha Elbayad, Marcello Federico, Xutai Ma, Satoshi Nakamura, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Alexander Waibel, Changhan Wang, Matthew Wiesner

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2021) featured this year four shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Multilingual speech translation, (iv) Low-resource speech translation.

Translation

Paper
Add Code

Findings of the IWSLT 2022 Evaluation Campaign

no code implementations • IWSLT (ACL) 2022 • Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe

The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.

Speech-to-Speech Translation Translation

Paper
Add Code

Seamless: Multilingual Expressive and Streaming Speech Translation

1 code implementation • 8 Dec 2023 • Seamless Communication, Loïc Barrault, Yu-An Chung, Mariano Coria Meglioli, David Dale, Ning Dong, Mark Duppenthaler, Paul-Ambroise Duquenne, Brian Ellis, Hady Elsahar, Justin Haaheim, John Hoffman, Min-Jae Hwang, Hirofumi Inaguma, Christopher Klaiber, Ilia Kulikov, Pengwei Li, Daniel Licht, Jean Maillard, Ruslan Mavlyutov, Alice Rakotoarison, Kaushik Ram Sadagopan, Abinesh Ramakrishnan, Tuan Tran, Guillaume Wenzek, Yilin Yang, Ethan Ye, Ivan Evtimov, Pierre Fernandez, Cynthia Gao, Prangthip Hansanti, Elahe Kalbassi, Amanda Kallet, Artyom Kozhevnikov, Gabriel Mejia Gonzalez, Robin San Roman, Christophe Touret, Corinne Wong, Carleigh Wood, Bokai Yu, Pierre Andrews, Can Balioglu, Peng-Jen Chen, Marta R. Costa-jussà, Maha Elbayad, Hongyu Gong, Francisco Guzmán, Kevin Heffernan, Somya Jain, Justine Kao, Ann Lee, Xutai Ma, Alex Mourachko, Benjamin Peloquin, Juan Pino, Sravya Popuri, Christophe Ropers, Safiyyah Saleem, Holger Schwenk, Anna Sun, Paden Tomasello, Changhan Wang, Jeff Wang, Skyler Wang, Mary Williamson

In this work, we introduce a family of models that enable end-to-end expressive and multilingual translations in a streaming fashion.

Multimodal Machine Translation Translation

10,253

Paper
Code

Efficient Monotonic Multihead Attention

no code implementations • 7 Dec 2023 • Xutai Ma, Anna Sun, Siqi Ouyang, Hirofumi Inaguma, Paden Tomasello

We introduce the Efficient Monotonic Multihead Attention (EMMA), a state-of-the-art simultaneous translation model with numerically-stable and unbiased monotonic alignment estimation.

Simultaneous Speech-to-Text Translation Translation

Paper
Add Code

SeamlessM4T: Massively Multilingual & Multimodal Machine Translation

2 code implementations • 22 Aug 2023 • Seamless Communication, Loïc Barrault, Yu-An Chung, Mariano Cora Meglioli, David Dale, Ning Dong, Paul-Ambroise Duquenne, Hady Elsahar, Hongyu Gong, Kevin Heffernan, John Hoffman, Christopher Klaiber, Pengwei Li, Daniel Licht, Jean Maillard, Alice Rakotoarison, Kaushik Ram Sadagopan, Guillaume Wenzek, Ethan Ye, Bapi Akula, Peng-Jen Chen, Naji El Hachem, Brian Ellis, Gabriel Mejia Gonzalez, Justin Haaheim, Prangthip Hansanti, Russ Howes, Bernie Huang, Min-Jae Hwang, Hirofumi Inaguma, Somya Jain, Elahe Kalbassi, Amanda Kallet, Ilia Kulikov, Janice Lam, Daniel Li, Xutai Ma, Ruslan Mavlyutov, Benjamin Peloquin, Mohamed Ramadan, Abinesh Ramakrishnan, Anna Sun, Kevin Tran, Tuan Tran, Igor Tufanov, Vish Vogeti, Carleigh Wood, Yilin Yang, Bokai Yu, Pierre Andrews, Can Balioglu, Marta R. Costa-jussà, Onur Celebi, Maha Elbayad, Cynthia Gao, Francisco Guzmán, Justine Kao, Ann Lee, Alexandre Mourachko, Juan Pino, Sravya Popuri, Christophe Ropers, Safiyyah Saleem, Holger Schwenk, Paden Tomasello, Changhan Wang, Jeff Wang, Skyler Wang

What does it take to create the Babel Fish, a tool that can help individuals translate speech between any two languages?

Ranked #1 on Speech-to-Speech Translation on CVSS (using extra training data)

Automatic Speech Recognition Speech-to-Speech Translation +3

10,253

Paper
Code

Hybrid Transducer and Attention based Encoder-Decoder Modeling for Speech-to-Text Tasks

no code implementations • 4 May 2023 • Yun Tang, Anna Y. Sun, Hirofumi Inaguma, Xinyue Chen, Ning Dong, Xutai Ma, Paden D. Tomasello, Juan Pino

In order to leverage strengths of both modeling methods, we propose a solution by combining Transducer and Attention based Encoder-Decoder (TAED) for speech-to-text tasks.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

Paper
Add Code

From Start to Finish: Latency Reduction Strategies for Incremental Speech Synthesis in Simultaneous Speech-to-Speech Translation

no code implementations • 15 Oct 2021 • Danni Liu, Changhan Wang, Hongyu Gong, Xutai Ma, Yun Tang, Juan Pino

Speech-to-speech translation (S2ST) converts input speech to speech in another language.

Data Augmentation Speech Synthesis +2

Paper
Add Code

Direct Simultaneous Speech-to-Speech Translation with Variational Monotonic Multihead Attention

no code implementations • 15 Oct 2021 • Xutai Ma, Hongyu Gong, Danni Liu, Ann Lee, Yun Tang, Peng-Jen Chen, Wei-Ning Hsu, Phillip Koehn, Juan Pino

We present a direct simultaneous speech-to-speech translation (Simul-S2ST) model, Furthermore, the generation of translation is independent from intermediate text representations.

Speech Synthesis Speech-to-Speech Translation +1

Paper
Add Code

Direct speech-to-speech translation with discrete units

1 code implementation • ACL 2022 • Ann Lee, Peng-Jen Chen, Changhan Wang, Jiatao Gu, Sravya Popuri, Xutai Ma, Adam Polyak, Yossi Adi, Qing He, Yun Tang, Juan Pino, Wei-Ning Hsu

When target text transcripts are available, we design a joint speech and text training framework that enables the model to generate dual modality output (speech and text) simultaneously in the same inference pass.

Speech-to-Speech Translation Text Generation +1

158

Paper
Code

SimulMT to SimulST: Adapting Simultaneous Text Translation to End-to-End Simultaneous Speech Translation

1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • Xutai Ma, Juan Pino, Philipp Koehn

Simultaneous text translation and end-to-end speech translation have recently made great progress but little work has combined these tasks together.

Translation

29,318

Paper
Code

Streaming Simultaneous Speech Translation with Augmented Memory Transformer

no code implementations • 30 Oct 2020 • Xutai Ma, Yongqiang Wang, Mohammad Javad Dousti, Philipp Koehn, Juan Pino

Transformer-based models have achieved state-of-the-art performance on speech translation tasks.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

A General Multi-Task Learning Framework to Leverage Text Data for Speech to Text Tasks

no code implementations • 21 Oct 2020 • Yun Tang, Juan Pino, Changhan Wang, Xutai Ma, Dmitriy Genzel

We demonstrate that representing text input as phoneme sequences can reduce the difference between speech and text inputs, and enhance the knowledge transfer from text corpora to the speech to text tasks.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

Paper
Add Code

fairseq S2T: Fast Speech-to-Text Modeling with fairseq

3 code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Changhan Wang, Yun Tang, Xutai Ma, Anne Wu, Sravya Popuri, Dmytro Okhonko, Juan Pino

We introduce fairseq S2T, a fairseq extension for speech-to-text (S2T) modeling tasks such as end-to-end speech recognition and speech-to-text translation.

Ranked #8 on Speech-to-Text Translation on MuST-C EN->DE

Machine Translation Multi-Task Learning +4

125,545

Paper
Code

SimulEval: An Evaluation Toolkit for Simultaneous Translation

no code implementations • EMNLP 2020 • Xutai Ma, Mohammad Javad Dousti, Changhan Wang, Jiatao Gu, Juan Pino

We also adapt latency metrics from text simultaneous translation to the speech task.

Translation

Paper
Add Code

FINDINGS OF THE IWSLT 2020 EVALUATION CAMPAIGN

no code implementations • WS 2020 • Ebrahim Ansari, Amittai Axelrod, Nguyen Bach, Ond{\v{r}}ej Bojar, Roldano Cattoni, Fahim Dalvi, Nadir Durrani, Marcello Federico, Christian Federmann, Jiatao Gu, Fei Huang, Kevin Knight, Xutai Ma, Ajay Nagesh, Matteo Negri, Jan Niehues, Juan Pino, Elizabeth Salesky, Xing Shi, Sebastian St{\"u}ker, Marco Turchi, Alex Waibel, er, Changhan Wang

The evaluation campaign of the International Conference on Spoken Language Translation (IWSLT 2020) featured this year six challenge tracks: (i) Simultaneous speech translation, (ii) Video speech translation, (iii) Offline speech translation, (iv) Conversational speech translation, (v) Open domain translation, and (vi) Non-native speech translation.

Translation

Paper
Add Code

Self-Training for End-to-End Speech Translation

no code implementations • 3 Jun 2020 • Juan Pino, Qiantong Xu, Xutai Ma, Mohammad Javad Dousti, Yun Tang

One of the main challenges for end-to-end speech translation is data scarcity.

speech-recognition Speech Recognition +1

Paper
Add Code

Monotonic Multihead Attention

3 code implementations • ICLR 2020 • Xutai Ma, Juan Pino, James Cross, Liezl Puzon, Jiatao Gu

Simultaneous machine translation models start generating a target sequence before they have encoded or read the source sequence.

Decoder Machine Translation +1

29,318

Paper
Code

Harnessing Indirect Training Data for End-to-End Automatic Speech Translation: Tricks of the Trade

no code implementations • EMNLP (IWSLT) 2019 • Juan Pino, Liezl Puzon, Jiatao Gu, Xutai Ma, Arya D. McCarthy, Deepak Gopinath

In this work, we evaluate several data augmentation and pretraining approaches for AST, by comparing all on the same datasets.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Paper
Add Code

Broad-Coverage Semantic Parsing as Transduction

no code implementations • IJCNLP 2019 • Sheng Zhang, Xutai Ma, Kevin Duh, Benjamin Van Durme

We unify different broad-coverage semantic parsing tasks under a transduction paradigm, and propose an attention-based neural framework that incrementally builds a meaning representation via a sequence of semantic relations.

Ranked #2 on UCCA Parsing on SemEval 2019 Task 1

AMR Parsing UCCA Parsing

Paper
Add Code

Robust Document Representations for Cross-Lingual Information Retrieval in Low-Resource Settings

no code implementations • WS 2019 • Mahsa Yarmohammadi, Xutai Ma, Sorami Hisamoto, Muhammad Rahman, Yiming Wang, Hainan Xu, Daniel Povey, Philipp Koehn, Kevin Duh

Cross-Lingual Information Retrieval Retrieval

Paper
Add Code

AMR Parsing as Sequence-to-Graph Transduction

1 code implementation • ACL 2019 • Sheng Zhang, Xutai Ma, Kevin Duh, Benjamin Van Durme

Our experimental results outperform all previously reported SMATCH scores, on both AMR 2. 0 (76. 3% F1 on LDC2017T10) and AMR 1. 0 (70. 2% F1 on LDC2014T12).

Ranked #1 on AMR Parsing on LDC2014T12:

AMR Parsing

154

Paper
Code

Cross-lingual Decompositional Semantic Parsing

no code implementations • EMNLP 2018 • Sheng Zhang, Xutai Ma, Rachel Rudinger, Kevin Duh, Benjamin Van Durme

We introduce the task of cross-lingual decompositional semantic parsing: mapping content provided in a source language into a decompositional semantic analysis based on a target language.

Semantic Parsing

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.