no code implementations • WMT (EMNLP) 2021 • Minh Quang Pham, Josep Crego, Antoine Senellart, Dan Berrebbi, Jean Senellart
This paper describes SYSTRAN submissions to the WMT 2021 terminology shared task.
no code implementations • IWSLT (ACL) 2022 • Brian Yan, Patrick Fernandes, Siddharth Dalmia, Jiatong Shi, Yifan Peng, Dan Berrebbi, Xinyi Wang, Graham Neubig, Shinji Watanabe
We use additional paired Modern Standard Arabic data (MSA) to directly improve the speech recognition (ASR) and machine translation (MT) components of our cascaded systems.
no code implementations • 9 Oct 2023 • Jiatong Shi, William Chen, Dan Berrebbi, Hsiu-Hsuan Wang, Wei-Ping Huang, En-Pei Hu, Ho-Lam Chuang, Xuankai Chang, Yuxun Tang, Shang-Wen Li, Abdelrahman Mohamed, Hung-Yi Lee, Shinji Watanabe
The 2023 Multilingual Speech Universal Performance Benchmark (ML-SUPERB) Challenge expands upon the acclaimed SUPERB framework, emphasizing self-supervised models in multilingual speech recognition and language identification.
no code implementations • 26 Sep 2023 • William Chen, Jiatong Shi, Brian Yan, Dan Berrebbi, Wangyou Zhang, Yifan Peng, Xuankai Chang, Soumi Maiti, Shinji Watanabe
We show that further efficiency can be achieved with a vanilla HuBERT Base model, which can maintain 94% of XLS-R's performance with only 3% of the data, 4 GPUs, and limited trials.
6 code implementations • 25 Sep 2023 • Yifan Peng, Jinchuan Tian, Brian Yan, Dan Berrebbi, Xuankai Chang, Xinjian Li, Jiatong Shi, Siddhant Arora, William Chen, Roshan Sharma, Wangyou Zhang, Yui Sudo, Muhammad Shakeel, Jee-weon Jung, Soumi Maiti, Shinji Watanabe
Pre-training speech models on large volumes of data has achieved remarkable success.
no code implementations • 18 May 2023 • Jiatong Shi, Dan Berrebbi, William Chen, Ho-Lam Chung, En-Pei Hu, Wei Ping Huang, Xuankai Chang, Shang-Wen Li, Abdelrahman Mohamed, Hung-Yi Lee, Shinji Watanabe
Speech processing Universal PERformance Benchmark (SUPERB) is a leaderboard to benchmark the performance of Self-Supervised Learning (SSL) models on various speech processing tasks.
1 code implementation • 10 Apr 2023 • Brian Yan, Jiatong Shi, Yun Tang, Hirofumi Inaguma, Yifan Peng, Siddharth Dalmia, Peter Polák, Patrick Fernandes, Dan Berrebbi, Tomoki Hayashi, Xiaohui Zhang, Zhaoheng Ni, Moto Hira, Soumi Maiti, Juan Pino, Shinji Watanabe
ESPnet-ST-v2 is a revamp of the open-source ESPnet-ST toolkit necessitated by the broadening interests of the spoken language translation community.
no code implementations • 2 Nov 2022 • Dan Berrebbi, Ronan Collobert, Navdeep Jaitly, Tatiana Likhomanenko
We perform a systematic analysis on both labeled and unlabeled data by varying the number of speakers while keeping the number of hours fixed and vice versa.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
+2
no code implementations • 1 Nov 2022 • Dan Berrebbi, Brian Yan, Shinji Watanabe
Although popular for classification tasks in vision and language, EE has seen less use for sequence-to-sequence speech recognition (ASR) tasks where outputs from early layers are often degenerate.
Self-Supervised Learning
Sequence-To-Sequence Speech Recognition
+1
no code implementations • 17 Oct 2022 • Dan Berrebbi, Ronan Collobert, Samy Bengio, Navdeep Jaitly, Tatiana Likhomanenko
Nevertheless, these approaches still rely on bootstrapping the ST using an initial supervised learning phase where the model is trained on labeled data alone.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
+2
1 code implementation • 5 Apr 2022 • Dan Berrebbi, Jiatong Shi, Brian Yan, Osbel Lopez-Francisco, Jonathan D. Amith, Shinji Watanabe
The present work examines the assumption that combining non-learnable SF extractors to SSL models is an effective approach to low resource speech tasks.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
+3
no code implementations • 29 Nov 2021 • Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu
Conversational bilingual speech encompasses three types of utterances: two purely monolingual types and one intra-sententially code-switched type.