Search Results for author: Jiatong Shi

Found 18 papers, 8 papers with code

End-to-End Automatic Speech Recognition: Its Impact on the Workflow in Documenting Yoloxóchitl Mixtec

no code implementations NAACL (AmericasNLP) 2021 Jonathan D. Amith, Jiatong Shi, Rey Castillo García

This paper describes three open access Yoloxóchitl Mixtec corpora and presents the results and implications of end-to-end automatic speech recognition for endangered language documentation.

Automatic Speech Recognition

Findings of the IWSLT 2022 Evaluation Campaign

no code implementations IWSLT (ACL) 2022 Antonios Anastasopoulos, Loïc Barrault, Luisa Bentivogli, Marcely Zanon Boito, Ondřej Bojar, Roldano Cattoni, Anna Currey, Georgiana Dinu, Kevin Duh, Maha Elbayad, Clara Emmanuel, Yannick Estève, Marcello Federico, Christian Federmann, Souhir Gahbiche, Hongyu Gong, Roman Grundkiewicz, Barry Haddow, Benjamin Hsu, Dávid Javorský, Vĕra Kloudová, Surafel Lakew, Xutai Ma, Prashant Mathur, Paul McNamee, Kenton Murray, Maria Nǎdejde, Satoshi Nakamura, Matteo Negri, Jan Niehues, Xing Niu, John Ortega, Juan Pino, Elizabeth Salesky, Jiatong Shi, Matthias Sperber, Sebastian Stüker, Katsuhito Sudoh, Marco Turchi, Yogesh Virkar, Alexander Waibel, Changhan Wang, Shinji Watanabe

The evaluation campaign of the 19th International Conference on Spoken Language Translation featured eight shared tasks: (i) Simultaneous speech translation, (ii) Offline speech translation, (iii) Speech to speech translation, (iv) Low-resource speech translation, (v) Multilingual speech translation, (vi) Dialect speech translation, (vii) Formality control for speech translation, (viii) Isometric speech translation.

Speech-to-Speech Translation, Translation

CMU’s IWSLT 2022 Dialect Speech Translation System

no code implementations IWSLT (ACL) 2022 Brian Yan, Patrick Fernandes, Siddharth Dalmia, Jiatong Shi, Yifan Peng, Dan Berrebbi, Xinyi Wang, Graham Neubig, Shinji Watanabe

We use additional paired Modern Standard Arabic (MSA) data to directly improve the speech recognition (ASR) and machine translation (MT) components of our cascaded systems.

Knowledge Distillation, Machine Translation +2

Blockwise Streaming Transformer for Spoken Language Understanding and Simultaneous Speech Translation

no code implementations19 Apr 2022 Keqi Deng, Shinji Watanabe, Jiatong Shi, Siddhant Arora

Although Transformers have achieved success in several speech processing tasks such as spoken language understanding (SLU) and speech translation (ST), achieving online processing while maintaining competitive performance remains essential for real-world interaction.

Automatic Speech Recognition, Spoken Language Understanding +1

Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation

1 code implementation5 Apr 2022 Dan Berrebbi, Jiatong Shi, Brian Yan, Osbel Lopez-Francisco, Jonathan D. Amith, Shinji Watanabe

The present work examines the assumption that combining non-learnable spectral feature (SF) extractors with self-supervised learning (SSL) models is an effective approach to low-resource speech tasks.

Automatic Speech Recognition, Self-Supervised Learning +1
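One simple way to combine non-learnable spectral features with SSL representations is frame-level concatenation after aligning the two streams' frame rates. This is a minimal sketch of that idea, not necessarily the paper's exact fusion strategy; the function name and the linear-interpolation resampling are illustrative assumptions.

```python
import numpy as np

def fuse_features(spectral, ssl, target_frames=None):
    """Concatenate spectral and SSL features along the feature axis.

    spectral: (T1, D1) array, e.g. log-mel filterbanks
    ssl: (T2, D2) array, e.g. frame-level SSL model embeddings

    The two extractors typically run at different frame rates, so both
    streams are linearly resampled to a common number of frames first.
    """
    if target_frames is None:
        target_frames = min(spectral.shape[0], ssl.shape[0])

    def resample(x, n):
        # Linearly interpolate each feature dimension onto n frames.
        old = np.linspace(0.0, 1.0, num=x.shape[0])
        new = np.linspace(0.0, 1.0, num=n)
        return np.stack(
            [np.interp(new, old, x[:, d]) for d in range(x.shape[1])],
            axis=1,
        )

    return np.concatenate(
        [resample(spectral, target_frames), resample(ssl, target_frames)],
        axis=1,
    )
```

For example, fusing an 80-dim log-mel stream of 100 frames with a 768-dim SSL stream of 50 frames yields a (50, 848) feature matrix that a downstream ASR encoder could consume.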

SingAug: Data Augmentation for Singing Voice Synthesis with Cycle-consistent Training Strategy

no code implementations31 Mar 2022 Shuai Guo, Jiatong Shi, Tao Qian, Shinji Watanabe, Qin Jin

Deep learning based singing voice synthesis (SVS) systems have been demonstrated to flexibly generate singing with better quality than conventional statistical parametric methods.

Data Augmentation

Cross-lingual Transfer for Speech Processing using Acoustic Language Similarity

1 code implementation2 Nov 2021 Peter Wu, Jiatong Shi, Yifan Zhong, Shinji Watanabe, Alan W Black

We demonstrate the effectiveness of our approach in language family classification, speech recognition, and speech synthesis tasks.

Cross-Lingual Transfer, Speech Recognition +1

Leveraging End-to-End ASR for Endangered Language Documentation: An Empirical Study on Yoloxóchitl Mixtec

no code implementations EACL 2021 Jiatong Shi, Jonathan D. Amith, Rey Castillo García, Esteban Guadalupe Sierra, Kevin Duh, Shinji Watanabe

"Transcription bottlenecks", created by a shortage of effective human transcribers (i.e., transcriber shortage), are one of the main challenges to endangered language (EL) documentation.

Automatic Speech Recognition

Leveraging End-to-End ASR for Endangered Language Documentation: An Empirical Study on Yoloxóchitl Mixtec

no code implementations26 Jan 2021 Jiatong Shi, Jonathan D. Amith, Rey Castillo García, Esteban Guadalupe Sierra, Kevin Duh, Shinji Watanabe

"Transcription bottlenecks", created by a shortage of effective human transcribers are one of the main challenges to endangered language (EL) documentation.

Automatic Speech Recognition

Improving RNN Transducer With Target Speaker Extraction and Neural Uncertainty Estimation

no code implementations26 Nov 2020 Jiatong Shi, Chunlei Zhang, Chao Weng, Shinji Watanabe, Meng Yu, Dong Yu

Target-speaker speech recognition aims to recognize target-speaker speech from noisy environments with background noise and interfering speakers.

Speech Enhancement, Speech Extraction +1, Sound, Audio and Speech Processing

Large-Scale End-to-End Multilingual Speech Recognition and Language Identification with Multi-Task Learning

1 code implementation25 Oct 2020 Wenxin Hou, Yue Dong, Bairong Zhuang, Longfei Yang, Jiatong Shi, Takahiro Shinozaki

In this paper, we report a large-scale end-to-end language-independent multilingual model for joint automatic speech recognition (ASR) and language identification (LID).

Automatic Speech Recognition, Language Identification +1
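Joint ASR and language identification is typically trained with a weighted multi-task objective that interpolates the two losses. The sketch below shows that idea under stated assumptions: the function name, the interpolation weight `alpha`, and the use of plain softmax cross-entropy for the LID head are all illustrative, not the paper's exact formulation.

```python
import numpy as np

def multitask_loss(asr_loss, lid_logits, lid_target, alpha=0.3):
    """Hypothetical joint objective for multi-task ASR + LID training:

        L = (1 - alpha) * L_asr + alpha * L_lid

    asr_loss: scalar ASR loss already computed by the recognizer
    lid_logits: (num_languages,) unnormalized scores from the LID head
    lid_target: index of the true language
    alpha: interpolation weight between the two tasks (assumption)
    """
    # Numerically stable softmax cross-entropy for the LID head.
    shifted = lid_logits - lid_logits.max()
    log_probs = shifted - np.log(np.exp(shifted).sum())
    lid_loss = -log_probs[lid_target]
    return (1 - alpha) * asr_loss + alpha * lid_loss
```

With uniform LID logits over 4 languages, the LID term reduces to log(4), so the joint loss is simply the weighted sum of the two components.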

Sequence-to-sequence Singing Voice Synthesis with Perceptual Entropy Loss

1 code implementation22 Oct 2020 Jiatong Shi, Shuai Guo, Nan Huo, Yuekai Zhang, Qin Jin

Neural network (NN) based singing voice synthesis (SVS) systems require sufficient data to train well and are prone to over-fitting due to data scarcity.
