no code implementations • 21 Jan 2025 • Minh Tran, Yutong Pang, Debjyoti Paul, Laxmi Pandey, Kevin Jiang, Jinxi Guo, Ke Li, Shun Zhang, Xuedong Zhang, Xin Lei
We introduce DAS (Domain Adaptation with Synthetic data), a novel domain adaptation framework for pre-trained ASR models, designed to efficiently adapt to various language-defined domains without requiring any real data.
no code implementations • 9 Aug 2024 • Yutong Pang, Debjyoti Paul, Kevin Jiang, Xuedong Zhang, Xin Lei
This paper introduces two advancements in the field of Large Language Model Annotation with a focus on punctuation restoration tasks.
no code implementations • 23 Jul 2024 • Laxmi Pandey, Ke Li, Jinxi Guo, Debjyoti Paul, Arthur Guo, Jay Mahadeokar, Xuedong Zhang
Multilingual pretraining for transfer learning significantly boosts the robustness of low-resource monolingual ASR models.
no code implementations • 30 Aug 2023 • Sen Fang, Chunyu Sui, Yanghao Zhou, Xuedong Zhang, Hongbin Zhong, Minyu Zhao, Yapeng Tian, Chen Chen
In this paper, we propose a dual-condition diffusion pre-training model named SignDiff that can generate human sign language speakers from a skeleton pose.
no code implementations • 20 Jan 2023 • Szu-Jui Chen, Debjyoti Paul, Yutong Pang, Peng Su, Xuedong Zhang
With the emergence of automatic speech recognition (ASR) models, converting spoken-form text (from ASR) to written form is urgently needed.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
no code implementations • 9 Nov 2022 • Yingyi Ma, Zhe Liu, Xuedong Zhang
Thus, the data sampling strategy is important to the adaptation performance.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
no code implementations • 13 Oct 2022 • Zhe Liu, Xuedong Zhang, Fuchun Peng
Recent research has shown that language models tend to memorize rare or unique sequences in the training corpora, which can thus leak sensitive attributes of user data.
no code implementations • 20 Jul 2022 • Laxmi Pandey, Debjyoti Paul, Pooja Chitkara, Yutong Pang, Xuedong Zhang, Kjell Schubert, Mark Chou, Shu Liu, Yatharth Saraf
Inverse text normalization (ITN) is used to convert the spoken form output of an automatic speech recognition (ASR) system to a written form.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
no code implementations • 20 Nov 2017 • Chung-Cheng Chiu, Anshuman Tripathi, Katherine Chou, Chris Co, Navdeep Jaitly, Diana Jaunzeikare, Anjuli Kannan, Patrick Nguyen, Hasim Sak, Ananth Sankar, Justin Tansuwan, Nathan Wan, Yonghui Wu, Xuedong Zhang
We explored both CTC and LAS systems for building speech recognition models.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)