1 code implementation • 21 Sep 2024 • Haibin Wu, Xuanjun Chen, Yi-Cheng Lin, Kai-Wei Chang, Jiawei Du, Ke-Han Lu, Alexander H. Liu, Ho-Lam Chung, Yuan-Kuei Wu, Dongchao Yang, Songxiang Liu, Yi-Chiao Wu, Xu Tan, James Glass, Shinji Watanabe, Hung-Yi Lee
Neural audio codec models are becoming increasingly important as they serve as tokenizers for audio, enabling efficient transmission or facilitating speech language modeling.
no code implementations • 27 Feb 2024 • Corby Rosset, Ho-Lam Chung, Guanghui Qin, Ethan C. Chau, Zhuo Feng, Ahmed Awadallah, Jennifer Neville, Nikhil Rao
We show that users spend a lot of "effort" on these questions in terms of signals like clicks and session length, and that they are also challenging for GPT-4.
no code implementations • 20 Feb 2024 • Haibin Wu, Xuanjun Chen, Yi-Cheng Lin, Kai-Wei Chang, Ho-Lam Chung, Alexander H. Liu, Hung-Yi Lee
Neural audio codecs were originally introduced to compress audio data into compact codes to reduce transmission latency.
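The core idea behind such codecs is to map continuous audio feature frames onto discrete codebook indices ("codes") that can be transmitted compactly and later decoded back into an approximation of the signal. A minimal sketch of this quantization step, with a toy hand-picked codebook (real codecs learn the encoder, decoder, and codebook end to end; all values and function names here are illustrative, not from any of the listed papers):

```python
# Toy sketch of the quantize/dequantize core of a neural audio codec.
# Each "frame" is a small feature vector; the codebook entries stand in
# for learned embeddings. Names and values are hypothetical.

def quantize(frames, codebook):
    """Assign each frame the index of its nearest codebook entry (L2)."""
    codes = []
    for frame in frames:
        dists = [sum((f - c) ** 2 for f, c in zip(frame, entry))
                 for entry in codebook]
        codes.append(dists.index(min(dists)))
    return codes

def dequantize(codes, codebook):
    """Reconstruct an approximation of the frames from the codes."""
    return [codebook[i] for i in codes]

codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]]
frames = [[0.1, -0.2], [0.9, 1.1], [-0.8, 0.9]]
codes = quantize(frames, codebook)          # compact, discrete representation
recon = dequantize(codes, codebook)         # lossy reconstruction
```

Only the integer `codes` need to be transmitted, which is what makes these models usable both for low-latency transmission and as audio tokenizers for speech language models.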
1 code implementation • 20 Feb 2024 • Haibin Wu, Ho-Lam Chung, Yi-Cheng Lin, Yuan-Kuei Wu, Xuanjun Chen, Yu-Chi Pai, Hsiu-Hsuan Wang, Kai-Wei Chang, Alexander H. Liu, Hung-Yi Lee
A sound codec's dual roles, minimizing data-transmission latency and serving as a tokenizer, underscore its critical importance.
no code implementations • 15 Dec 2023 • Min-Han Shih, Ho-Lam Chung, Yu-Chi Pai, Ming-Hao Hsu, Guan-Ting Lin, Shang-Wen Li, Hung-Yi Lee
Furthermore, the GSQA model has only been fine-tuned on the spoken extractive QA dataset.
no code implementations • 25 Sep 2023 • Chun-Yi Kuan, Chen An Li, Tsu-Yuan Hsu, Tse-Yang Lin, Ho-Lam Chung, Kai-Wei Chang, Shuo-Yiin Chang, Hung-Yi Lee
This paper introduces a novel voice conversion (VC) model, guided by text instructions such as "articulate slowly with a deep tone" or "speak in a cheerful boyish voice".
no code implementations • 18 May 2023 • Jiatong Shi, Dan Berrebbi, William Chen, Ho-Lam Chung, En-Pei Hu, Wei Ping Huang, Xuankai Chang, Shang-Wen Li, Abdelrahman Mohamed, Hung-Yi Lee, Shinji Watanabe
Speech processing Universal PERformance Benchmark (SUPERB) is a leaderboard to benchmark the performance of Self-Supervised Learning (SSL) models on various speech processing tasks.
1 code implementation • 1 Nov 2022 • Chan-Jan Hsu, Ho-Lam Chung, Hung-Yi Lee, Yu Tsao
In spoken language understanding (SLU), a natural solution is to concatenate pre-trained speech models (e.g. HuBERT) and pre-trained language models (PLMs, e.g. T5).
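In such a cascaded setup, the speech encoder's frame features typically live in a different vector space than the PLM's token embeddings, so a projection layer bridges the two before the sequences are joined. A minimal dependency-free sketch of that bridging step (the dimensions, weights, and function names are illustrative assumptions, not the paper's actual architecture):

```python
# Hypothetical sketch: project speech-encoder frame features into the
# language model's embedding space, then concatenate them with text token
# embeddings to form the PLM input sequence. All values are toy numbers.

def project(frames, weight):
    """Linear map (no bias): weight is a list of output columns."""
    return [[sum(f * w for f, w in zip(frame, col)) for col in weight]
            for frame in frames]

speech_frames = [[1.0, 2.0], [0.5, -0.5]]       # 2 frames, feature dim 2
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # maps dim 2 -> dim 3
text_embeddings = [[0.0, 0.0, 1.0]]             # 1 text token, dim 3

plm_input = project(speech_frames, weight) + text_embeddings
```

The concatenated `plm_input` would then be fed to the PLM's encoder, letting a single model attend jointly over speech-derived and text-derived positions.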
1 code implementation • 9 Mar 2022 • Guan-Ting Lin, Yung-Sung Chuang, Ho-Lam Chung, Shu-wen Yang, Hsuan-Jui Chen, Shuyan Dong, Shang-Wen Li, Abdelrahman Mohamed, Hung-Yi Lee, Lin-shan Lee
We empirically showed that DUAL yields results comparable to those obtained by cascading an ASR system and a text QA model, and that it is robust to real-world data.
Automatic Speech Recognition (ASR)
no code implementations • 2 Dec 2021 • Ying-Hong Chan, Ho-Lam Chung, Yao-Chung Fan
While significant advances in QG techniques have been reported, current QG results are not ideal for educational reading-practice assessment in terms of controllability and question difficulty.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Ho-Lam Chung, Ying-Hong Chan, Yao-Chung Fan
In this paper, we investigate the following two limitations of existing distractor generation (DG) methods.
1 code implementation • 12 Oct 2020 • Ho-Lam Chung, Ying-Hong Chan, Yao-Chung Fan
In this paper, we investigate the following two limitations of existing distractor generation (DG) methods.
Ranked #1 on Distractor Generation on RACE