1 code implementation • 14 Sep 2023 • Mu Yang, Naoyuki Kanda, Xiaofei Wang, Junkun Chen, Peidong Wang, Jian Xue, Jinyu Li, Takuya Yoshioka
End-to-end speech translation (ST) for conversation recordings involves several under-explored challenges, such as speaker diarization (SD) without accurate word timestamps and handling of overlapping speech in a streaming fashion.
no code implementations • 10 Jun 2023 • Mu Yang, Ram C. M. C. Shekar, Okim Kang, John H. L. Hansen
This study is focused on understanding and quantifying the change in phoneme and prosody information encoded in the Self-Supervised Learning (SSL) model, brought by an accent identification (AID) fine-tuning task.
no code implementations • 13 Sep 2022 • Mu Yang, Andros Tjandra, Chunxi Liu, David Zhang, Duc Le, Ozlem Kalinli
Neural network pruning compresses automatic speech recognition (ASR) models effectively.
Automatic Speech Recognition (ASR) +3
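The entry above mentions neural network pruning for ASR model compression only in passing. As an illustration, here is a minimal unstructured magnitude-pruning sketch — a generic technique, not necessarily the paper's method, and the function name is a hypothetical stand-in:

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction of weights (unstructured pruning)."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)          # number of weights to remove
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]   # k-th smallest magnitude
    mask = np.abs(weights) > threshold             # keep only larger weights
    return weights * mask

w = np.array([[0.9, -0.05], [0.02, -0.7]])
pruned = magnitude_prune(w, 0.5)   # zeros out the two smallest-magnitude weights
```

In practice, pruned ASR models are usually fine-tuned afterwards to recover accuracy lost at high sparsity.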
1 code implementation • 29 Mar 2022 • Mu Yang, Kevin Hirschi, Stephen D. Looney, Okim Kang, John H. L. Hansen
We show that fine-tuning with pseudo labels achieves a 5.35% phoneme error rate reduction and 2.48% MDD F1 score improvement over a labeled-samples-only fine-tuning baseline.
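The phoneme error rate (PER) cited above is a standard edit-distance metric: the Levenshtein distance between reference and hypothesis phoneme sequences, normalized by the reference length. A minimal sketch (illustrative only, not the paper's evaluation code):

```python
def edit_distance(ref, hyp):
    # Classic dynamic-programming Levenshtein distance over token sequences.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution / match
    return d[-1][-1]

def phoneme_error_rate(ref, hyp):
    return edit_distance(ref, hyp) / len(ref)

per = phoneme_error_rate(["k", "ae", "t"], ["k", "ah", "t"])  # one substitution -> 1/3
```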
1 code implementation • 31 Dec 2021 • Chin-Tung Lin, Mu Yang
We conduct experiments with human evaluation on VMT, a Seq2Seq model (our baseline), and the original piano-version soundtrack.
1 code implementation • 9 Oct 2021 • Mu Yang, Shaojin Ding, Tianlong Chen, Tong Wang, Zhangyang Wang
This work presents a lifelong learning approach to training a multilingual Text-To-Speech (TTS) system, where each language is treated as an individual task and is learned sequentially and continually.
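A sketch of the sequential (lifelong) training regime described above: one shared model is fine-tuned on each language in turn, so earlier languages are never revisited jointly — which is what makes catastrophic forgetting a concern. All names below are hypothetical stand-ins, not the paper's code:

```python
# Illustrative sketch of sequential multilingual training.
languages = ["en", "de", "zh"]     # example task order
model = {"trained_on": []}         # stand-in for shared TTS parameters

def fine_tune(model, lang):
    # Placeholder for one fine-tuning pass on a single language's data.
    model["trained_on"].append(lang)

for lang in languages:             # tasks arrive one at a time, never jointly
    fine_tune(model, lang)
```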
1 code implementation • NAACL 2021 • Mingyu Derek Ma, Jiao Sun, Mu Yang, Kung-Hsiang Huang, Nuan Wen, Shikhar Singh, Rujun Han, Nanyun Peng
We present EventPlus, a temporal event understanding pipeline that integrates various state-of-the-art event understanding components including event trigger and type detection, event argument detection, event duration and temporal relation extraction.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Kung-Hsiang Huang, Mu Yang, Nanyun Peng
To better recognize the trigger words, each sentence is first grounded to a sentence graph based on a jointly modeled hierarchical knowledge graph from UMLS.
Ranked #2 on Event Extraction on GENIA
no code implementations • LREC 2020 • Mu Yang, Chi-Yen Chen, Yi-Hui Lee, Qian-hui Zeng, Wei-Yun Ma, Chen-Yang Shih, Wei-Jhih Chen
In this paper, we design headword-oriented entity linking (HEL), a specialized entity linking problem in which only the headwords of the entities are to be linked to knowledge bases; mention scopes of the entities do not need to be identified in the problem setting.
1 code implementation • CoNLL 2019 • Rujun Han, I-Hung Hsu, Mu Yang, Aram Galstyan, Ralph Weischedel, Nanyun Peng
We propose a novel deep structured learning framework for event temporal relation extraction.
1 code implementation • 7 Apr 2019 • Prashanth Gurunath Shivakumar, Mu Yang, Panayiotis Georgiou
In this paper, we address the spoken language intent detection under noisy conditions imposed by automatic speech recognition (ASR) systems.
Automatic Speech Recognition (ASR) +3
no code implementations • 19 Sep 2018 • Mu Yang, Zheng-Hao Liu, Ze-Di Cheng, Jin-Shi Xu, Chuan-Feng Li, Guang-Can Guo
A well-trained deep neural network is shown to be capable of simultaneously restoring two kinds of images that were completely destroyed by two distinct scattering media.