Search Results for author: Mu Yang

Found 12 papers, 8 papers with code

DiariST: Streaming Speech Translation with Speaker Diarization

1 code implementation 14 Sep 2023 Mu Yang, Naoyuki Kanda, Xiaofei Wang, Junkun Chen, Peidong Wang, Jian Xue, Jinyu Li, Takuya Yoshioka

End-to-end speech translation (ST) for conversation recordings involves several under-explored challenges such as speaker diarization (SD) without accurate word time stamps and handling of overlapping speech in a streaming fashion.

Speaker Diarization +3

What Can an Accent Identifier Learn? Probing Phonetic and Prosodic Information in a Wav2vec2-based Accent Identification Model

no code implementations 10 Jun 2023 Mu Yang, Ram C. M. C. Shekar, Okim Kang, John H. L. Hansen

This study is focused on understanding and quantifying the change in phoneme and prosody information encoded in the Self-Supervised Learning (SSL) model, brought by an accent identification (AID) fine-tuning task.

Automatic Speech Recognition Prosody Prediction +3

Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Assessment

1 code implementation 29 Mar 2022 Mu Yang, Kevin Hirschi, Stephen D. Looney, Okim Kang, John H. L. Hansen

We show that fine-tuning with pseudo labels achieves a 5.35% phoneme error rate reduction and 2.48% MDD F1 score improvement over a labeled-samples-only fine-tuning baseline.

Pseudo Label Self-Supervised Learning

InverseMV: Composing Piano Scores with a Convolutional Video-Music Transformer

1 code implementation 31 Dec 2021 Chin-Tung Lin, Mu Yang

We conduct experiments with human evaluation on VMT, SeqSeq model (our baseline), and the original piano version soundtrack.

Music Generation

Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis

1 code implementation 9 Oct 2021 Mu Yang, Shaojin Ding, Tianlong Chen, Tong Wang, Zhangyang Wang

This work presents a lifelong learning approach to train a multilingual Text-To-Speech (TTS) system, where each language was seen as an individual task and was learned sequentially and continually.

Speech Synthesis Text-To-Speech Synthesis

EventPlus: A Temporal Event Understanding Pipeline

1 code implementation NAACL 2021 Mingyu Derek Ma, Jiao Sun, Mu Yang, Kung-Hsiang Huang, Nuan Wen, Shikhar Singh, Rujun Han, Nanyun Peng

We present EventPlus, a temporal event understanding pipeline that integrates various state-of-the-art event understanding components including event trigger and type detection, event argument detection, event duration and temporal relation extraction.

Common Sense Reasoning Event Extraction +1

Biomedical Event Extraction with Hierarchical Knowledge Graphs

1 code implementation Findings of the Association for Computational Linguistics 2020 Kung-Hsiang Huang, Mu Yang, Nanyun Peng

To better recognize the trigger words, each sentence is first grounded to a sentence graph based on a jointly modeled hierarchical knowledge graph from UMLS.

Event Extraction Sentence

Headword-Oriented Entity Linking: A Special Entity Linking Task with Dataset and Baseline

no code implementations LREC 2020 Mu Yang, Chi-Yen Chen, Yi-Hui Lee, Qian-hui Zeng, Wei-Yun Ma, Chen-Yang Shih, Wei-Jhih Chen

In this paper, we design headword-oriented entity linking (HEL), a specialized entity linking problem in which only the headwords of the entities are to be linked to knowledge bases; mention scopes of the entities do not need to be identified in the problem setting.

Entity Linking Transfer Learning

Spoken Language Intent Detection using Confusion2Vec

1 code implementation 7 Apr 2019 Prashanth Gurunath Shivakumar, Mu Yang, Panayiotis Georgiou

In this paper, we address spoken language intent detection under noisy conditions imposed by automatic speech recognition (ASR) systems.

Automatic Speech Recognition (ASR) +3

Deep Hybrid Scattering Image Learning

no code implementations 19 Sep 2018 Mu Yang, Zheng-Hao Liu, Ze-Di Cheng, Jin-Shi Xu, Chuan-Feng Li, Guang-Can Guo

A well-trained deep neural network is shown to gain the capability of simultaneously restoring two kinds of images, each completely destroyed by a distinct scattering medium.
