Search Results for author: Liqun Deng

Found 11 papers, 4 papers with code

SA-WavLM: Speaker-Aware Self-Supervised Pre-training for Mixture Speech

no code implementations • 3 Jul 2024 • Jingru Lin, Meng Ge, Junyi Ao, Liqun Deng, Haizhou Li

Specifically, SA-WavLM follows an "extract-merge-predict" pipeline in which the representations of each speaker in the input mixture are first extracted individually and then merged before the final prediction.

Self-Supervised Learning

Prompt-driven Target Speech Diarization

no code implementations • 23 Oct 2023 • Yidi Jiang, Zhengyang Chen, Ruijie Tao, Liqun Deng, Yanmin Qian, Haizhou Li

We introduce a novel task named 'target speech diarization', which seeks to determine 'when target event occurred' within an audio signal.

Action Detection, Activity Detection

DisCover: Disentangled Music Representation Learning for Cover Song Identification

no code implementations • 19 Jul 2023 • Jiahao Xun, Shengyu Zhang, Yanting Yang, Jieming Zhu, Liqun Deng, Zhou Zhao, Zhenhua Dong, RuiQi Li, Lichao Zhang, Fei Wu

We analyze the CSI task in a disentanglement view with the causal graph technique, and identify the intra-version and inter-version effects biasing the invariant learning.

Blocking, Cover song identification +3

Reducing language context confusion for end-to-end code-switching automatic speech recognition

no code implementations • 28 Jan 2022 • Shuai Zhang, Jiangyan Yi, Zhengkun Tian, JianHua Tao, Yu Ting Yeung, Liqun Deng

We propose a language-related attention mechanism to reduce multilingual context confusion for the E2E code-switching ASR model based on the Equivalence Constraint (EC) Theory.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) +2

EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion

3 code implementations • 4 Jul 2021 • Daxin Tan, Liqun Deng, Yu Ting Yeung, Xin Jiang, Xiao Chen, Tan Lee

This paper presents the design, implementation and evaluation of a speech editing system, named EditSpeech, which allows a user to perform deletion, insertion and replacement of words in a given speech utterance, without causing audible degradation in speech quality and naturalness.

Text to Speech

VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-shot Voice Conversion

1 code implementation • 18 Jun 2021 • Disong Wang, Liqun Deng, Yu Ting Yeung, Xiao Chen, Xunying Liu, Helen Meng

One-shot voice conversion (VC), which performs conversion across arbitrary speakers with only a single target-speaker utterance for reference, can be effectively achieved by speech representation disentanglement.

Disentanglement, Quantization +1

Unsupervised Domain Adaptation for Dysarthric Speech Detection via Domain Adversarial Training and Mutual Information Minimization

no code implementations • 18 Jun 2021 • Disong Wang, Liqun Deng, Yu Ting Yeung, Xiao Chen, Xunying Liu, Helen Meng

Such systems are particularly susceptible to domain mismatch where the training and testing data come from the source and target domains respectively, but the two domains may differ in terms of speech stimuli, disease etiology, etc.

Multi-Task Learning, Unsupervised Domain Adaptation

Unified Mandarin TTS Front-end Based on Distilled BERT Model

1 code implementation • 31 Dec 2020 • Yang Zhang, Liqun Deng, Yasheng Wang

The front-end module in a typical Mandarin text-to-speech system (TTS) is composed of a long pipeline of text processing components, which requires extensive efforts to build and is prone to large accumulative model size and cascade errors.

Knowledge Distillation, Language Modelling +2
