Search Results for author: Qingyang Hong

Found 13 papers, 1 paper with code

MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis

no code implementations • 17 Dec 2023 • Wenhao Guan, Yishuang Li, Tao Li, Hukai Huang, Feng Wang, Jiayan Lin, Lingyan Huang, Lin Li, Qingyang Hong

The challenges of modeling such a multi-modal style-controllable TTS mainly lie in two aspects: 1) aligning the multi-modal information into a unified style space, so that an arbitrary modality can serve as the style prompt in a single system, and 2) efficiently transferring the unified style representation into the given text content, thereby enabling the generation of speech in the style of the prompt.

Speech Synthesis, Style Transfer, +1

Community Detection Graph Convolutional Network for Overlap-Aware Speaker Diarization

no code implementations • 26 Jun 2023 • Jie Wang, Zhicong Chen, Haodong Zhou, Lin Li, Qingyang Hong

The CDGCN-based clustering method consists of graph generation, sub-graph detection, and Graph-based Overlapped Speech Detection (Graph-OSD).

Clustering, Community Detection, +3

Towards A Unified Conformer Structure: from ASR to ASV Task

1 code implementation • 14 Nov 2022 • Dexin Liao, Tao Jiang, Feng Wang, Lin Li, Qingyang Hong

Transformer has achieved extraordinary performance in Natural Language Processing and Computer Vision tasks thanks to its powerful self-attention mechanism, and its variant Conformer has become a state-of-the-art architecture in the field of Automatic Speech Recognition (ASR).

Automatic Speech Recognition (ASR), +3

Deep Representation Decomposition for Rate-Invariant Speaker Verification

no code implementations • 28 May 2022 • Fuchuan Tong, Siqi Zheng, Haodong Zhou, Xingjia Xie, Qingyang Hong, Lin Li

While deep speaker embeddings have achieved promising performance for speaker verification, this advantage diminishes under speaking-style variability.

Speaker Verification

XMUSPEECH System for VoxCeleb Speaker Recognition Challenge 2021

no code implementations • 6 Sep 2021 • Jie Wang, Fuchuang Tong, Zhicong Chen, Lin Li, Qingyang Hong, Haodong Zhou

This paper describes the XMUSPEECH speaker recognition and diarisation systems for the VoxCeleb Speaker Recognition Challenge 2021.

Speaker Recognition

OLR 2021 Challenge: Datasets, Rules and Baselines

no code implementations • 23 Jul 2021 • Binling Wang, Wenxuan Hu, Jing Li, Yiming Zhi, Zheng Li, Qingyang Hong, Lin Li, Dong Wang, Liming Song, Cheng Yang

In addition to the Language Identification (LID) tasks, multilingual Automatic Speech Recognition (ASR) tasks are introduced to the OLR 2021 Challenge for the first time.

Automatic Speech Recognition (ASR), +2

Oriental Language Recognition (OLR) 2020: Summary and Analysis

no code implementations • 5 Jul 2021 • Jing Li, Binling Wang, Yiming Zhi, Zheng Li, Lin Li, Qingyang Hong, Dong Wang

The fifth Oriental Language Recognition (OLR) Challenge focuses on language recognition in a variety of complex environments to promote its development.

Dialect Identification, valid

An Integrated Framework for Two-pass Personalized Voice Trigger

no code implementations • 30 Jun 2021 • Dexin Liao, Jing Li, Yiming Zhi, Song Li, Qingyang Hong, Lin Li

For the SV system, we proposed a multi-task learning network in which the phonetic branch is trained with the character labels of the utterance and the speaker branch is trained with the speaker label.

Keyword Spotting, Multi-Task Learning, +2

Phoneme-aware and Channel-wise Attentive Learning for Text Dependent Speaker Verification

no code implementations • 25 Jun 2021 • Yan Liu, Zheng Li, Lin Li, Qingyang Hong

This paper proposes a multi-task learning network with phoneme-aware and channel-wise attentive learning strategies for text-dependent Speaker Verification (SV).

Multi-Task Learning, Text-Dependent Speaker Verification

AP20-OLR Challenge: Three Tasks and Their Baselines

no code implementations • 4 Jun 2020 • Zheng Li, Miao Zhao, Qingyang Hong, Lin Li, Zhiyuan Tang, Dong Wang, Li-Ming Song, Cheng Yang

Recipes for i-vector and x-vector systems, based on Kaldi and PyTorch, are also provided as baselines for the three tasks.

Dialect Identification
