Search Results for author: Qingyang Hong

Found 13 papers, 1 paper with code

MM-TTS: Multi-modal Prompt based Style Transfer for Expressive Text-to-Speech Synthesis

no code implementations • 17 Dec 2023 • Wenhao Guan, Yishuang Li, Tao Li, Hukai Huang, Feng Wang, Jiayan Lin, Lingyan Huang, Lin Li, Qingyang Hong

The challenges of modeling such a multi-modal style controllable TTS mainly lie in two aspects: 1) aligning the multi-modal information into a unified style space to enable the input of arbitrary modality as the style prompt in a single system, and 2) efficiently transferring the unified style representation into the given text content, thereby empowering the ability to generate prompt style-related voice.

Speech Synthesis, Style Transfer +1

Community Detection Graph Convolutional Network for Overlap-Aware Speaker Diarization

no code implementations • 26 Jun 2023 • Jie Wang, Zhicong Chen, Haodong Zhou, Lin Li, Qingyang Hong

The CDGCN-based clustering method consists of graph generation, sub-graph detection, and Graph-based Overlapped Speech Detection (Graph-OSD).

Clustering, Community Detection +3

Towards A Unified Conformer Structure: from ASR to ASV Task

1 code implementation • 14 Nov 2022 • Dexin Liao, Tao Jiang, Feng Wang, Lin Li, Qingyang Hong

Transformer has achieved extraordinary performance in Natural Language Processing and Computer Vision tasks thanks to its powerful self-attention mechanism, and its variant Conformer has become a state-of-the-art architecture in the field of Automatic Speech Recognition (ASR).

Automatic Speech Recognition (ASR) +3

Spatial-aware Speaker Diarization for Multi-channel Multi-party Meeting

no code implementations • 24 Sep 2022 • Jie Wang, Yuji Liu, Binling Wang, Yiming Zhi, Song Li, Shipeng Xia, Jiayang Zhang, Feng Tong, Lin Li, Qingyang Hong

This paper describes a spatial-aware speaker diarization system for the multi-channel multi-party meeting.

Speaker Diarization

Deep Representation Decomposition for Rate-Invariant Speaker Verification

no code implementations • 28 May 2022 • Fuchuan Tong, Siqi Zheng, Haodong Zhou, Xingjia Xie, Qingyang Hong, Lin Li

While deep speaker embeddings have achieved promising performance for speaker verification, this advantage diminishes under speaking-style variability.

Speaker Verification

Graph Convolutional Network Based Semi-Supervised Learning on Multi-Speaker Meeting Data

no code implementations • 25 Apr 2022 • Fuchuan Tong, Siqi Zheng, Min Zhang, Yafeng Chen, Hongbin Suo, Qingyang Hong, Lin Li

In this work, we present a GCN-based approach for semi-supervised learning.

Clustering, Speaker Recognition

The xmuspeech system for multi-channel multi-party meeting transcription challenge

no code implementations • 11 Feb 2022 • Jie Wang, Yuji Liu, Binling Wang, Yiming Zhi, Song Li, Shipeng Xia, Jiayang Zhang, Lin Li, Qingyang Hong, Feng Tong

By applying the DMSNet-based OSD module, the DER of the cluster-based diarization system decreases significantly from 13.44% to 7.63%.

Speaker Diarization

XMUSPEECH System for VoxCeleb Speaker Recognition Challenge 2021

no code implementations • 6 Sep 2021 • Jie Wang, Fuchuang Tong, Zhicong Chen, Lin Li, Qingyang Hong, Haodong Zhou

This paper describes the XMUSPEECH speaker recognition and diarisation systems for the VoxCeleb Speaker Recognition Challenge 2021.

Speaker Recognition

OLR 2021 Challenge: Datasets, Rules and Baselines

no code implementations • 23 Jul 2021 • Binling Wang, Wenxuan Hu, Jing Li, Yiming Zhi, Zheng Li, Qingyang Hong, Lin Li, Dong Wang, Liming Song, Cheng Yang

In addition to the Language Identification (LID) tasks, multilingual Automatic Speech Recognition (ASR) tasks are introduced to the OLR 2021 Challenge for the first time.

Automatic Speech Recognition (ASR) +2

Oriental Language Recognition (OLR) 2020: Summary and Analysis

no code implementations • 5 Jul 2021 • Jing Li, Binling Wang, Yiming Zhi, Zheng Li, Lin Li, Qingyang Hong, Dong Wang

The fifth Oriental Language Recognition (OLR) Challenge focuses on language recognition in a variety of complex environments to promote its development.

Dialect Identification, valid

An Integrated Framework for Two-pass Personalized Voice Trigger

no code implementations • 30 Jun 2021 • Dexin Liao, Jing Li, Yiming Zhi, Song Li, Qingyang Hong, Lin Li

For the SV system, we propose a multi-task learning network in which the phonetic branch is trained with the character labels of the utterance and the speaker branch is trained with the speaker labels.

Keyword Spotting, Multi-Task Learning +2

Phoneme-aware and Channel-wise Attentive Learning for Text Dependent Speaker Verification

no code implementations • 25 Jun 2021 • Yan Liu, Zheng Li, Lin Li, Qingyang Hong

This paper proposes a multi-task learning network with phoneme-aware and channel-wise attentive learning strategies for text-dependent Speaker Verification (SV).

Multi-Task Learning, Text-Dependent Speaker Verification

AP20-OLR Challenge: Three Tasks and Their Baselines

no code implementations • 4 Jun 2020 • Zheng Li, Miao Zhao, Qingyang Hong, Lin Li, Zhiyuan Tang, Dong Wang, Li-Ming Song, Cheng Yang

Recipes for i-vector and x-vector systems, based on Kaldi and PyTorch, are also provided as baselines for the three tasks.

Dialect Identification
