Search Results for author: Wang Geng

Found 4 papers, 1 paper with code

Boosting Code-Switching ASR with Mixture of Experts Enhanced Speech-Conditioned LLM

no code implementations • 24 Sep 2024 • Fengrun Zhang, Wang Geng, Hukai Huang, Yahui Shan, Cheng Yi, He Qu

To further enhance the collaboration of multiple experts and leverage the understanding capabilities of LLM, we propose a two-stage progressive training strategy: 1) The connector is unfrozen and trained with language-specialized experts to map speech representations to the text space.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +6

VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing

no code implementations • 11 Aug 2024 • Chunyu Qiang, Wang Geng, Yi Zhao, Ruibo Fu, Tao Wang, Cheng Gong, Tianrui Wang, Qiuyu Liu, Jiangyan Yi, Zhengqi Wen, Chen Zhang, Hao Che, Longbiao Wang, Jianwu Dang, JianHua Tao

For tasks such as text-to-speech (TTS), voice conversion (VC), and automatic speech recognition (ASR), a cross-modal fine-grained (frame-level) sequence representation is desired, emphasizing the semantic content of the text modality while de-emphasizing the paralinguistic information of the speech modality.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +5

Dynamic Multi-scale Convolution for Dialect Identification

1 code implementation • 2 Aug 2021 • Tianlong Kong, Shouyi Yin, Dawei Zhang, Wang Geng, Xin Wang, Dandan Song, Jinwen Huang, Huiyu Shi, Xiaorui Wang

To address this issue, we propose a new architecture, named dynamic multi-scale convolution, which consists of dynamic kernel convolution, local multi-scale learning, and global multi-scale pooling.

Dialect Identification
