Search Results for author: Xucheng Wan

Found 8 papers, 0 papers with code

XCB: an effective contextual biasing approach to bias cross-lingual phrases in speech recognition

no code implementations • 20 Aug 2024 • Xucheng Wan, Naijun Zheng, Kai Liu, Huan Zhou

Contextualized ASR models have been demonstrated to effectively improve the recognition accuracy of uncommon phrases when a predefined phrase list is available.

speech-recognition, Speech Recognition

An efficient text augmentation approach for contextualized Mandarin speech recognition

no code implementations • 14 Jun 2024 • Naijun Zheng, Xucheng Wan, Kai Liu, Ziqing Du, Zhou Huan

Although contextualized automatic speech recognition (ASR) systems are commonly used to improve the recognition of uncommon words, their effectiveness is hindered by the inherent limitations of speech-text data availability.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2

MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition

no code implementations • 6 May 2024 • Bingshen Mu, Yangze Li, Qijie Shao, Kun Wei, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie

Accents represent deviations from standard pronunciation norms, and the multi-task learning framework for simultaneous ASR and accent recognition (AR) has effectively addressed multi-accent scenarios, making it a prominent solution.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +4

Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder

no code implementations • 8 Apr 2024 • He Wang, Pengcheng Guo, Xucheng Wan, Huan Zhou, Lei Xie

Automatic lip-reading (ALR) aims to automatically transcribe spoken content from a speaker's silent lip motion captured in video.

Lipreading, Lip Reading, +1

X-SepFormer: End-to-end Speaker Extraction Network with Explicit Optimization on Speaker Confusion

no code implementations • 9 Mar 2023 • Kai Liu, Ziqing Du, Xucheng Wan, Huan Zhou

To mitigate the pressing speaker confusion (SC) issue, we reformulate the training objective and propose two novel loss schemes that explore a reconstruction-improvement metric defined at the small-chunk level and leverage the distribution information associated with that metric.

Speech Extraction

Improving Target Speaker Extraction with Sparse LDA-transformed Speaker Embeddings

no code implementations • 16 Jan 2023 • Kai Liu, Xucheng Wan, Ziqing Du, Huan Zhou

As a practical alternative to speech separation, target speaker extraction (TSE) aims to extract the speech of the desired speaker using an additional speaker cue extracted from that speaker.

Speaker Verification, Speech Separation, +1

Joint Speech Activity and Overlap Detection with Multi-Exit Architecture

no code implementations • 24 Sep 2022 • Ziqing Du, Kai Liu, Xucheng Wan, Huan Zhou

Overlapped speech detection (OSD) is critical for speech applications in multi-party conversation scenarios.

Action Detection, Activity Detection, +1

Speech Enhancement with Perceptually-motivated Optimization and Dual Transformations

no code implementations • 24 Sep 2022 • Xucheng Wan, Kai Liu, Ziqing Du, Huan Zhou

To validate the effectiveness of our proposed model, extensive experiments are conducted on the DNS2020 dataset.

Speech Enhancement
