Search Results for author: Xucheng Wan

Found 5 papers, 0 papers with code

Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder

no code implementations • 8 Apr 2024 • He Wang, Pengcheng Guo, Xucheng Wan, Huan Zhou, Lei Xie

Automatic lip-reading (ALR) aims to automatically transcribe spoken content from a speaker's silent lip motion captured in video.

Lipreading, Lip Reading, +1

X-SepFormer: End-to-end Speaker Extraction Network with Explicit Optimization on Speaker Confusion

no code implementations • 9 Mar 2023 • Kai Liu, Ziqing Du, Xucheng Wan, Huan Zhou

To mitigate the speaker confusion (SC) issue, we reformulate the training objective and propose two novel loss schemes that exploit a reconstruction-improvement metric defined at the small-chunk level and leverage the distribution information associated with that metric.

Speech Extraction

Improving Target Speaker Extraction with Sparse LDA-transformed Speaker Embeddings

no code implementations • 16 Jan 2023 • Kai Liu, Xucheng Wan, Ziqing Du, Huan Zhou

As a practical alternative to speech separation, target speaker extraction (TSE) aims to extract the speech of the desired speaker using an additional speaker cue extracted from that speaker.

Speaker Verification, Speech Separation, +1

Joint Speech Activity and Overlap Detection with Multi-Exit Architecture

no code implementations • 24 Sep 2022 • Ziqing Du, Kai Liu, Xucheng Wan, Huan Zhou

Overlapped speech detection (OSD) is critical for speech applications in multi-party conversation scenarios.

Action Detection, Activity Detection, +1

Speech Enhancement with Perceptually-motivated Optimization and Dual Transformations

no code implementations • 24 Sep 2022 • Xucheng Wan, Kai Liu, Ziqing Du, Huan Zhou

To validate the effectiveness of our proposed model, extensive experiments are conducted on the DNS2020 dataset.

Speech Enhancement
