Search Results for author: Xianrui Zheng

Found 7 papers, 1 paper with code

Conditional Diffusion Model for Target Speaker Extraction

no code implementations • 7 Oct 2023 • Theodor Nguyen, Guangzhi Sun, Xianrui Zheng, Chao Zhang, Philip C. Woodland

For the reverse-time process, a parametrised score function is conditioned on a target speaker embedding to extract the target speaker from the mixture of sources.

Target Speaker Extraction

Can Contextual Biasing Remain Effective with Whisper and GPT-2?

1 code implementation • 2 Jun 2023 • Guangzhi Sun, Xianrui Zheng, Chao Zhang, Philip C. Woodland

End-to-end automatic speech recognition (ASR) and large language models, such as Whisper and GPT-2, have recently been scaled to use vast amounts of training data.

Automatic Speech Recognition (ASR) +1

Self-Supervised Learning-Based Source Separation for Meeting Data

no code implementations • 3 Apr 2023 • Yuang Li, Xianrui Zheng, Philip C. Woodland

In this paper, seven SSL models were compared on both simulated and real-world corpora.

Automatic Speech Recognition (ASR) +2

Tandem Multitask Training of Speaker Diarisation and Speech Recognition for Meeting Transcription

no code implementations • 8 Jul 2022 • Xianrui Zheng, Chao Zhang, Philip C. Woodland

Self-supervised-learning-based pre-trained models for speech data, such as Wav2Vec 2.0 (W2V2), have become the backbone of many speech tasks.

Action Detection, Activity Detection +3

Multi-turn RNN-T for streaming recognition of multi-party speech

no code implementations • 19 Dec 2021 • Ilya Sklyar, Anna Piunova, Xianrui Zheng, YuLan Liu

Second, we propose a novel multi-turn RNN-T (MT-RNN-T) model with an overlap-based target arrangement strategy that generalizes to an arbitrary number of speakers without changes in the model architecture.

Automatic Speech Recognition (ASR) +1

Adapting GPT, GPT-2 and BERT Language Models for Speech Recognition

no code implementations • 29 Jul 2021 • Xianrui Zheng, Chao Zhang, Philip C. Woodland

Furthermore, on the AMI corpus, the proposed conversion for language prior probabilities enables BERT to obtain an extra 3% relative WERR, and the combination of BERT, GPT and GPT-2 results in further improvements.

Automatic Speech Recognition (ASR) +2

Using Synthetic Audio to Improve The Recognition of Out-Of-Vocabulary Words in End-To-End ASR Systems

no code implementations • 23 Nov 2020 • Xianrui Zheng, YuLan Liu, Deniz Gunceler, Daniel Willett

Different regularisation techniques are explored and the best performance is achieved by fine-tuning the RNN-T on both original training data and extra synthetic data with elastic weight consolidation (EWC) applied on the encoder.

Automatic Speech Recognition (ASR) +1
