Search Results for author: Xianrui Zheng

Found 7 papers, 1 paper with code

Conditional Diffusion Model for Target Speaker Extraction

no code implementations • 7 Oct 2023 • Theodor Nguyen, Guangzhi Sun, Xianrui Zheng, Chao Zhang, Philip C. Woodland

For the reverse-time process, a parametrised score function is conditioned on a target speaker embedding to extract the target speaker from the mixture of sources.

Target Speaker Extraction

Can Contextual Biasing Remain Effective with Whisper and GPT-2?

1 code implementation • 2 Jun 2023 • Guangzhi Sun, Xianrui Zheng, Chao Zhang, Philip C. Woodland

End-to-end automatic speech recognition (ASR) and large language models, such as Whisper and GPT-2, have recently been scaled to use vast amounts of training data.

Automatic Speech Recognition (ASR)

Tandem Multitask Training of Speaker Diarisation and Speech Recognition for Meeting Transcription

no code implementations • 8 Jul 2022 • Xianrui Zheng, Chao Zhang, Philip C. Woodland

Self-supervised-learning-based pre-trained models for speech data, such as Wav2Vec 2.0 (W2V2), have become the backbone of many speech tasks.

Action Detection, Activity Detection

Multi-turn RNN-T for streaming recognition of multi-party speech

no code implementations • 19 Dec 2021 • Ilya Sklyar, Anna Piunova, Xianrui Zheng, YuLan Liu

Second, we propose a novel multi-turn RNN-T (MT-RNN-T) model with an overlap-based target arrangement strategy that generalizes to an arbitrary number of speakers without changes in the model architecture.

Automatic Speech Recognition (ASR)

Adapting GPT, GPT-2 and BERT Language Models for Speech Recognition

no code implementations • 29 Jul 2021 • Xianrui Zheng, Chao Zhang, Philip C. Woodland

Furthermore, on the AMI corpus, the proposed conversion for language prior probabilities enables BERT to obtain an extra 3% relative WERR, and the combination of BERT, GPT and GPT-2 results in further improvements.

Automatic Speech Recognition (ASR)
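
The "relative WERR" quoted for the AMI corpus above is a relative word error rate reduction, i.e. the drop in WER expressed as a fraction of the baseline WER. A minimal sketch of that arithmetic follows; the function name and the WER values are hypothetical and are not taken from the paper.

```python
def relative_werr(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction as a fraction of the baseline WER."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical WERs in percent, chosen only to illustrate a 3% relative reduction.
baseline = 20.0
rescored = 19.4
print(f"relative WERR: {relative_werr(baseline, rescored):.1%}")
```

Note that a 3% relative WERR is much smaller than a 3% absolute reduction: in this made-up example the WER moves by only 0.6 points.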

Using Synthetic Audio to Improve The Recognition of Out-Of-Vocabulary Words in End-To-End ASR Systems

no code implementations • 23 Nov 2020 • Xianrui Zheng, YuLan Liu, Deniz Gunceler, Daniel Willett

Different regularisation techniques are explored and the best performance is achieved by fine-tuning the RNN-T on both original training data and extra synthetic data with elastic weight consolidation (EWC) applied on the encoder.

Automatic Speech Recognition (ASR)
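
Elastic weight consolidation, named in the entry above, is commonly formulated as a quadratic penalty that keeps fine-tuned parameters close to their pre-fine-tuning values, weighted by a diagonal Fisher information estimate. The sketch below shows only that generic penalty term. It is an assumption about the general technique, not the paper's implementation, and all names (`ewc_penalty`, `strength`, the toy values) are hypothetical.

```python
from typing import Sequence


def ewc_penalty(
    params: Sequence[float],
    anchor_params: Sequence[float],
    fisher_diag: Sequence[float],
    strength: float,
) -> float:
    """Generic EWC penalty: (strength / 2) * sum_i F_i * (theta_i - theta*_i)^2."""
    if not (len(params) == len(anchor_params) == len(fisher_diag)):
        raise ValueError("all parameter sequences must have the same length")
    weighted_sq_dist = sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher_diag)
    )
    return 0.5 * strength * weighted_sq_dist


# Toy values: only the first parameter has moved away from its anchor.
penalty = ewc_penalty(
    params=[1.0, 2.0],
    anchor_params=[0.0, 2.0],
    fisher_diag=[2.0, 5.0],
    strength=4.0,
)
print(penalty)
```

In a setup like the one described, such a term would be added to the fine-tuning loss for the encoder parameters only, so that parameters the Fisher estimate marks as important change less.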
