no code implementations • 14 Dec 2023 • Keqi Deng, Philip C. Woodland
Recently, connectionist temporal classification (CTC)-based end-to-end (E2E) automatic speech recognition (ASR) models have achieved impressive results, especially with the development of self-supervised learning.
no code implementations • 19 Nov 2023 • Keqi Deng, Philip C. Woodland
An Auto-regressive Integrate-and-Fire (AIF) mechanism is proposed to generate the label-level encoder representation while retaining low-latency operation suitable for streaming.
no code implementations • 25 Aug 2023 • Keqi Deng, Philip C. Woodland
Although end-to-end (E2E) trainable automatic speech recognition (ASR) has shown great success by jointly learning acoustic and linguistic information, it still suffers from the effect of domain shifts, thus limiting potential applications.
no code implementations • 6 Jul 2023 • Keqi Deng, Philip C. Woodland
Hence blank tokens are no longer needed and the prediction network can be easily adapted using text data.
no code implementations • 16 Feb 2023 • Keqi Deng, Philip C. Woodland
End-to-end (E2E) automatic speech recognition (ASR) implicitly learns the token sequence distribution of paired audio-transcript training data.
no code implementations • 6 Jul 2022 • Zehan Li, Haoran Miao, Keqi Deng, Gaofeng Cheng, Sanli Tian, Ta Li, Yonghong Yan
Firstly, we introduce a real-time encoder states revision strategy to modify previous states.
no code implementations • 19 Apr 2022 • Keqi Deng, Shinji Watanabe, Jiatong Shi, Siddhant Arora
Although Transformers have gained success in several speech processing tasks like spoken language understanding (SLU) and speech translation (ST), achieving online processing while keeping competitive performance is still essential for real-world interaction.
1 code implementation • 22 Feb 2022 • Keqi Deng, Songjun Cao, Yike Zhang, Long Ma, Gaofeng Cheng, Ji Xu, Pengyuan Zhang
Recently, end-to-end automatic speech recognition models based on connectionist temporal classification (CTC) have achieved impressive results, especially when fine-tuned from wav2vec2.0 models.
no code implementations • 25 Jan 2022 • Keqi Deng, Zehui Yang, Shinji Watanabe, Yosuke Higuchi, Gaofeng Cheng, Pengyuan Zhang
The proposed NAR model significantly surpasses previous NAR systems on the AISHELL-1 benchmark and shows potential for English tasks.
no code implementations • 14 Dec 2021 • Keqi Deng, Songjun Cao, Yike Zhang, Long Ma
In our framework, the encoder is initialized with a pretrained AM (wav2vec2.0).
no code implementations • 15 Sep 2021 • Keqi Deng, Songjun Cao, Long Ma
For the former task, an end-to-end (E2E) architecture based on a standard deviation constraint loss (SDC-loss) is proposed to identify accents within the same language.