Search Results for author: HaiHua Xu

Found 13 papers, 2 papers with code

Internal Language Model Estimation based Adaptive Language Model Fusion for Domain Adaptation

no code implementations • 2 Nov 2022 • Rao Ma, Xiaobo Wu, Jin Qiu, Yanan Qin, HaiHua Xu, Peihao Wu, Zejun Ma

The proposed method achieves significantly better performance on the target test sets while incurring minimal performance degradation on the general test set, compared with both shallow and ILME-based LM fusion methods.

Domain Adaptation · Language Modelling
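
As a quick orientation, here is a minimal sketch of how a shallow-fusion score and an ILME-based fusion score differ per token during beam-search decoding; the function names and weight values are illustrative assumptions, and the paper's adaptive weighting scheme is not reproduced here.

    # Hypothetical per-token scoring during beam search; weights are assumptions.
    def shallow_fusion_score(log_p_asr, log_p_ext_lm, lam_ext=0.3):
        # Shallow fusion: ASR score plus a weighted external LM score.
        return log_p_asr + lam_ext * log_p_ext_lm

    def ilme_fusion_score(log_p_asr, log_p_ext_lm, log_p_ilm,
                          lam_ext=0.3, lam_ilm=0.2):
        # ILME-based fusion: additionally subtract an estimate of the E2E
        # model's internal LM so the source-domain prior is not double counted
        # when the external, target-domain LM is added.
        return log_p_asr + lam_ext * log_p_ext_lm - lam_ilm * log_p_ilm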

Speech-text based multi-modal training with bidirectional attention for improved speech recognition

1 code implementation • 1 Nov 2022 • Yuhang Yang, HaiHua Xu, Hao Huang, Eng Siong Chng, Sheng Li

To let a state-of-the-art end-to-end ASR model enjoy both data efficiency and much more unpaired text data through multi-modal training, two problems need to be addressed: 1) the synchronicity of feature sampling rates between speech and language (i.e., text data); 2) the homogeneity of the representations learned by the two encoders.

Speech Recognition
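
Below is a hedged sketch of bidirectional cross-attention between a speech encoder and a text encoder, which targets both issues by letting each modality attend to the other; the module layout, dimensions, and consistency loss are assumptions for illustration, not the paper's exact design.

    import torch.nn as nn
    import torch.nn.functional as F

    class BidirectionalCrossAttention(nn.Module):
        """Sketch only: both encoders are assumed to output width-d features."""

        def __init__(self, d=256, heads=4):
            super().__init__()
            self.speech_to_text = nn.MultiheadAttention(d, heads, batch_first=True)
            self.text_to_speech = nn.MultiheadAttention(d, heads, batch_first=True)

        def forward(self, speech, text):
            # speech: (B, T_s, d) encoder frames; text: (B, T_t, d) token states.
            # Attending across modalities resamples the speech frame rate onto
            # token positions and pulls the two representation spaces together.
            text_like, _ = self.speech_to_text(text, speech, speech)
            speech_like, _ = self.text_to_speech(speech, text, text)
            # Simple consistency loss encouraging homogeneous representations.
            return F.mse_loss(text_like, text) + F.mse_loss(speech_like, speech)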

Random Utterance Concatenation Based Data Augmentation for Improving Short-video Speech Recognition

no code implementations • 28 Oct 2022 • Yist Y. Lin, Tao Han, HaiHua Xu, Van Tung Pham, Yerbolat Khassanov, Tze Yuang Chong, Yi He, Lu Lu, Zejun Ma

One limitation of the end-to-end automatic speech recognition (ASR) framework is that its performance is compromised when train and test utterance lengths are mismatched.

Action Detection · Activity Detection +4
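
A minimal sketch of random utterance concatenation as an offline augmentation pass is shown below; the field names, concatenation probability, and maximum number of concatenated utterances are assumptions, not the paper's settings.

    import random

    def random_concat(dataset, max_concat=3, p_concat=0.5):
        # dataset: list of dicts like {"samples": [float, ...], "text": "..."}.
        augmented = []
        for utt in dataset:
            if random.random() < p_concat:
                k = random.randint(2, max_concat)
                picks = [utt] + random.sample(dataset, k - 1)
                augmented.append({
                    # Concatenate waveforms and transcripts to create longer
                    # training utterances that better match long test inputs.
                    "samples": sum((u["samples"] for u in picks), []),
                    "text": " ".join(u["text"] for u in picks),
                })
            else:
                augmented.append(utt)
        return augmented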

Reducing Language confusion for Code-switching Speech Recognition with Token-level Language Diarization

1 code implementation • 26 Oct 2022 • Hexin Liu, HaiHua Xu, Leibny Paola Garcia, Andy W. H. Khong, Yi He, Sanjeev Khudanpur

The comparison of the proposed methods indicates that incorporating language information is more effective than disentangling it for reducing language confusion in code-switching (CS) speech.

Automatic Speech Recognition (ASR) +1
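
The sketch below shows one way to attach a token-level language-ID head to an E2E ASR decoder and train it jointly, i.e. the "incorporating language information" direction; the dimensions, number of languages, and loss weight are assumptions rather than the paper's configuration.

    import torch.nn as nn
    import torch.nn.functional as F

    class TokenLanguageHead(nn.Module):
        def __init__(self, d_model=256, num_langs=2):
            super().__init__()
            self.classifier = nn.Linear(d_model, num_langs)

        def forward(self, decoder_states, lang_labels):
            # decoder_states: (B, U, d_model); lang_labels: (B, U) holding one
            # language id per output token (e.g. 0 = Mandarin, 1 = English).
            logits = self.classifier(decoder_states)
            return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                                   lang_labels.reshape(-1))

    # Joint objective, with an illustrative weight:
    #   loss = asr_loss + 0.3 * lid_loss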

Intermediate-layer output Regularization for Attention-based Speech Recognition with Shared Decoder

no code implementations • 9 Jul 2022 • Jicheng Zhang, Yizhou Peng, HaiHua Xu, Yi He, Eng Siong Chng, Hao Huang

Intermediate layer output (ILO) regularization by means of multitask training on the encoder side has been shown to be an effective approach to improving results across a wide range of end-to-end ASR frameworks.

Speech Recognition
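
A hedged sketch of the ILO idea with a single shared decoder follows; which intermediate layer is tapped and how the two losses are weighted are assumptions.

    def ilo_regularized_loss(encoder_layers, decoder_loss_fn, targets, weight=0.3):
        # encoder_layers: list of per-layer encoder outputs, each (B, T, d).
        # decoder_loss_fn(enc_out, targets): loss of the one shared attention
        # decoder when fed a given encoder representation.
        final_out = encoder_layers[-1]
        mid_out = encoder_layers[len(encoder_layers) // 2]  # assumed tap point
        main_loss = decoder_loss_fn(final_out, targets)
        ilo_loss = decoder_loss_fn(mid_out, targets)
        return (1 - weight) * main_loss + weight * ilo_loss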

Internal Language Model Estimation based Language Model Fusion for Cross-Domain Code-Switching Speech Recognition

no code implementations • 9 Jul 2022 • Yizhou Peng, Yufei Liu, Jicheng Zhang, HaiHua Xu, Yi He, Hao Huang, Eng Siong Chng

More importantly, we train an end-to-end (E2E) speech recognition model by means of merging two monolingual data sets and observe the efficacy of the proposed ILME-based LM fusion for CSSR.

Language Modelling · Speech Recognition +1

Minimum word error training for non-autoregressive Transformer-based code-switching ASR

no code implementations • 7 Oct 2021 • Yizhou Peng, Jicheng Zhang, HaiHua Xu, Hao Huang, Eng Siong Chng

The non-autoregressive end-to-end ASR framework is potentially well suited to the code-switching recognition task, thanks to its inherent property that the present output token is independent of historical ones.
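
Below is an illustrative minimum word error (MWE/MWER) style loss computed over an N-best list; subtracting the average error is a common variance-reduction baseline and an assumption here, and how the N-best list is produced by the non-autoregressive model is not shown.

    import torch

    def mwe_loss(nbest_log_probs, nbest_word_errors):
        # nbest_log_probs: (N,) model log-probabilities of the N-best hypotheses.
        # nbest_word_errors: (N,) word-error counts of each hypothesis against
        # the reference transcript.
        posts = torch.softmax(nbest_log_probs, dim=0)  # renormalize over N-best
        errors = nbest_word_errors.float()
        # Expected word error with the mean subtracted as a baseline.
        return (posts * (errors - errors.mean())).sum()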

E2E-based Multi-task Learning Approach to Joint Speech and Accent Recognition

no code implementations • 15 Jun 2021 • Jicheng Zhang, Yizhou Peng, Pham Van Tung, HaiHua Xu, Hao Huang, Eng Siong Chng

In this paper, we propose a single multi-task learning framework to perform End-to-End (E2E) speech recognition (ASR) and accent recognition (AR) simultaneously.

Multi-Task Learning · Speech Recognition +1
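
The sketch below gives the general shape of such a multi-task setup: a shared encoder feeds an ASR branch and an utterance-level accent classifier, and the two losses are combined; the pooling, head size, and loss weight are assumptions.

    import torch.nn as nn
    import torch.nn.functional as F

    class AccentHead(nn.Module):
        def __init__(self, d_model=256, num_accents=8):
            super().__init__()
            self.classifier = nn.Linear(d_model, num_accents)

        def forward(self, encoder_out, accent_labels):
            # encoder_out: (B, T, d_model); mean-pool over time for an
            # utterance-level accent prediction.
            pooled = encoder_out.mean(dim=1)
            return F.cross_entropy(self.classifier(pooled), accent_labels)

    # Joint objective, with an illustrative weight:
    #   loss = asr_loss + 0.5 * accent_loss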

Multilingual Approach to Joint Speech and Accent Recognition with DNN-HMM Framework

no code implementations • 22 Oct 2020 • Yizhou Peng, Jicheng Zhang, Haobo Zhang, HaiHua Xu, Hao Huang, Eng Siong Chng

Experimental results on an 8-accent English speech recognition task show that both methods can yield WERs close to those of conventional ASR systems that completely ignore the accent, as well as the desired AR accuracy.

Speech Recognition +1
