no code implementations • 25 Jun 2024 • Van Tung Pham, Yist Lin, Tao Han, Wei Li, Jun Zhang, Lu Lu, Yuxuan Wang
Finally, we explore training and inference methods to mitigate high insertion errors.
no code implementations • 14 Dec 2023 • Yi Guo, Yiqian He, Xiaoyang Li, Haotong Qin, Van Tung Pham, Yang Zhang, Shouda Liu
Knowledge Distillation (KD) emerges as one of the most promising compression technologies to run advanced deep neural networks on resource-limited devices.
no code implementations • 28 Oct 2022 • Yist Y. Lin, Tao Han, HaiHua Xu, Van Tung Pham, Yerbolat Khassanov, Tze Yuang Chong, Yi He, Lu Lu, Zejun Ma
One of limitations in end-to-end automatic speech recognition (ASR) framework is its performance would be compromised if train-test utterance lengths are mismatched.
no code implementations • 22 Jul 2021 • Duo Ma, Nana Hou, Van Tung Pham, HaiHua Xu, Eng Siong Chng
One of the advantage of the proposed method is that the entire system can be trained from scratch.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
+3
no code implementations • 13 Jan 2021 • Manav Kaushik, Van Tung Pham, Eng Siong Chng
In this work, we propose a novel approach of using attention mechanism to build an end-to-end architecture for height and age estimation.
no code implementations • 21 May 2020 • Zhiping Zeng, Van Tung Pham, Hai-Hua Xu, Yerbolat Khassanov, Eng Siong Chng, Chongjia Ni, Bin Ma
To this end, we extend our prior work [1], and propose a hybrid Transformer-LSTM based architecture.
no code implementations • 18 May 2020 • Tingzhi Mao, Yerbolat Khassanov, Van Tung Pham, Hai-Hua Xu, Hao Huang, Eng Siong Chng
In this paper, we present a series of complementary approaches to improve the recognition of underrepresented named entities (NE) in hybrid ASR systems without compromising overall word error rate performance.
no code implementations • 25 Nov 2019 • Van Tung Pham, Hai-Hua Xu, Yerbolat Khassanov, Zhiping Zeng, Eng Siong Chng, Chongjia Ni, Bin Ma, Haizhou Li
To address this problem, in this work, we propose a new architecture that separates the decoder subnet from the encoder output.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
+4
1 code implementation • 8 Apr 2019 • Yerbolat Khassanov, Zhiping Zeng, Van Tung Pham, Hai-Hua Xu, Eng Siong Chng
However, learning the representation of rare words is a challenging problem causing the NLM to produce unreliable probability estimates.
no code implementations • 8 Apr 2019 • Yerbolat Khassanov, Hai-Hua Xu, Van Tung Pham, Zhiping Zeng, Eng Siong Chng, Chongjia Ni, Bin Ma
The lack of code-switch training data is one of the major concerns in the development of end-to-end code-switching automatic speech recognition (ASR) models.
Automatic Speech Recognition
Automatic Speech Recognition (ASR)
+1
1 code implementation • 1 Nov 2018 • Zhiping Zeng, Yerbolat Khassanov, Van Tung Pham, Hai-Hua Xu, Eng Siong Chng, Haizhou Li
Code-switching (CS) refers to a linguistic phenomenon where a speaker uses different languages in an utterance or between alternating utterances.
no code implementations • MediaEval 2015 Workshop 2015 • Jingyong Hou, Van Tung Pham, Cheung-Chi Leung, Lei Wang, HaiHua Xu, Hang Lv, Lei Xie, Zhonghua Fu, Chongjia Ni, Xiong Xiao, Hongjie Chen, Shaofei Zhang, Sining Sun, Yougen Yuan, Pengcheng Li, Tin Lay Nwe, Sunil Sivadas, Bin Ma, Eng Siong Chng, Haizhou Li
This paper describes the system developed by the NNI team for the Query-by-Example Search on Speech Task (QUESST) in the MediaEval 2015 evaluation.
Ranked #9 on
Keyword Spotting
on QUESST