no code implementations • 12 Jun 2023 • Anderson R. Avila, Mehdi Rezagholizadeh, Chao Xing
In this work, we investigate the impact of ASR error propagation on state-of-the-art NLU systems based on pre-trained language models (PLMs), such as BERT and RoBERTa.
Automatic Speech Recognition (ASR) +3
1 code implementation • ICCV 2023 • Xinlin Li, Bang Liu, Rui Heng Yang, Vanessa Courville, Chao Xing, Vahid Partovi Nia
We further propose a sign-scale decomposition design to enhance training efficiency and a low-variance random initialization strategy to improve the model's transfer learning performance.
no code implementations • 15 Jul 2022 • Anderson R. Avila, Khalil Bibi, Rui Heng Yang, Xinlin Li, Chao Xing, Xiao Chen
Deep neural networks (DNNs) have achieved impressive success in multiple domains.
no code implementations • 21 May 2022 • Abbas Ghaddar, Yimeng Wu, Sunyam Bagga, Ahmad Rashid, Khalil Bibi, Mehdi Rezagholizadeh, Chao Xing, Yasheng Wang, Duan Xinyu, Zhefeng Wang, Baoxing Huai, Xin Jiang, Qun Liu, Philippe Langlais
There is a growing body of work in recent years to develop pre-trained language models (PLMs) for the Arabic language.
1 code implementation • 8 Dec 2021 • Abbas Ghaddar, Yimeng Wu, Ahmad Rashid, Khalil Bibi, Mehdi Rezagholizadeh, Chao Xing, Yasheng Wang, Duan Xinyu, Zhefeng Wang, Baoxing Huai, Xin Jiang, Qun Liu, Philippe Langlais
Language-specific pre-trained models have proven to be more accurate than multilingual ones in monolingual evaluation settings, and Arabic is no exception.
no code implementations • 29 Sep 2021 • Chao Xing, Dong Wang, LiRong Dai, Qun Liu, Anderson Avila
Overparameterized transformer-based architectures have shown remarkable performance in recent years, achieving state-of-the-art results in speech processing tasks such as speech recognition, speech synthesis, keyword spotting, and speech enhancement, among others.
no code implementations • 20 May 2021 • Nihal Potdar, Anderson R. Avila, Chao Xing, Dong Wang, Yiran Cao, Xiao Chen
In this paper, we propose a streaming end-to-end framework that can process multiple intentions in an online and incremental way.
no code implementations • 17 Mar 2021 • Md Akmal Haidar, Chao Xing, Mehdi Rezagholizadeh
End-to-end automatic speech recognition (ASR), unlike conventional ASR, lacks dedicated modules for learning semantic representations from the speech encoder.
Ranked #12 on Speech Recognition on LibriSpeech test-clean
Automatic Speech Recognition (ASR) +3
2 code implementations • 6 Nov 2016 • Bin-Bin Gao, Chao Xing, Chen-Wei Xie, Jianxin Wu, Xin Geng
However, it is difficult to collect sufficient training images with precise labels in domains such as apparent age estimation, head pose estimation, multi-label classification, and semantic segmentation.
Ranked #1 on Head Pose Estimation on BJUT-3D
no code implementations • CVPR 2016 • Chao Xing, Xin Geng, Hui Xue
To learn this general model family, the paper employs a method called Logistic Boosting Regression (LogitBoost), which can be viewed as additive weighted function regression from a statistical standpoint.
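To make the additive-regression idea concrete, here is a minimal, self-contained sketch of boosting-style additive function regression: each stage fits a simple base learner (a regression stump, an assumption for illustration) to the current residuals and adds it to the ensemble with a learning-rate weight. This is not the paper's exact LogitBoost formulation for label distribution learning, only the general additive mechanism it builds on.

```python
def fit_stump(x, residuals):
    """Find the threshold split on x whose left/right means best fit the residuals."""
    best = None
    for t in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lm, rm)
    _, t, lm, rm = best
    return lambda xi, t=t, lm=lm, rm=rm: lm if xi <= t else rm

def boost(x, y, n_stages=50, lr=0.5):
    """Build an additive model F(x) = F0 + lr * sum of stage outputs."""
    f0 = sum(y) / len(y)
    stumps = []
    preds = [f0] * len(x)
    for _ in range(n_stages):
        residuals = [yi - p for yi, p in zip(y, preds)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        preds = [p + lr * stump(xi) for p, xi in zip(preds, x)]
    return lambda xi: f0 + lr * sum(s(xi) for s in stumps)

# Toy 1-D data with a step-like target, purely for illustration.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.0, 0.0, 1.0, 1.0, 3.0, 3.0]
model = boost(x, y)
```

After enough stages the additive model fits the training targets closely, which is the behavior the weighted-regression view of LogitBoost relies on.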
no code implementations • 21 Apr 2016 • Qixin Wang, Tianyi Luo, Dong Wang, Chao Xing
Learning and generating Chinese poems is a charming yet challenging task.
no code implementations • 20 Oct 2015 • Lantian Li, Dong Wang, Chao Xing, Thomas Fang Zheng
Probabilistic linear discriminant analysis (PLDA) is a popular normalization approach for the i-vector model, and has delivered state-of-the-art performance in speaker recognition.
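As a rough illustration of how PLDA is used for verification scoring, the sketch below computes a two-covariance PLDA log-likelihood ratio for a pair of i-vectors, under simplifying assumptions (zero mean, diagonal between- and within-speaker covariances) that are not the paper's exact setup: the same-speaker hypothesis models each dimension of the pair with a correlated 2x2 Gaussian, the different-speaker hypothesis with independent Gaussians.

```python
import math

def plda_llr(w1, w2, sigma_b, sigma_w):
    """Two-covariance PLDA log-likelihood ratio for a trial pair.

    Assumes zero-mean i-vectors and per-dimension (diagonal)
    between-speaker variances sigma_b and within-speaker variances
    sigma_w -- a simplified, illustrative scoring rule.
    """
    score = 0.0
    for x, y, sb, sw in zip(w1, w2, sigma_b, sigma_w):
        st = sb + sw  # total variance per dimension
        # Same-speaker hypothesis: 2x2 Gaussian with off-diagonal sb.
        det_s = st * st - sb * sb
        q_s = (st * (x * x + y * y) - 2 * sb * x * y) / det_s
        ll_same = -0.5 * (math.log(det_s) + q_s)
        # Different-speaker hypothesis: the two i-vectors are independent.
        ll_diff = -0.5 * (2 * math.log(st) + (x * x + y * y) / st)
        score += ll_same - ll_diff
    return score

# Hypothetical trial: a close pair should outscore a mismatched pair.
sb, sw = [1.0, 1.0], [0.1, 0.1]
s_same = plda_llr([1.0, 0.5], [1.1, 0.4], sb, sw)
s_diff = plda_llr([1.0, 0.5], [-1.0, -0.6], sb, sw)
```

A positive score favors the same-speaker hypothesis; in practice the full PLDA model also estimates the covariances from data rather than assuming them.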
no code implementations • 20 Oct 2015 • Lantian Li, Dong Wang, Chao Xing, Kaimin Yu, Thomas Fang Zheng
The popular i-vector model represents speakers as low-dimensional continuous vectors (i-vectors), and hence provides a form of continuous speaker embedding.