no code implementations • 14 Oct 2022 • Zhipeng Shen, Shiyu Zhou, Jianglong Yu
Combining machine learning and convex optimization, this paper presents a real-time computational guidance method for the 6-degrees-of-freedom powered landing guidance problem.
1 code implementation • 30 Jan 2022 • Minglun Han, Linhao Dong, Zhenlin Liang, Meng Cai, Shiyu Zhou, Zejun Ma, Bo Xu
Nowadays, most methods in end-to-end contextual speech recognition bias the recognition process towards contextual knowledge.
2 code implementations • 1 Jul 2021 • Jing Liu, Xinxin Zhu, Fei Liu, Longteng Guo, Zijia Zhao, Mingzhen Sun, Weining Wang, Hanqing Lu, Shiyu Zhou, Jiajun Zhang, Jinqiao Wang
In this paper, we propose an Omni-perception Pre-Trainer (OPT) for cross-modal understanding and generation, by jointly modeling visual, text and audio resources.
Ranked #1 on Image Retrieval on Localized Narratives
no code implementations • 17 Jan 2021 • Cheng Yi, Shiyu Zhou, Bo Xu
In this work, we fuse a pre-trained acoustic encoder (wav2vec 2.0) and a pre-trained linguistic encoder (BERT) into an end-to-end ASR model.
Automatic Speech Recognition (ASR) +2
no code implementations • 22 Dec 2020 • Cheng Yi, Jianzhong Wang, Ning Cheng, Shiyu Zhou, Bo Xu
To verify its universality over languages, we apply pre-trained models to solve low-resource speech recognition tasks in various spoken languages.
no code implementations • 17 Dec 2020 • Minglun Han, Linhao Dong, Shiyu Zhou, Bo Xu
End-to-end (E2E) models have achieved promising results on multiple speech recognition benchmarks, and shown the potential to become the mainstream.
no code implementations • 11 Dec 2020 • Zhiyun Fan, Meng Li, Shiyu Zhou, Bo Xu
Then we demonstrate the effectiveness of wav2vec 2.0 on the two tasks respectively.
no code implementations • 6 Nov 2020 • Salman Jahani, Shiyu Zhou, Dharmaraj Veeramani, Jeff Schmidt
Prediction of events such as part replacement and failure events plays a critical role in reliability engineering.
no code implementations • 20 May 2020 • Linhao Dong, Cheng Yi, Jianzong Wang, Shiyu Zhou, Shuang Xu, Xueli Jia, Bo Xu
End-to-end models are gaining wider attention in the field of automatic speech recognition (ASR).
Automatic Speech Recognition (ASR) +1
no code implementations • 2 Jan 2020 • Zhiyun Fan, Jie Li, Shiyu Zhou, Bo Xu
We investigate different factors of SAM.
Automatic Speech Recognition (ASR) +1
no code implementations • 28 Oct 2019 • Zhiyun Fan, Shiyu Zhou, Bo Xu
The unsupervised pre-training is performed on the AISHELL-2 dataset, and we apply the pre-trained model to multiple paired-data ratios of AISHELL-1 and HKUST.
Automatic Speech Recognition (ASR) +5
no code implementations • 31 Jan 2019 • Raed Kontar, Garvesh Raskutti, Shiyu Zhou
The proposed method has excellent scalability when the number of outputs is large and minimizes the negative transfer of knowledge between uncorrelated outputs.
no code implementations • 17 Jun 2018 • Linhao Dong, Shiyu Zhou, Wei Chen, Bo Xu
End-to-end models have been showing superiority in Automatic Speech Recognition (ASR).
Automatic Speech Recognition (ASR) +3
no code implementations • 12 Jun 2018 • Shiyu Zhou, Shuang Xu, Bo Xu
Experiments on CALLHOME datasets demonstrate that the multilingual ASR Transformer with the language symbol at the end performs better and can obtain a 10.5% relative reduction in average word error rate (WER) compared to SHL-MLSTM with residual learning.
Automatic Speech Recognition (ASR) +3
no code implementations • 16 May 2018 • Shiyu Zhou, Linhao Dong, Shuang Xu, Bo Xu
Experiments on HKUST datasets demonstrate that lexicon-free modeling units can outperform lexicon-related modeling units in terms of character error rate (CER).
Automatic Speech Recognition (ASR) +4
1 code implementation • 28 Apr 2018 • Shiyu Zhou, Linhao Dong, Shuang Xu, Bo Xu
Furthermore, we compare a syllable-based model with a context-independent phoneme (CI-phoneme) based model using the Transformer in Mandarin Chinese.
Automatic Speech Recognition (ASR) +7