no code implementations • 11 Dec 2023 • Xincheng Yu, Dongyue Guo, Jianwei Zhang, Yi Lin
Radio speech echo is a specific phenomenon in the air traffic control (ATC) domain, which degrades speech quality and further impacts automatic speech recognition (ASR) accuracy.
Automatic Speech Recognition (ASR)
no code implementations • 1 Nov 2023 • Wenjie Ou, Dongyue Guo, Zheng Zhang, Zhishuo Zhao, Yi Lin
We present WinNet, a highly accurate and simply structured CNN-based model for long-term time series forecasting. It comprises (i) an Inter-Intra Period Encoder (I2PE), which transforms a 1D sequence into a 2D tensor with long and short periodicity according to a predefined periodic window; (ii) Two-Dimensional Period Decomposition (TDPD), which models the period-trend and oscillation terms; and (iii) a Decomposition Correlation Block (DCB), which leverages the correlations between the period-trend and oscillation terms to support CNN-based prediction.
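The idea of turning a 1D sequence into a 2D tensor using a predefined periodic window can be illustrated with a small sketch. This is not the authors' I2PE implementation: the function names `fold_by_period` and `transpose`, the choice to drop an incomplete trailing period, and the `intra`/`inter` labels are all assumptions made for illustration.

```python
def fold_by_period(series, window):
    """Fold a 1D sequence into rows of length `window` (one row per period)."""
    if window <= 0:
        raise ValueError("window must be positive")
    # Assumption: an incomplete trailing period is simply dropped.
    usable = len(series) - len(series) % window
    return [series[i:i + window] for i in range(0, usable, window)]


def transpose(rows):
    """Swap the two axes so each row collects one phase across all periods."""
    return [list(col) for col in zip(*rows)]


series = list(range(12))            # toy input with 12 time steps
intra = fold_by_period(series, 4)   # rows = periods, columns = position within a period
inter = transpose(intra)            # rows = position within a period, columns = periods
```

In this toy layout, reading along a row of `intra` follows short-range variation inside one period, while reading along a row of `inter` follows the same phase across successive periods. A 2D convolution over such a tensor would see both directions at once, which is one plausible reading of why a CNN suits the 2D form.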
no code implementations • 2 May 2023 • Dongyue Guo, Jianwei Zhang, Yi Lin
A major reason is that spoken instructions and flight trajectories are presented in different modalities in the current air traffic control (ATC) system, which makes it very difficult to account for maneuvering instructions in flight trajectory prediction (FTP) tasks.
no code implementations • 2 May 2023 • Dongyue Guo, Zheng Zhang, Zhen Yan, Jianwei Zhang, Yi Lin
Flight Trajectory Prediction (FTP) is an essential task in Air Traffic Control (ATC), which can assist air traffic controllers in managing airspace more safely and efficiently.
no code implementations • 4 Nov 2021 • Peng Fan, Dongyue Guo, Yi Lin, Bo Yang, Jianwei Zhang
In this work, we propose a new automatic speech recognition (ASR) system based on feature learning and an end-to-end training procedure for air traffic control (ATC) systems.
Automatic Speech Recognition (ASR)
no code implementations • 3 Nov 2021 • Dongyue Guo, Jianwei Zhang, Bo Yang, Yi Lin
Most importantly, a multi-modal speaker role identification network (MMSRINet) is designed to perform the speaker role identification (SRI) task by considering both speech and textual modality features.
no code implementations • 17 Feb 2021 • Yi Lin, Bo Yang, Linchao Li, Dongyue Guo, Jianwei Zhang, Hu Chen, Yi Zhang
Finally, by integrating the SRL with ASR, an end-to-end multilingual ASR framework is formulated in a supervised manner, which is able to translate the raw wave into text in one model, i.e., wave-to-text.
Automatic Speech Recognition (ASR)