no code implementations • 21 Feb 2024 • Ziyi Guan, Hantao Huang, Yupeng Su, Hong Huang, Ngai Wong, Hao Yu
Large Language Models (LLMs) have greatly advanced the natural language processing paradigm.
no code implementations • 30 Oct 2022 • Bozhong Liu, Xiaoxi Yu, Hantao Huang
Acoustic echo cancellation (AEC) is designed to remove echoes, reverberation, and unwanted added sounds from the microphone signal while maintaining the quality of the near-end speaker's speech.
no code implementations • 25 Oct 2021 • Wei Han, Hantao Huang, Xiaoxi Yu
Holistic object representation-based trackers suffer a performance drop under large appearance changes such as deformation and occlusion.
no code implementations • 2 Jul 2021 • Tao Han, Hantao Huang, Ziang Yang, Wei Han
Neural network based speech recognition systems suffer from performance degradation due to accented speech, especially unfamiliar accents.
no code implementations • 15 Jun 2021 • Po-Yu Chen, Hao Chen, Yi-Min Tsai, Hsien-Kai Kuo, Hantao Huang, Hsin-Hung Chen, Sheng-Hong Yan, Wei-Lun Ou, Chia-Ming Cheng
In the proposed framework, Deep Neural Networks (DNNs) are used to learn the characteristics of the PAs, while corresponding Digital Pre-Distortions (DPDs) are also learned to compensate for the nonlinear and memory effects of the PAs.
no code implementations • 17 Oct 2020 • Hantao Huang, Tao Han, Wei Han, Deep Yap, Cheng-Ming Chiang
From the human perspective, to answer a visual question, one needs to read the question and then refer to the image to generate an answer.
no code implementations • COLING 2020 • Wei Han, Hantao Huang, Tao Han
Positional information of text is underused and there is a lack of evidence for the generated answer.
Optical Character Recognition (OCR)