no code implementations • 20 Mar 2024 • Xinyu Geng, JiaMing Wang, Jiawei Gong, Yuerong Xue, Jun Xu, Fanglin Chen, Xiaolin Huang
Redundancy is a persistent challenge in Capsule Networks (CapsNet), leading to high computational costs and parameter counts.
no code implementations • 13 Feb 2024 • Ziyang Ma, Guanrou Yang, Yifan Yang, Zhifu Gao, JiaMing Wang, Zhihao Du, Fan Yu, Qian Chen, Siqi Zheng, Shiliang Zhang, Xie Chen
We found that delicate designs are not necessary: an embarrassingly simple composition of an off-the-shelf speech encoder, an LLM, and a linear projector as the only trainable component is competent for the ASR task.
Automatic Speech Recognition (ASR) +1
no code implementations • 14 Nov 2023 • JiaMing Wang, Harold Soh
To advance the field of autonomous robotics, particularly in object search tasks within unexplored environments, we introduce a novel framework centered around the Probable Object Location (POLo) score.
1 code implementation • 7 Oct 2023 • JiaMing Wang, Zhihao Du, Qian Chen, Yunfei Chu, Zhifu Gao, Zerui Li, Kai Hu, Xiaohuan Zhou, Jin Xu, Ziyang Ma, Wen Wang, Siqi Zheng, Chang Zhou, Zhijie Yan, Shiliang Zhang
In this paper, we propose LauraGPT, a unified GPT model for audio recognition, understanding, and generation.
no code implementations • 29 Aug 2023 • JiaMing Wang, Jiqian Dong, Sikai Chen, Shreyas Sundaram, Samuel Labi
In the first component of the framework, we develop a realistic reinforcement learning environment, termed "ChargingEnv", which incorporates a reliable charging simulation system that accounts for common practical issues in wireless charging deployment, specifically charging panel misalignment.
1 code implementation • 24 Aug 2023 • Wenyu Zhu, Hao Wang, Yuchen Zhou, JiaMing Wang, Zihan Sha, Zeyu Gao, Chao Zhang
By feeding explicit knowledge as additional inputs to the Transformer, and fusing implicit knowledge with a novel pre-training task, kTrans provides a new perspective on incorporating domain knowledge into a Transformer framework.
1 code implementation • 18 May 2023 • Zhifu Gao, Zerui Li, JiaMing Wang, Haoneng Luo, Xian Shi, Mengzhe Chen, Yabin Li, Lingyun Zuo, Zhihao Du, Zhangyu Xiao, Shiliang Zhang
FunASR offers models trained on large-scale industrial corpora and the ability to deploy them in applications.
Ranked #1 on Speech Recognition on WenetSpeech (using extra training data)
1 code implementation • 8 Mar 2023 • JiaMing Wang, Zhihao Du, Shiliang Zhang
Recently, end-to-end neural diarization (EEND) has been introduced and has achieved promising results in speaker-overlapped scenarios.
Ranked #1 on Speaker Diarization on CALLHOME
1 code implementation • 29 Nov 2022 • Xiaohuan Zhou, JiaMing Wang, Zeyu Cui, Shiliang Zhang, Zhijie Yan, Jingren Zhou, Chang Zhou
Therefore, we propose to introduce the phoneme modality into pre-training, which can help capture modality-invariant information between Mandarin speech and text.
Ranked #2 on Speech Recognition on AISHELL-1
Automatic Speech Recognition (ASR) +2
no code implementations • 16 Sep 2021 • Yuanzhi Wang, Tao Lu, Yanduo Zhang, Junjun Jiang, JiaMing Wang, Zhongyuan Wang, Jiayi Ma
Recently, face super-resolution (FSR) methods either feed the whole face image into convolutional neural networks (CNNs) or utilize extra facial priors (e.g., facial parsing maps, facial landmarks) to focus on facial structure, thereby maintaining the consistency of the facial structure while restoring facial details.
1 code implementation • 5 Jun 2021 • Zhenfeng Shao, JiaMing Wang, Lianbing Deng, Xiao Huang, Tao Lu, Fang Luo, Ruiqian Zhang, Xianwei Lv, Chaoya Dang, Qing Ding, Zhiqiang Wang
In this paper, we introduce a challenging global large-scale ship database (called GLSD), designed specifically for ship detection tasks.
1 code implementation • 24 May 2021 • JiaMing Wang, Zhenfeng Shao, Xiao Huang, Tao Lu, Ruiqian Zhang, Jiayi Ma
Most existing deep learning-based pan-sharpening methods have several widely recognized issues, such as spectral distortion and insufficient spatial texture enhancement. We propose a novel pan-sharpening convolutional neural network based on a high-pass modification block.
no code implementations • 23 May 2021 • Zhiqiang Wang, Zhenfeng Shao, Xiao Huang, JiaMing Wang, Tao Lu, Sihang Zhang
In this study, we propose a novel HSI denoising network, termed SSCAN, that combines group convolutions and attention modules.
1 code implementation • 8 May 2021 • JiaMing Wang, Zhenfeng Shao, Tao Lu, Xiao Huang, Ruiqian Zhang, Yu Wang
Despite their success, low/high spatial resolution pairs are usually difficult to obtain from satellites with a high temporal resolution, making such SR approaches impractical to use.