1 code implementation • 1 Apr 2024 • Jing Hao, Lei He, Kuo Feng Hung
To address this issue, we propose T-Mamba, which integrates shared positional encoding and frequency-based features into Vision Mamba to overcome its limitations in spatial position preservation and feature enhancement in the frequency domain.
1 code implementation • 27 Jan 2024 • Jing Hao, Moyun Liu, Kuo Feng Hung
To segment glass surfaces with higher accuracy, we make full use of two visual foundation models: Segment Anything (SAM) and Stable Diffusion. Specifically, we devise a simple glass surface segmentor named GEM, which consists only of a SAM backbone, a simple feature pyramid, a discerning query selection module, and a mask decoder.
no code implementations • 18 Sep 2023 • ZiMing Wang, Shumin Han, Xiaodi Wang, Jing Hao, Xianbin Cao, Baochang Zhang
Masked image modeling (MIM) methods achieve great success in various visual tasks but remain largely unexplored in knowledge distillation for heterogeneous deep models.
no code implementations • 22 Jul 2023 • Jing Hao, Jingming Xie, Jinyuan Zhang, Moyun Liu
However, fisheye images suffer from severe geometric distortion, which may interfere with image registration and stitching.
no code implementations • 22 Jul 2023 • Yuwen Zhai, Jing Hao, Liang Gao, Xinyu Li, Yiping Gao, Shumin Han
Hybrid models that combine self-attention and convolution are one way to make ViT more lightweight.
no code implementations • 7 Apr 2023 • Jing Hao, Song Chen, Xiaodi Wang, Shumin Han
Pretraining on large-scale datasets can boost the performance of object detectors, but annotated datasets for object detection are hard to scale up due to the high labor cost.
no code implementations • 18 Oct 2021 • Moyun Liu, Jingming Xie, Jing Hao, Yang Zhang, Xuzhan Chen, Youping Chen
Based on the SCE module, a narrow network is designed for the final recognition of weld information.
no code implementations • ICCV 2021 • Jing Hao, Zhixin Zhang, Shicai Yang, Di Xie, ShiLiang Pu
Nowadays, advanced image editing tools and technical skills produce increasingly realistic tampered images, which can easily evade image forensic systems and make image authenticity verification more difficult.
no code implementations • 21 Nov 2019 • Jiaxu Chen, Jing Hao, Kai Chen, Di Xie, Shicai Yang, ShiLiang Pu
This paper introduces an end-to-end audio classification system based on raw waveforms and a mix-training strategy.