1 code implementation • 19 Mar 2024 • Pengfei Zhu, Yang Sun, Bing Cao, QinGhua Hu
These adapters are shared across different tasks and constrained by mutual information regularization, ensuring compatibility with different tasks while preserving complementarity for multi-source images.
1 code implementation • 17 Dec 2023 • Bing Cao, Junliang Guo, Pengfei Zhu, QinGhua Hu
To handle this problem, we propose a novel multi-modal visual prompt tracking model based on a universal bi-directional adapter, cross-prompting multiple modalities mutually.
Ranked #6 on RGB-T Tracking on LasHeR
1 code implementation • 26 Nov 2023 • Yichen Bai, Zongbo Han, Changqing Zhang, Bing Cao, Xiaoheng Jiang, QinGhua Hu
Out-of-distribution (OOD) detection methods often exploit auxiliary outliers to train a model to identify OOD samples, in particular by mining challenging outliers from an auxiliary outlier dataset to improve OOD detection.
no code implementations • 30 Aug 2023 • Jun Li, Jingjian Wang, Hongwei Wang, Xing Deng, Jielong Chen, Bing Cao, Zekun Wang, Guanjie Xu, Ge Zhang, Feng Shi, Hualei Liu
(ii) Integrate Network (IN) builds a new integrated sequence by applying spatial-temporal interaction to MSS, and captures a comprehensive spatial-temporal representation by modeling the integrated sequence with a complex attention mechanism.
1 code implementation • IEEE Transactions on Circuits and Systems for Video Technology 2023 • Guanlin Chen, Pengfei Zhu, Bing Cao, Xing Wang, QinGhua Hu
During the tracking process, a cross-drone mapping mechanism is proposed that uses the surrounding information of a drone with promising tracking status as a reference, helping drones that have lost their targets to re-calibrate and enabling real-time cross-drone information interaction.
1 code implementation • ICCV 2023 • Yiming Sun, Bing Cao, Pengfei Zhu, QinGhua Hu
The MoLE performs specialized learning of multi-modal local features, prompting the fused images to retain the local information in a sample-adaptive manner, while the MoGE focuses on the global information that complements the fused image with overall texture detail and contrast.
1 code implementation • 13 Dec 2022 • Qinghe Wang, Lijie Liu, Miao Hua, Pengfei Zhu, WangMeng Zuo, QinGhua Hu, Huchuan Lu, Bing Cao
We blend the semantic layouts of the source head and source body, and then inpaint the transition region with the semantic layout generator, achieving coarse-grained head swapping.
1 code implementation • ACMMM 2022 • Yiming Sun, Bing Cao, Pengfei Zhu, QinGhua Hu
We cascade the image fusion network with the detection networks of both modalities and use the detection loss of the fused images to provide guidance on task-related information for the optimization of the image fusion network.
1 code implementation • CVPR 2022 • Zhengyao Lv, Xiaoming Li, Zhenxing Niu, Bing Cao, WangMeng Zuo
A fine-grained part-level semantic layout benefits the generation of object details, and it can be roughly inferred from an object's shape.
1 code implementation • 19 Mar 2022 • Junwen Pan, Pengfei Zhu, Kaihua Zhang, Bing Cao, Yu Wang, Dingwen Zhang, Junwei Han, QinGhua Hu
Semantic segmentation with limited annotations, such as weakly supervised semantic segmentation (WSSS) and semi-supervised semantic segmentation (SSSS), is a challenging task that has attracted much attention recently.
Ranked #34 on Weakly-Supervised Semantic Segmentation on COCO 2014 val
no code implementations • 25 May 2020 • Bing Cao, Nannan Wang, Xinbo Gao, Jie Li, Zhifeng Li
Heterogeneous face recognition (HFR) refers to matching face images acquired from different domains, and it has wide applications in security scenarios.
2 code implementations • 5 Mar 2020 • Yiming Sun, Bing Cao, Pengfei Zhu, QinGhua Hu
To address this dilemma, we further propose an uncertainty-aware cross-modality vehicle detection (UA-CMDet) framework that extracts complementary information from cross-modal images and can significantly improve detection performance in low-light conditions.