no code implementations • 8 Apr 2024 • Zhipeng Zhang, Zhimin Wei, Guolei Sun, Peng Wang, Luc van Gool
In the field of visual affordance learning, previous methods mainly used abundant images or videos that delineate human behavior patterns to identify action possibility regions for object manipulation, with a variety of applications in robotic tasks.
no code implementations • 8 Mar 2024 • Liting Lin, Heng Fan, Zhipeng Zhang, YaoWei Wang, Yong Xu, Haibin Ling
The shared embeddings, which describe the absolute coordinates of multi-resolution images (namely, the template and search images), are inherited from the pre-trained backbones.
1 code implementation • 6 Mar 2024 • Liang Peng, Junyuan Gao, Xinran Liu, Weihong Li, Shaohua Dong, Zhipeng Zhang, Heng Fan, Libo Zhang
The rich annotations of VastTrack enable the development of both vision-only and vision-language tracking.
no code implementations • 3 Feb 2024 • Zixiang Zhao, Lilun Deng, Haowen Bai, Yukun Cui, Zhipeng Zhang, Yulun Zhang, Haotong Qin, Dongdong Chen, Jiangshe Zhang, Peng Wang, Luc van Gool
Therefore, we introduce a novel fusion paradigm named image Fusion via vIsion-Language Model (FILM), for the first time, utilizing explicit textual information in different source images to guide image fusion.
1 code implementation • 20 Jan 2024 • Du Chen, Yi Huang, Xiaopu Li, Yongqiang Li, Yongqiang Liu, Haihui Pan, Leichao Xu, Dacheng Zhang, Zhipeng Zhang, Kun Han
In this study, we introduce Orion-14B, a collection of multilingual large language models with 14 billion parameters.
no code implementations • 20 Jul 2023 • Zhipeng Zhang, Piao Tong, Yingwei Ma, Qiao Liu, Xujiang Liu, Xu Luo
Furthermore, we introduce a novel Decoupled Contrastive Learning method to enhance the effectiveness of the language representation.
1 code implementation • 19 Jul 2023 • Mingzhe Guo, Zhipeng Zhang, Liping Jing, Haibin Ling, Heng Fan
To thoroughly demonstrate the effectiveness of our method, we integrate the proposed framework into three tracking methods with different designs, i.e., the CNN-based SiamCAR, the Transformer-based OSTrack, and the hybrid structure TransT.
no code implementations • 20 Mar 2023 • Zhenyu Li, Zhipeng Zhang, Heng Fan, Yuan He, Ke Wang, Xianming Liu, Junjun Jiang
In this paper, we improve the challenging monocular 3D object detection problem with a general semi-supervised framework.
no code implementations • CVPR 2023 • Weiming Bai, Yufan Liu, Zhipeng Zhang, Bing Li, Weiming Hu
Observing that face manipulation may alter the relation between different facial action units (AU), we propose the Action Units Relation Learning framework to improve the generality of forgery detection.
no code implementations • 14 Nov 2022 • Linfeng Zhang, Yukang Shi, Hung-Shuo Tai, Zhipeng Zhang, Yuan He, Ke Wang, Kaisheng Ma
Detecting 3D objects from multi-view images is a fundamental problem in 3D computer vision.
no code implementations • 31 Jul 2022 • Zhipeng Zhang, Zhimin Wei, Zhongzhen Huang, Rui Niu, Peng Wang
However, one unsolved issue of these models is that the number of reasoning steps needs to be pre-defined and fixed before inference, ignoring the varying complexity of expressions.
1 code implementation • 3 Jul 2022 • Mingzhe Guo, Zhipeng Zhang, Heng Fan, Liping Jing
By revealing the potential of VL representation, we expect the community to divert more attention to VL tracking and hope to open more possibilities for future tracking beyond Transformer.
no code implementations • 7 May 2022 • Zhipeng Zhang, Xinglin Hou, Kai Niu, Zhongzhen Huang, Tiezheng Ge, Yuning Jiang, Qi Wu, Peng Wang
Therefore, we present a dataset, E-MMAD (e-commercial multimodal multi-structured advertisement copywriting), which requires and supports much more detailed information in text generation.
no code implementations • 7 Jan 2022 • Mingzhe Guo, Zhipeng Zhang, Heng Fan, Liping Jing, Yilin Lyu, Bing Li, Weiming Hu
The proposed GIM module and InBN mechanism are general and applicable to different backbone types including CNN and Transformer for improvements, as evidenced by our extensive experiments on multiple benchmarks.
1 code implementation • 2 Dec 2021 • Liting Lin, Heng Fan, Zhipeng Zhang, Yong Xu, Haibin Ling
The potential of Transformer in representation learning remains under-explored.
Ranked #9 on Visual Object Tracking on TrackingNet
2 code implementations • 11 Aug 2021 • Xiao Wang, Jianing Li, Lin Zhu, Zhipeng Zhang, Zhe Chen, Xin Li, YaoWei Wang, Yonghong Tian, Feng Wu
Different from visible cameras which record intensity images frame by frame, the biologically inspired event camera produces a stream of asynchronous and sparse events with much lower latency.
Ranked #1 on Object Tracking on VisEvent
1 code implementation • ICCV 2021 • Zhipeng Zhang, Yihao Liu, Xiao Wang, Bing Li, Weiming Hu
Siamese tracking has achieved groundbreaking performance in recent years, where the essence is the efficient matching operator cross-correlation and its variants.
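As a rough illustration of the cross-correlation matching operator mentioned above, the sketch below slides a small template over a larger search region and scores each position by a dot product. It is a minimal, generic example on plain 2D lists, not the implementation from this paper; the function names `cross_correlate` and `best_match` and the toy inputs are assumptions made for illustration.

```python
def cross_correlate(search, template):
    """Slide `template` over `search` and return the map of dot-product scores."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    response = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            # Score of the window whose top-left corner is (i, j).
            score = sum(
                search[i + di][j + dj] * template[di][dj]
                for di in range(th)
                for dj in range(tw)
            )
            row.append(score)
        response.append(row)
    return response


def best_match(response):
    """Return the (row, col) position with the highest response."""
    positions = (
        (i, j) for i in range(len(response)) for j in range(len(response[0]))
    )
    return max(positions, key=lambda ij: response[ij[0]][ij[1]])


# Hypothetical toy data: the template appears at offset (1, 1) in the search region.
template = [[1, 2], [3, 4]]
search = [
    [0, 0, 0, 0],
    [0, 1, 2, 0],
    [0, 3, 4, 0],
    [0, 0, 0, 0],
]
response = cross_correlate(search, template)
peak = best_match(response)
```

In Siamese trackers this operation is typically applied to deep feature maps rather than raw values, with the peak of the response map indicating the likely target location.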
no code implementations • 1 Aug 2021 • Yihao Liu, Anran Liu, Jinjin Gu, Zhipeng Zhang, Wenhao Wu, Yu Qiao, Chao Dong
We show that a well-trained deep SR network is naturally a good descriptor of degradation information.
1 code implementation • 19 Apr 2021 • Chao Liang, Zhipeng Zhang, Xue Zhou, Bing Li, Weiming Hu
Eventually, it helps to reload the "fake background" and repair the broken tracklets.
2 code implementations • CVPR 2021 • Xiao Wang, Xiujun Shu, Zhipeng Zhang, Bo Jiang, YaoWei Wang, Yonghong Tian, Feng Wu
We believe this benchmark will greatly boost related research on natural language guided tracking.
Ranked #3 on Visual Object Tracking on TNL2K (precision metric)
no code implementations • ICCV 2021 • Xiuli Bi, Zhipeng Zhang, Bin Xiao
For detecting the tampered regions, a forgery localization generator GM is proposed based on a multi-decoder-single-task strategy.
4 code implementations • 23 Oct 2020 • Chao Liang, Zhipeng Zhang, Xue Zhou, Bing Li, Shuyuan Zhu, Weiming Hu
However, the inherent differences and relations between detection and re-identification (ReID) are unconsciously overlooked because of treating them as two isolated tasks in the one-shot tracking paradigm.
Ranked #1 on Multi-Object Tracking on HiEve (using extra training data)
1 code implementation • 6 Aug 2020 • Zhipeng Zhang, Bing Li, Weiming Hu, Houwen Peng
We first build a look-up-table (LUT) with the ground-truth mask in the starting frame, and then retrieve the LUT to obtain an attention map for spatial constraints.
4 code implementations • ECCV 2020 • Zhipeng Zhang, Houwen Peng, Jianlong Fu, Bing Li, Weiming Hu
In this paper, we propose a novel object-aware anchor-free network to address this issue.
Ranked #2 on Visual Object Tracking on VOT2019
5 code implementations • CVPR 2019 • Zhipeng Zhang, Houwen Peng
Siamese networks have drawn great attention in visual tracking because of their balanced accuracy and speed.
Ranked #2 on Visual Object Tracking on VOT2017