Search Results for author: Zhipeng Zhang

Found 17 papers, 10 papers with code

Augment and Criticize: Exploring Informative Samples for Semi-Supervised Monocular 3D Object Detection

no code implementations20 Mar 2023 Zhenyu Li, Zhipeng Zhang, Heng Fan, Yuan He, Ke Wang, Xianming Liu, Junjun Jiang

In this paper, we improve the challenging monocular 3D object detection problem with a general semi-supervised framework.

One for All: One-stage Referring Expression Comprehension with Dynamic Reasoning

no code implementations31 Jul 2022 Zhipeng Zhang, Zhimin Wei, Zhongzhen Huang, Rui Niu, Peng Wang

However, one unsolved issue of these models is that the number of reasoning steps needs to be pre-defined and fixed before inference, ignoring the varying complexity of expressions.

Referring Expression, Referring Expression Comprehension, +2
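The snippet above notes that prior models fix the number of reasoning steps before inference, ignoring how complex each expression is. A toy sketch of the dynamic alternative (the function names and the confidence-threshold halting rule are illustrative assumptions, not the paper's actual mechanism): keep refining until a confidence score crosses a threshold, so easy inputs stop early and hard ones run longer.

```python
# Hedged sketch of dynamic reasoning with a halting condition.
# `score_step`, `threshold`, and the halting rule are assumptions
# made for illustration; the paper's model is not reproduced here.
def dynamic_reasoning(score_step, max_steps=10, threshold=0.9):
    """Run reasoning steps until confidence >= threshold; return (steps, confidence)."""
    confidence, steps = 0.0, 0
    while confidence < threshold and steps < max_steps:
        confidence = score_step(steps, confidence)
        steps += 1
    return steps, confidence

# A "simple expression" gains confidence quickly; a "complex" one slowly.
easy = lambda step, c: c + 0.5
hard = lambda step, c: c + 0.2
```

With a fixed step count, both inputs would pay the same cost; here the easy case halts after two steps while the hard case takes five.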

Divert More Attention to Vision-Language Tracking

1 code implementation3 Jul 2022 Mingzhe Guo, Zhipeng Zhang, Heng Fan, Liping Jing

By revealing the potential of VL representation, we expect the community to divert more attention to VL tracking and hope to open more possibilities for future tracking beyond Transformer.

Object Tracking

Attract me to Buy: Advertisement Copywriting Generation with Multimodal Multi-structured Information

no code implementations7 May 2022 Zhipeng Zhang, Xinglin Hou, Kai Niu, Zhongzhen Huang, Tiezheng Ge, Yuning Jiang, Qi Wu, Peng Wang

Therefore, we present a dataset, E-MMAD (e-commercial multimodal multi-structured advertisement copywriting), which requires and supports much more detailed information in text generation.

Text Generation, Video Captioning

Learning Target-aware Representation for Visual Tracking via Informative Interactions

no code implementations7 Jan 2022 Mingzhe Guo, Zhipeng Zhang, Heng Fan, Liping Jing, Yilin Lyu, Bing Li, Weiming Hu

The proposed GIM module and InBN mechanism are general and applicable to different backbone types, including CNN and Transformer, as evidenced by our extensive experiments on multiple benchmarks.

Representation Learning, Visual Tracking

VisEvent: Reliable Object Tracking via Collaboration of Frame and Event Flows

2 code implementations11 Aug 2021 Xiao Wang, Jianing Li, Lin Zhu, Zhipeng Zhang, Zhe Chen, Xin Li, YaoWei Wang, Yonghong Tian, Feng Wu

Different from visible cameras which record intensity images frame by frame, the biologically inspired event camera produces a stream of asynchronous and sparse events with much lower latency.

Object Tracking
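The snippet above contrasts frame-by-frame intensity images with the asynchronous, sparse event stream of an event camera. A minimal sketch (not from the paper; the `(x, y, t, polarity)` event format and window length are illustrative assumptions) of how such a stream is commonly accumulated into a dense frame so it can be fused with ordinary images:

```python
# Hedged sketch: accumulate signed event polarities within a time
# window into a 2D frame. Event tuples and units are assumptions.
import numpy as np

def events_to_frame(events, height, width, t_start, t_end):
    """Sum event polarities falling in [t_start, t_end) into a 2D frame."""
    frame = np.zeros((height, width), dtype=np.float32)
    for x, y, t, polarity in events:
        if t_start <= t < t_end:
            frame[y, x] += 1.0 if polarity > 0 else -1.0
    return frame

# Toy stream: two positive events at (x=2, y=1), one negative at (0, 0),
# and one event outside the accumulation window.
events = [(2, 1, 0.001, +1), (2, 1, 0.004, +1), (0, 0, 0.002, -1), (3, 3, 0.020, +1)]
frame = events_to_frame(events, height=4, width=4, t_start=0.0, t_end=0.01)
```

The sparsity is visible in the result: only pixels that fired inside the window are non-zero, which is why event data pairs naturally with, rather than replaces, visible-light frames.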

Learn to Match: Automatic Matching Network Design for Visual Tracking

1 code implementation ICCV 2021 Zhipeng Zhang, Yihao Liu, Xiao Wang, Bing Li, Weiming Hu

Siamese tracking has achieved groundbreaking performance in recent years, and its essence is an efficient matching operator: cross-correlation and its variants.

Visual Tracking
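The snippet above names cross-correlation as the core matching operator in Siamese tracking. A minimal NumPy sketch of that operator (shapes and names are illustrative; real trackers apply it on deep feature maps, not raw pixels): slide the template over the search region and record the inner-product response at each offset.

```python
# Hedged sketch of the cross-correlation matching operator.
import numpy as np

def cross_correlation(search, template):
    """Response map of `template` slid over `search` (valid positions only)."""
    sh, sw = search.shape
    th, tw = template.shape
    out = np.zeros((sh - th + 1, sw - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(search[i:i + th, j:j + tw] * template)
    return out

template = np.array([[1.0, 1.0], [1.0, 1.0]])
search = np.zeros((4, 4))
search[1:3, 1:3] = 1.0  # target placed at offset (1, 1)
response = cross_correlation(search, template)
```

The response map peaks where the template best matches the search region, which is how a Siamese tracker localizes the target from frame to frame.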

One More Check: Making "Fake Background" Be Tracked Again

1 code implementation19 Apr 2021 Chao Liang, Zhipeng Zhang, Xue Zhou, Bing Li, Weiming Hu

Eventually, it helps to reload the "fake background" and repair the broken tracklets.

Association, Motion Forecasting, +3

Reality Transform Adversarial Generators for Image Splicing Forgery Detection and Localization

no code implementations ICCV 2021 Xiuli Bi, Zhipeng Zhang, Bin Xiao

For detecting the tampered regions, a forgery localization generator GM is proposed based on a multi-decoder-single-task strategy.

Style Transfer

Rethinking the competition between detection and ReID in Multi-Object Tracking

4 code implementations23 Oct 2020 Chao Liang, Zhipeng Zhang, Xue Zhou, Bing Li, Shuyuan Zhu, Weiming Hu

However, the inherent differences and relations between detection and re-identification (ReID) are overlooked because the one-shot tracking paradigm treats them as two isolated tasks.

Ranked #1 on Multi-Object Tracking on HiEve (using extra training data)

Association, Multi-Object Tracking

Towards Accurate Pixel-wise Object Tracking by Attention Retrieval

1 code implementation6 Aug 2020 Zhipeng Zhang, Bing Li, Weiming Hu, Houwen Peng

We first build a look-up table (LUT) with the ground-truth mask in the starting frame, and then retrieve the LUT to obtain an attention map for spatial constraints.

Object Tracking, Retrieval
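The snippet above describes building a LUT from the first-frame ground-truth mask and retrieving it to get an attention map. A hedged, much-simplified sketch of that idea (the per-intensity-bin LUT and the bin count are illustrative assumptions; the paper operates on learned features, not grey levels): store a foreground probability per quantized value, then index the LUT with a later frame to retrieve per-pixel attention.

```python
# Hedged sketch of LUT-based attention retrieval. The quantization
# scheme is an assumption made for illustration only.
import numpy as np

BINS = 16  # number of quantization levels for the LUT (assumption)

def build_lut(first_frame, mask):
    """Foreground probability per quantized intensity bin of the first frame."""
    bins = (first_frame * (BINS - 1)).astype(int)
    lut = np.zeros(BINS)
    for b in range(BINS):
        sel = bins == b
        lut[b] = mask[sel].mean() if sel.any() else 0.0
    return lut

def retrieve_attention(frame, lut):
    """Look up each pixel's bin in the LUT to form an attention map."""
    bins = (frame * (BINS - 1)).astype(int)
    return lut[bins]  # same shape as `frame`

# Toy example: bright pixels are foreground in the starting frame.
first = np.array([[0.9, 0.1], [0.9, 0.1]])
mask = np.array([[1.0, 0.0], [1.0, 0.0]])
lut = build_lut(first, mask)
att = retrieve_attention(np.array([[0.9, 0.1]]), lut)
```

Because the LUT is built once from the starting frame and only indexed afterwards, retrieval stays cheap at tracking time, which matches the motivation of using it as a spatial constraint.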
