1 code implementation • 18 Apr 2024 • Jin Gao, Shubo Lin, Shaoru Wang, Yutong Kou, Zeming Li, Liang Li, Congxuan Zhang, Xiaoqin Zhang, Yizheng Wang, Weiming Hu
In this paper, we ask whether the fine-tuning performance of extremely simple lightweight ViTs can also benefit from this pre-training paradigm, which has been studied considerably less than the well-established lightweight architecture design methodology.
no code implementations • 13 Jul 2022 • Shaoru Wang, Zeming Li, Jin Gao, Liang Li, Weiming Hu
However, when facing various resource budgets in real-world applications, pre-training multiple networks of various sizes one by one incurs a huge computational burden.
1 code implementation • 12 Jun 2022 • Shaoru Wang, Jin Gao, Bing Li, Weiming Hu
Experiments for both synthesized and real-world scenarios consistently demonstrate the effectiveness of our approach, e.g., our method increases the degraded performance of the FCOS detector from 33.6% AP to 35.6% AP on COCO.
2 code implementations • 28 May 2022 • Shaoru Wang, Jin Gao, Zeming Li, Xiaoqin Zhang, Weiming Hu
We also point out some defects of such pre-training, e.g., failing to benefit from large-scale pre-training data and showing inferior performance on data-insufficient downstream tasks.
1 code implementation • CVPR 2022 • Zongyang Ma, Guan Luo, Jin Gao, Liang Li, Yuxin Chen, Shaoru Wang, Congxuan Zhang, Weiming Hu
Open-vocabulary object detection aims to detect novel object categories beyond the training set.
Ranked #30 on Open Vocabulary Object Detection on MSCOCO
no code implementations • 6 May 2021 • Zhenbang Li, Yaya Shi, Jin Gao, Shaoru Wang, Bing Li, Pengpeng Liang, Weiming Hu
In this paper, we show the existence of universal perturbations that can enable the targeted attack, e.g., forcing a tracker to follow the ground-truth trajectory with specified offsets, to be video-agnostic and free from inference in a network.
1 code implementation • 28 Apr 2021 • Li Yang, Yan Xu, Shaoru Wang, Chunfeng Yuan, Ziqi Zhang, Bing Li, Weiming Hu
However, the most suitable positions for inferring different targets, i.e., the object category and boundaries, are generally different.
1 code implementation • 11 Dec 2019 • Shaoru Wang, Yongchao Gong, Junliang Xing, Lichao Huang, Chang Huang, Weiming Hu
To reciprocate these two tasks, we design a two-stream structure to learn features on both the object level (i.e., bounding boxes) and the pixel level (i.e., instance masks) jointly.
Ranked #95 on Instance Segmentation on COCO test-dev