1 code implementation • 18 Apr 2024 • Jin Gao, Shubo Lin, Shaoru Wang, Yutong Kou, Zeming Li, Liang Li, Congxuan Zhang, Xiaoqin Zhang, Yizheng Wang, Weiming Hu
In this paper, we ask whether the fine-tuning performance of extremely simple, small-scale ViTs can also benefit from this pre-training paradigm, a question that remains considerably less studied than the well-established methodology of designing lightweight architectures with sophisticated components.
1 code implementation • 12 Mar 2024 • Han Qiu, Jiaxing Huang, Peng Gao, Lewei Lu, Xiaoqin Zhang, Shijian Lu
Inspired by the success of general-purpose models in NLP, recent studies attempt to unify different vision tasks in the same sequence format and employ autoregressive Transformers for sequence prediction.
no code implementations • 29 Feb 2024 • Xueying Jiang, Sheng Jin, Lewei Lu, Xiaoqin Zhang, Shijian Lu
We propose SKD-WM3D, a weakly supervised monocular 3D detection framework that exploits depth information to achieve M3D with a single-view image exclusively without any 3D annotations or other training data.
no code implementations • 6 Feb 2024 • Aoran Xiao, Weihao Xuan, Heli Qi, Yun Xing, Ruijie Ren, Xiaoqin Zhang, Ling Shao, Shijian Lu
CAT-SAM freezes the entire SAM and simultaneously adapts its mask decoder and image encoder with a small number of learnable parameters.
1 code implementation • 14 Dec 2023 • Tangfei Liao, Xiaoqin Zhang, Li Zhao, Tao Wang, Guobao Xiao
Then, we model these visual cues and correspondences by a joint visual-spatial fusion module, simultaneously embedding visual cues into correspondences for pruning.
1 code implementation • 23 Nov 2023 • Jingjing Zheng, Wanglong Lu, Wenzhe Wang, Yankai Cao, Xiaoqin Zhang, Xianta Jiang
We develop a new optimization algorithm named the Alternating Proximal Multiplier Method (APMM) to iteratively solve the proposed tensor completion model.
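The snippet names the APMM solver but gives no details; proximal methods of this family are typically built from closed-form proximal operators. A minimal sketch of two standard such building blocks, entrywise soft-thresholding and singular value thresholding (the low-rank update common to proximal/ADMM-style completion solvers), is shown below; this is an illustration of the general technique, not the paper's actual algorithm:

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||.||_1: shrink entries toward zero."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svt(M, tau):
    """Singular value thresholding: proximal operator of the nuclear norm.
    Soft-thresholds the singular values, yielding a low-rank estimate."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(soft_threshold(s, tau)) @ Vt
```

An alternating proximal scheme then cycles such updates over the factor variables until convergence.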
no code implementations • 25 Sep 2023 • Ping Li, Yu Zhang, Li Yuan, Jian Zhao, Xianghua Xu, Xiaoqin Zhang
Particularly, the gradients from the segmentation model are exploited to discover the easily confused regions, where pixel-wise objects are difficult to distinguish from the background in a frame.
no code implementations • ICCV 2023 • Jiahui Zhang, Fangneng Zhan, Yingchen Yu, Kunhao Liu, Rongliang Wu, Xiaoqin Zhang, Ling Shao, Shijian Lu
However, as the pose estimator is trained only on rendered images, its estimates for real images are usually biased or inaccurate due to the domain gap between real and rendered images. This leads to poor robustness in pose estimation for real images and, further, to local minima in the joint optimization.
no code implementations • ICCV 2023 • Muyu Xu, Fangneng Zhan, Jiahui Zhang, Yingchen Yu, Xiaoqin Zhang, Christian Theobalt, Ling Shao, Shijian Lu
Neural Radiance Field (NeRF) has shown impressive performance in novel view synthesis via implicit scene representation.
1 code implementation • 31 May 2023 • Aoran Xiao, Xiaoqin Zhang, Ling Shao, Shijian Lu
We address three critical questions in this emerging research field: i) the importance and urgency of label-efficient learning in point cloud processing, ii) the subfields it encompasses, and iii) the progress achieved in this area.
no code implementations • 19 May 2023 • Jingjing Zheng, Wenzhe Wang, Xiaoqin Zhang, Xianta Jiang
This study aims to address two issues in standard tensor recovery: the over-reliance on rank estimation strategies in tensor factorization-based recovery, and the large computational cost of t-SVD-based recovery.
no code implementations • 18 Apr 2023 • Rongliang Wu, Yingchen Yu, Fangneng Zhan, Jiahui Zhang, Xiaoqin Zhang, Shijian Lu
To accommodate fair variation of plausible facial animations for the same audio, we design a transformer-based probabilistic mapping network that can model the variational facial animation distribution conditioned upon the input audio and autoregressively convert the audio signals into a facial animation sequence.
no code implementations • 22 Dec 2022 • Tao Wang, Guangpin Tao, Wanglong Lu, Kaihao Zhang, Wenhan Luo, Xiaoqin Zhang, Tong Lu
HCD consists of a hierarchical dehazing network (HDN) and a novel hierarchical contrastive loss (HCL).
no code implementations • CVPR 2023 • Gongjie Zhang, Zhipeng Luo, Zichen Tian, Jingyi Zhang, Xiaoqin Zhang, Shijian Lu
Multi-scale features have been proven highly effective for object detection but often come with huge and even prohibitive extra computation costs, especially for the recent Transformer-based detectors.
no code implementations • 4 Aug 2022 • Jiahui Zhang, Fangneng Zhan, Yingchen Yu, Rongliang Wu, Xiaoqin Zhang, Shijian Lu
In addition, stochastic noises fed to the generator are employed for unconditional detail generation, which tends to produce unfaithful details that compromise the fidelity of the generated SR image.
no code implementations • 6 Jul 2022 • Jiahui Zhang, Fangneng Zhan, Rongliang Wu, Yingchen Yu, Wenqing Zhang, Bai Song, Xiaoqin Zhang, Shijian Lu
With the feature transport plan as the guidance, a novel pose calibration technique is designed which rectifies the initially randomized camera poses by predicting relative pose transformations between the pair of rendered and real images.
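The snippet refers to a feature transport plan between rendered and real images without specifying how it is computed. One standard way to obtain a transport plan between two feature sets is entropy-regularized optimal transport solved by Sinkhorn iterations; the sketch below is an assumption-laden illustration of that general technique, not the paper's exact formulation:

```python
import numpy as np

def sinkhorn_plan(cost, reg=0.1, n_iters=500):
    """Entropy-regularized optimal transport plan between two uniform
    distributions, computed with Sinkhorn iterations.
    cost: (n, m) pairwise feature distances, e.g. rendered vs. real."""
    n, m = cost.shape
    K = np.exp(-cost / reg)                 # Gibbs kernel
    a, b = np.ones(n) / n, np.ones(m) / m   # uniform marginals
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iters):                # alternate marginal scalings
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]      # transport plan P, P >= 0
```

The resulting plan's entries indicate soft correspondences between the two feature sets, which is the kind of guidance a pose calibration step could consume.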
no code implementations • CVPR 2023 • Jingyi Zhang, Jiaxing Huang, Xiaoqin Zhang, Shijian Lu
Domain adaptive panoptic segmentation aims to mitigate data annotation challenge by leveraging off-the-shelf annotated data in one or multiple related source domains.
Ranked #2 on Domain Adaptation on Panoptic SYNTHIA-to-Cityscapes
2 code implementations • 28 May 2022 • Shaoru Wang, Jin Gao, Zeming Li, Xiaoqin Zhang, Weiming Hu
We also point out some defects of such pre-training, e.g., failing to benefit from large-scale pre-training data and showing inferior performance on data-insufficient downstream tasks.
no code implementations • 11 May 2022 • Xiaoqin Zhang, Ziwei Huang, Jingjing Zheng, Shuo Wang, Xianta Jiang
The task of grasp pattern recognition aims to derive the applicable grasp types of an object according to the visual information.
1 code implementation • 29 Mar 2022 • Zhishe Wang, Wenyu Shao, Yanlin Chen, Jiawei Xu, Xiaoqin Zhang
Existing generative adversarial fusion methods generally concatenate the source images and extract local features through convolution without considering their global characteristics, which tends to produce an unbalanced result biased towards either the infrared or the visible image.
1 code implementation • 28 Feb 2022 • Aoran Xiao, Jiaxing Huang, Dayan Guan, Xiaoqin Zhang, Shijian Lu, Ling Shao
The convergence of point cloud and DNNs has led to many deep point cloud models, largely trained under the supervision of large-scale and densely-labelled point cloud data.
no code implementations • 4 Sep 2021 • Caixia Yan, Xiaojun Chang, Minnan Luo, Huan Liu, Xiaoqin Zhang, Qinghua Zheng
To address these issues, we develop a novel Semantics-Guided Contrastive Network for ZSD, named ContrastZSD, a detection framework that first brings the contrastive learning mechanism into the realm of zero-shot detection.
Ranked #4 on Zero-Shot Object Detection on MS-COCO
no code implementations • CVPR 2023 • Jingyi Zhang, Jiaxing Huang, Zhipeng Luo, Gongjie Zhang, Xiaoqin Zhang, Shijian Lu
DA-DETR introduces a novel CNN-Transformer Blender (CTBlender) that fuses the CNN features and Transformer features ingeniously for effective feature alignment and knowledge transfer across domains.
no code implementations • 9 Feb 2021 • Linwei Ye, Mrigank Rochan, Zhi Liu, Xiaoqin Zhang, Yang Wang
In this paper, we propose a cross-modal self-attention (CMSA) module to utilize fine details of individual words and the input image or video, which effectively captures the long-range dependencies between linguistic and visual features.
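The CMSA module's internals are not given in the snippet; at its core, cross-modal attention lets features of one modality attend to the other. A minimal sketch of that core idea (not the paper's full module, which also operates over spatial feature maps) under assumed shapes:

```python
import numpy as np

def cross_modal_attention(vis, lang):
    """Minimal cross-modal attention: visual features (N, d) attend to
    word features (L, d); returns an attended language context per
    visual location, capturing long-range visual-linguistic dependencies."""
    d = vis.shape[1]
    scores = vis @ lang.T / np.sqrt(d)           # (N, L) affinities
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # softmax over words
    return attn @ lang                           # (N, d) attended features
```

In a full module the attended features would be fused back into the visual stream before segmentation.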
Ranked #5 on Referring Expression Segmentation on J-HMDB (Precision@0.9 metric)
no code implementations • 24 Sep 2020 • Caixia Yan, Xiaojun Chang, Minnan Luo, Qinghua Zheng, Xiaoqin Zhang, Zhihui Li, Feiping Nie
In this regard, a novel self-weighted robust LDA with l21-norm based pairwise between-class distance criterion, called SWRLDA, is proposed for multi-class classification especially with edge classes.
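The l21-norm (usually written l2,1) that underlies SWRLDA's robustness is simply the sum of the l2 norms of a matrix's rows; unlike the squared Frobenius criterion of classical LDA, it does not square per-sample distances, so outlying samples contribute less. A one-function sketch:

```python
import numpy as np

def l21_norm(M):
    """l2,1 norm: sum of the l2 norms of the rows of M. Rows typically
    correspond to samples (or class-pair differences), so no single
    outlying row dominates the criterion the way it would if squared."""
    return np.linalg.norm(M, axis=1).sum()
```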
no code implementations • 9 May 2019 • Xiaoqin Zhang, Yunfei Li, Huimin Ma, Xiong Luo
Pretraining reinforcement learning methods with demonstrations has been an important concept in the study of reinforcement learning since a large amount of computing power is spent on online simulations with existing reinforcement learning algorithms.
no code implementations • 31 Jan 2018 • Xiaoqin Zhang, Huimin Ma
We apply our method to two typical actor-critic reinforcement learning algorithms, DDPG and ACER, and demonstrate experimentally that it not only outperforms the RL algorithms without a pretraining process, but is also more simulation-efficient.
no code implementations • 30 Apr 2016 • Shaobo Lin, Jinshan Zeng, Xiaoqin Zhang
In this paper, we aim at developing scalable neural network-type learning systems.
no code implementations • ICCV 2015 • Lin Ma, Xiaoqin Zhang, Weiming Hu, Junliang Xing, Jiwen Lu, Jie zhou
To address this, this paper presents a local subspace collaborative tracking method for robust visual tracking, where multiple linear and nonlinear subspaces are learned to better model the nonlinear relationship of object appearances.
no code implementations • 26 Sep 2014 • Wenhan Luo, Junliang Xing, Anton Milan, Xiaoqin Zhang, Wei Liu, Tae-Kyun Kim
We inspect the recent advances in various aspects and propose some interesting directions for future research.
no code implementations • NeurIPS 2013 • Xiaoqin Zhang, Di Wang, Zhengyuan Zhou, Yi Ma
In this context, the state-of-the-art algorithms "RASL" and "TILT" can be viewed as two special cases of our work, and yet each only performs part of the function of our method.