1 code implementation • 18 Apr 2024 • Jin Gao, Shubo Lin, Shaoru Wang, Yutong Kou, Zeming Li, Liang Li, Congxuan Zhang, Xiaoqin Zhang, Yizheng Wang, Weiming Hu
In this paper, we ask whether the fine-tuning performance of extremely simple, small-scale ViTs can also benefit from this pre-training paradigm, a question that remains considerably less studied than the well-established methodology of designing lightweight architectures with sophisticated components.
1 code implementation • 12 Mar 2024 • Han Qiu, Jiaxing Huang, Peng Gao, Lewei Lu, Xiaoqin Zhang, Shijian Lu
Inspired by the success of general-purpose models in NLP, recent studies attempt to unify different vision tasks in the same sequence format and employ autoregressive Transformers for sequence prediction.
no code implementations • 29 Feb 2024 • Xueying Jiang, Sheng Jin, Lewei Lu, Xiaoqin Zhang, Shijian Lu
We propose SKD-WM3D, a weakly supervised monocular 3D detection framework that exploits depth information to achieve M3D with a single-view image exclusively without any 3D annotations or other training data.
no code implementations • 6 Feb 2024 • Aoran Xiao, Weihao Xuan, Heli Qi, Yun Xing, Ruijie Ren, Xiaoqin Zhang, Ling Shao, Shijian Lu
CAT-SAM freezes the entire SAM and simultaneously adapts its mask decoder and image encoder with a small number of learnable parameters.
1 code implementation • 14 Dec 2023 • Tangfei Liao, Xiaoqin Zhang, Li Zhao, Tao Wang, Guobao Xiao
Then, we model these visual cues and correspondences by a joint visual-spatial fusion module, simultaneously embedding visual cues into correspondences for pruning.
1 code implementation • 23 Nov 2023 • Jingjing Zheng, Wanglong Lu, Wenzhe Wang, Yankai Cao, Xiaoqin Zhang, Xianta Jiang
We develop a new optimization algorithm named the Alternating Proximal Multiplier Method (APMM) to iteratively solve the proposed tensor completion model.
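The snippet names the APMM solver but gives no details; proximal methods of this family are typically built from closed-form proximal operators. A minimal sketch of two standard such building blocks, entrywise soft-thresholding and singular value thresholding (the low-rank update common to proximal/ADMM-style completion solvers), is shown below; this is an illustration of the general technique, not the paper's actual algorithm:

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||.||_1: shrink entries toward zero."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def svt(M, tau):
    """Singular value thresholding: proximal operator of the nuclear norm.
    Soft-thresholds the singular values, yielding a low-rank estimate."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(soft_threshold(s, tau)) @ Vt
```

An alternating proximal scheme then cycles such updates over the factor variables until convergence.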
no code implementations • 25 Sep 2023 • Ping Li, Yu Zhang, Li Yuan, Jian Zhao, Xianghua Xu, Xiaoqin Zhang
Particularly, the gradients from the segmentation model are exploited to discover the easily confused regions, where pixel-wise objects are difficult to distinguish from the background in a frame.
no code implementations • ICCV 2023 • Jiahui Zhang, Fangneng Zhan, Yingchen Yu, Kunhao Liu, Rongliang Wu, Xiaoqin Zhang, Ling Shao, Shijian Lu
However, as the pose estimator is trained only on rendered images, its estimates for real images are usually biased or inaccurate due to the domain gap between real and rendered images. This leads to poor robustness in pose estimation for real images and, further, to local minima in the joint optimization.
no code implementations • ICCV 2023 • Muyu Xu, Fangneng Zhan, Jiahui Zhang, Yingchen Yu, Xiaoqin Zhang, Christian Theobalt, Ling Shao, Shijian Lu
Neural Radiance Field (NeRF) has shown impressive performance in novel view synthesis via implicit scene representation.
1 code implementation • 31 May 2023 • Aoran Xiao, Xiaoqin Zhang, Ling Shao, Shijian Lu
We address three critical questions in this emerging research field: i) the importance and urgency of label-efficient learning in point cloud processing, ii) the subfields it encompasses, and iii) the progress achieved in this area.
no code implementations • 19 May 2023 • Jingjing Zheng, Wenzhe Wang, Xiaoqin Zhang, Xianta Jiang
This study aims to address two issues in standard tensor recovery: the over-reliance on rank estimation strategies in tensor factorization-based recovery, and the large computational cost of t-SVD-based recovery.
no code implementations • 18 Apr 2023 • Rongliang Wu, Yingchen Yu, Fangneng Zhan, Jiahui Zhang, Xiaoqin Zhang, Shijian Lu
To accommodate fair variation of plausible facial animations for the same audio, we design a transformer-based probabilistic mapping network that can model the variational facial animation distribution conditioned upon the input audio and autoregressively convert the audio signals into a facial animation sequence.
no code implementations • 22 Dec 2022 • Tao Wang, Guangpin Tao, Wanglong Lu, Kaihao Zhang, Wenhan Luo, Xiaoqin Zhang, Tong Lu
HCD consists of a hierarchical dehazing network (HDN) and a novel hierarchical contrastive loss (HCL).
no code implementations • CVPR 2023 • Gongjie Zhang, Zhipeng Luo, Zichen Tian, Jingyi Zhang, Xiaoqin Zhang, Shijian Lu
Multi-scale features have been proven highly effective for object detection but often come with huge and even prohibitive extra computation costs, especially for the recent Transformer-based detectors.
no code implementations • 4 Aug 2022 • Jiahui Zhang, Fangneng Zhan, Yingchen Yu, Rongliang Wu, Xiaoqin Zhang, Shijian Lu
In addition, stochastic noises fed to the generator are employed for unconditional detail generation, which tends to produce unfaithful details that compromise the fidelity of the generated SR image.
no code implementations • 6 Jul 2022 • Jiahui Zhang, Fangneng Zhan, Rongliang Wu, Yingchen Yu, Wenqing Zhang, Bai Song, Xiaoqin Zhang, Shijian Lu
With the feature transport plan as the guidance, a novel pose calibration technique is designed which rectifies the initially randomized camera poses by predicting relative pose transformations between the pair of rendered and real images.
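The snippet refers to a feature transport plan between rendered and real images without specifying how it is computed. One standard way to obtain a transport plan between two feature sets is entropy-regularized optimal transport solved by Sinkhorn iterations; the sketch below is an assumption-laden illustration of that general technique, not the paper's exact formulation:

```python
import numpy as np

def sinkhorn_plan(cost, reg=0.1, n_iters=500):
    """Entropy-regularized optimal transport plan between two uniform
    distributions, computed with Sinkhorn iterations.
    cost: (n, m) pairwise feature distances, e.g. rendered vs. real."""
    n, m = cost.shape
    K = np.exp(-cost / reg)                 # Gibbs kernel
    a, b = np.ones(n) / n, np.ones(m) / m   # uniform marginals
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iters):                # alternate marginal scalings
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]      # transport plan P, P >= 0
```

The resulting plan's entries indicate soft correspondences between the two feature sets, which is the kind of guidance a pose calibration step could consume.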
no code implementations • CVPR 2023 • Jingyi Zhang, Jiaxing Huang, Xiaoqin Zhang, Shijian Lu
Domain adaptive panoptic segmentation aims to mitigate data annotation challenge by leveraging off-the-shelf annotated data in one or multiple related source domains.
Ranked #2 on Domain Adaptation on Panoptic SYNTHIA-to-Cityscapes
2 code implementations • 28 May 2022 • Shaoru Wang, Jin Gao, Zeming Li, Xiaoqin Zhang, Weiming Hu
We also point out some defects of such pre-training, e.g., failing to benefit from large-scale pre-training data and showing inferior performance on data-insufficient downstream tasks.
no code implementations • 11 May 2022 • Xiaoqin Zhang, Ziwei Huang, Jingjing Zheng, Shuo Wang, Xianta Jiang
The task of grasp pattern recognition aims to derive the applicable grasp types of an object according to the visual information.
1 code implementation • 29 Mar 2022 • Zhishe Wang, Wenyu Shao, Yanlin Chen, Jiawei Xu, Xiaoqin Zhang
Existing generative adversarial fusion methods generally concatenate the source images and extract local features through convolution without considering their global characteristics, which tends to produce an unbalanced result biased towards either the infrared or the visible image.
1 code implementation • 28 Feb 2022 • Aoran Xiao, Jiaxing Huang, Dayan Guan, Xiaoqin Zhang, Shijian Lu, Ling Shao
The convergence of point cloud and DNNs has led to many deep point cloud models, largely trained under the supervision of large-scale and densely-labelled point cloud data.
no code implementations • 4 Sep 2021 • Caixia Yan, Xiaojun Chang, Minnan Luo, Huan Liu, Xiaoqin Zhang, Qinghua Zheng
To address these issues, we develop a novel Semantics-Guided Contrastive Network for ZSD, named ContrastZSD, a detection framework that first brings the contrastive learning mechanism into the realm of zero-shot detection.
Ranked #4 on Zero-Shot Object Detection on MS-COCO
no code implementations • CVPR 2023 • Jingyi Zhang, Jiaxing Huang, Zhipeng Luo, Gongjie Zhang, Xiaoqin Zhang, Shijian Lu
DA-DETR introduces a novel CNN-Transformer Blender (CTBlender) that fuses the CNN features and Transformer features ingeniously for effective feature alignment and knowledge transfer across domains.
no code implementations • 9 Feb 2021 • Linwei Ye, Mrigank Rochan, Zhi Liu, Xiaoqin Zhang, Yang Wang
In this paper, we propose a cross-modal self-attention (CMSA) module to utilize fine details of individual words and the input image or video, which effectively captures the long-range dependencies between linguistic and visual features.
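The CMSA module's internals are not given in the snippet; at its core, cross-modal attention lets features of one modality attend to the other. A minimal sketch of that core idea (not the paper's full module, which also operates over spatial feature maps) under assumed shapes:

```python
import numpy as np

def cross_modal_attention(vis, lang):
    """Minimal cross-modal attention: visual features (N, d) attend to
    word features (L, d); returns an attended language context per
    visual location, capturing long-range visual-linguistic dependencies."""
    d = vis.shape[1]
    scores = vis @ lang.T / np.sqrt(d)           # (N, L) affinities
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # softmax over words
    return attn @ lang                           # (N, d) attended features
```

In a full module the attended features would be fused back into the visual stream before segmentation.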
Ranked #5 on Referring Expression Segmentation on J-HMDB (Precision@0.9 metric)
no code implementations • 24 Sep 2020 • Caixia Yan, Xiaojun Chang, Minnan Luo, Qinghua Zheng, Xiaoqin Zhang, Zhihui Li, Feiping Nie
In this regard, a novel self-weighted robust LDA with l21-norm based pairwise between-class distance criterion, called SWRLDA, is proposed for multi-class classification especially with edge classes.
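The l21-norm (usually written l2,1) that underlies SWRLDA's robustness is simply the sum of the l2 norms of a matrix's rows; unlike the squared Frobenius criterion of classical LDA, it does not square per-sample distances, so outlying samples contribute less. A one-function sketch:

```python
import numpy as np

def l21_norm(M):
    """l2,1 norm: sum of the l2 norms of the rows of M. Rows typically
    correspond to samples (or class-pair differences), so no single
    outlying row dominates the criterion the way it would if squared."""
    return np.linalg.norm(M, axis=1).sum()
```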
no code implementations • 9 May 2019 • Xiaoqin Zhang, Yunfei Li, Huimin Ma, Xiong Luo
Pretraining reinforcement learning methods with demonstrations has been an important concept in the study of reinforcement learning since a large amount of computing power is spent on online simulations with existing reinforcement learning algorithms.
no code implementations • 31 Jan 2018 • Xiaoqin Zhang, Huimin Ma
We apply our method to two typical actor-critic reinforcement learning algorithms, DDPG and ACER, and demonstrate experimentally that it not only outperforms the RL algorithms without a pretraining process, but is also more simulation-efficient.
no code implementations • 30 Apr 2016 • Shaobo Lin, Jinshan Zeng, Xiaoqin Zhang
In this paper, we aim at developing scalable neural network-type learning systems.
no code implementations • ICCV 2015 • Lin Ma, Xiaoqin Zhang, Weiming Hu, Junliang Xing, Jiwen Lu, Jie zhou
To address this, this paper presents a local subspace collaborative tracking method for robust visual tracking, where multiple linear and nonlinear subspaces are learned to better model the nonlinear relationship of object appearances.
no code implementations • 26 Sep 2014 • Wenhan Luo, Junliang Xing, Anton Milan, Xiaoqin Zhang, Wei Liu, Tae-Kyun Kim
We inspect the recent advances in various aspects and propose some interesting directions for future research.
no code implementations • NeurIPS 2013 • Xiaoqin Zhang, Di Wang, Zhengyuan Zhou, Yi Ma
In this context, the state-of-the-art algorithms "RASL" and "TILT" can be viewed as two special cases of our work, and yet each only performs part of the function of our method.