1 code implementation • 10 Oct 2024 • Dingkang Liang, Tianrui Feng, Xin Zhou, Yumeng Zhang, Zhikang Zou, Xiang Bai
PointGST freezes the pre-trained model and introduces a lightweight, trainable Point Cloud Spectral Adapter (PCSA) to fine-tune parameters in the spectral domain.
3D Parameter-Efficient Fine-Tuning for Classification, 3D Point Cloud Classification
no code implementations • 30 Sep 2024 • Yubin Wang, Zhikang Zou, Xiaoqing Ye, Xiao Tan, Errui Ding, Cairong Zhao
We present Uni$^2$Det, a new framework for unified and universal multi-dataset training on 3D detection, enabling robust performance across diverse domains and generalization to unseen domains.
1 code implementation • 9 Jul 2024 • Jiankun Li, Hao Li, JiangJiang Liu, Zhikang Zou, Xiaoqing Ye, Fan Wang, Jizhou Huang, Hua Wu, Haifeng Wang
Deep learning-based models are widely deployed in autonomous driving, especially in end-to-end solutions, which are attracting increasing attention.
1 code implementation • 1 Jul 2024 • Dingkang Liang, Wei Hua, Chunsheng Shi, Zhikang Zou, Xiaoqing Ye, Xiang Bai
Specifically, we observe that objects in aerial images usually have arbitrary orientations, are small in scale, and appear in dense aggregations, which inspires the following core designs: a Simple Instance-aware Dense Sampling (SIDS) strategy generates comprehensive dense pseudo-labels; the Geometry-aware Adaptive Weighting (GAW) loss dynamically modulates the importance of each pair of pseudo-label and corresponding prediction by leveraging the intricate geometric information of aerial objects; and the proposed Noise-driven Global Consistency (NGC) treats aerial images as global layouts and explicitly builds the many-to-many relationship between the sets of pseudo-labels and predictions.
1 code implementation • CVPR 2024 • Xin Zhou, Dingkang Liang, Wei Xu, Xingkui Zhu, Yihan Xu, Zhikang Zou, Xiang Bai
To achieve this goal, we freeze the parameters of the default pre-trained models and then propose the Dynamic Adapter, which generates a dynamic scale for each token according to the token's significance to the downstream task.
3D Parameter-Efficient Fine-Tuning for Classification, Transfer Learning
1 code implementation • 27 Feb 2024 • Hongcheng Yang, Dingkang Liang, Dingyuan Zhang, Zhe Liu, Zhikang Zou, Xingyu Jiang, Yingying Zhu
To this end, this paper presents an advanced sampler that achieves both high accuracy and efficiency.
Ranked #3 on 3D Part Segmentation on ShapeNet-Part
1 code implementation • 16 Feb 2024 • Dingkang Liang, Xin Zhou, Wei Xu, Xingkui Zhu, Zhikang Zou, Xiaoqing Ye, Xiao Tan, Xiang Bai
Unlike traditional Transformers, PointMamba employs a linear-complexity algorithm, providing global modeling capacity while significantly reducing computational costs.
no code implementations • 5 Sep 2023 • Xin Zhou, Jinghua Hou, Tingting Yao, Dingkang Liang, Zhe Liu, Zhikang Zou, Xiaoqing Ye, Jianwei Cheng, Xiang Bai
3D object detection is an essential task for achieving autonomous driving.
no code implementations • 6 Jul 2023 • Jincheng Lu, Xipeng Yang, Jin Ye, Yifu Zhang, Zhikang Zou, Wei Zhang, Xiao Tan
Targets in urban traffic scenes often undergo occlusion, illumination changes, and perspective changes, making it difficult to associate targets across different cameras accurately.
no code implementations • 19 Jun 2023 • Xianhui Cheng, Shoumeng Qiu, Zhikang Zou, Jian Pu, Xiangyang Xue
In this paper, we propose a framework named the Adaptive Distance Interval Separation Network (ADISN) that adopts a novel perspective on understanding depth maps, as a form that lies between LiDAR and images.
1 code implementation • 4 Jun 2023 • Dingyuan Zhang, Dingkang Liang, Hongcheng Yang, Zhikang Zou, Xiaoqing Ye, Zhe Liu, Xiang Bai
In the spirit of unleashing the capability of foundation models on vision tasks, the Segment Anything Model (SAM), a vision foundation model for image segmentation, has been proposed recently and presents strong zero-shot ability on many downstream 2D tasks.
1 code implementation • 12 May 2023 • Zhe Liu, Xiaoqing Ye, Zhikang Zou, Xinwei He, Xiao Tan, Errui Ding, Jingdong Wang, Xiang Bai
Extensive experiments on the nuScenes dataset demonstrate that our method is much more stable in dealing with challenging cases such as asynchronous sensors, misaligned sensor placement, and degenerated camera images than existing fusion methods.
Ranked #48 on 3D Object Detection on nuScenes
1 code implementation • CVPR 2023 • Wei Hua, Dingkang Liang, Jingyu Li, Xiaolong Liu, Zhikang Zou, Xiaoqing Ye, Xiang Bai
Semi-Supervised Object Detection (SSOD), aiming to explore unlabeled data for boosting object detectors, has become an active task in recent years.
2 code implementations • CVPR 2023 • Dingkang Liang, Jiahao Xie, Zhikang Zou, Xiaoqing Ye, Wei Xu, Xiang Bai
To the best of our knowledge, CrowdCLIP is the first to investigate vision-language knowledge for solving the counting problem.
Ranked #1 on Cross-Part Crowd Counting on ShanghaiTech B
no code implementations • ICCV 2023 • Dingyuan Zhang, Dingkang Liang, Zhikang Zou, Jingyu Li, Xiaoqing Ye, Zhe Liu, Xiao Tan, Xiang Bai
Advanced 3D object detection methods usually rely on large-scale, elaborately labeled datasets to achieve good performance.
no code implementations • 11 Oct 2022 • Yue He, Minyue Jiang, Xiaoqing Ye, Liang Du, Zhikang Zou, Wei Zhang, Xiao Tan, Errui Ding
In this paper, we aim to find an enhanced feature space where the lane features are distinctive while maintaining a similar distribution of lanes in the wild.
no code implementations • 12 Jul 2022 • Bo Ju, Zhikang Zou, Xiaoqing Ye, Minyue Jiang, Xiao Tan, Errui Ding, Jingdong Wang
In this work, we propose a novel semantic passing framework, named SPNet, to boost the performance of existing LiDAR-based 3D detection models with the guidance of rich context painting, with no extra computation cost during inference.
no code implementations • ICCV 2021 • Zhikang Zou, Xiaoqing Ye, Liang Du, Xianhui Cheng, Xiao Tan, Li Zhang, Jianfeng Feng, Xiangyang Xue, Errui Ding
Low-cost monocular 3D object detection plays a fundamental role in autonomous driving, whereas its accuracy is still far from satisfactory.
1 code implementation • 3 Dec 2021 • Zheyuan Zhou, Liang Du, Xiaoqing Ye, Zhikang Zou, Xiao Tan, Li Zhang, Xiangyang Xue, Jianfeng Feng
Monocular 3D object detection aims to predict the object location, dimension and orientation in 3D space alongside the object category given only a monocular image.
no code implementations • 18 Sep 2021 • Jian Hu, Hongya Tuo, Shizhao Zhang, Chao Wang, Haowen Zhong, Zhikang Zou, Zhongliang Jing, Henry Leung, Ruping Zou
Partial domain adaptation (PDA) aims to solve a more practical cross-domain learning problem that assumes the target label space is a subset of the source label space.
no code implementations • 27 Jul 2021 • Zhikang Zou, Xiaoye Qu, Pan Zhou, Shuangjie Xu, Xiaoqing Ye, Wenhao Wu, Jin Ye
Specifically, at the coarse-grained stage, we design a dual-discriminator strategy that uses adversarial learning to bring the source domain close to the target domain from the perspectives of both the global and local feature spaces.
1 code implementation • 25 May 2021 • Wenhao Wu, Yuxiang Zhao, Yanwu Xu, Xiao Tan, Dongliang He, Zhikang Zou, Jin Ye, YingYing Li, Mingde Yao, ZiChao Dong, Yifeng Shi
Long-range and short-range temporal modeling are two complementary and crucial aspects of video recognition.
Ranked #6 on Action Recognition on ActivityNet
1 code implementation • 22 Apr 2021 • Qiming Wu, Zhikang Zou, Pan Zhou, Xiaoqing Ye, Binghui Wang, Ang Li
Crowd counting has drawn much attention due to its importance in safety-critical surveillance systems.
no code implementations • ICCV 2021 • Zhi Chen, Xiaoqing Ye, Wei Yang, Zhenbo Xu, Xiao Tan, Zhikang Zou, Errui Ding, Xinming Zhang, Liusheng Huang
Second, we introduce an occlusion-aware distillation (OA Distillation) module, which leverages the predicted depths from StereoNet in non-occluded regions to train our monocular depth estimation network named SingleNet.
no code implementations • 7 Mar 2020 • Zhikang Zou, Yifan Liu, Shuangjie Xu, Wei Wei, Shiping Wen, Pan Zhou
Extensive experiments on crowd counting datasets (ShanghaiTech, MALL, WorldEXPO'10, and UCSD) show that our HSRNet can deliver superior results over all state-of-the-art approaches.
no code implementations • 12 Aug 2019 • Zhikang Zou, Huiliang Shao, Xiaoye Qu, Wei Wei, Pan Zhou
Recently, convolutional neural networks (CNNs) have become the de facto leading method for crowd counting.
no code implementations • 7 Aug 2019 • Zhikang Zou, Yu Cheng, Xiaoye Qu, Shouling Ji, Xiaoxiao Guo, Pan Zhou
ACM-CNN consists of three types of modules: a coarse network, a fine network, and a smooth network.
no code implementations • NAACL 2019 • Xiaoye Qu, Zhikang Zou, Yu Cheng, Yang Yang, Pan Zhou
Cross-domain sentiment classification aims to predict sentiment polarity on a target domain utilizing a classifier learned from a source domain.