1 code implementation • 10 Oct 2024 • Dingkang Liang, Tianrui Feng, Xin Zhou, Yumeng Zhang, Zhikang Zou, Xiang Bai
PointGST freezes the pre-trained model and introduces a lightweight, trainable Point Cloud Spectral Adapter (PCSA) to fine-tune parameters in the spectral domain.
3D Parameter-Efficient Fine-Tuning for Classification • 3D Point Cloud Classification
1 code implementation • 1 Sep 2024 • Dingyuan Zhang, Dingkang Liang, Zichang Tan, Xiaoqing Ye, Cheng Zhang, Jingdong Wang, Xiang Bai
Slow inference speed is one of the most crucial concerns for deploying multi-view 3D detectors to tasks with high real-time requirements like autonomous driving.
1 code implementation • 4 Aug 2024 • Mingxin Huang, Yuliang Liu, Dingkang Liang, Lianwen Jin, Xiang Bai
To address this issue, we introduce a Complementary Image Pyramid (CIP), a simple, effective, and plug-and-play solution designed to mitigate semantic discontinuity during high-resolution image processing.
1 code implementation • 3 Jul 2024 • Wei Xu, Chunsheng Shi, Sifan Tu, Xin Zhou, Dingkang Liang, Xiang Bai
We propose UniSeg3D, a unified 3D scene understanding framework that achieves panoptic, semantic, instance, interactive, referring, and open-vocabulary segmentation tasks within a single model.
1 code implementation • 1 Jul 2024 • Dingkang Liang, Wei Hua, Chunsheng Shi, Zhikang Zou, Xiaoqing Ye, Xiang Bai
Specifically, we observe that objects in aerial images usually have arbitrary orientations, small scales, and dense aggregation, which inspires the following core designs: a Simple Instance-aware Dense Sampling (SIDS) strategy generates comprehensive dense pseudo-labels; the Geometry-aware Adaptive Weighting (GAW) loss dynamically modulates the importance of each pair of pseudo-label and corresponding prediction by leveraging the intricate geometric information of aerial objects; and the proposed Noise-driven Global Consistency (NGC) treats aerial images as global layouts and explicitly builds the many-to-many relationship between the sets of pseudo-labels and predictions.
1 code implementation • 7 Jun 2024 • Xingkui Zhu, Yiran Guan, Dingkang Liang, Yuchao Chen, Yuliang Liu, Xiang Bai
The sparsely activated mixture of experts (MoE) model presents a promising alternative to traditional densely activated (dense) models, enhancing both quality and computational efficiency.
no code implementations • 14 Mar 2024 • Yuxuan Cai, Xinwei He, Dingkang Liang, Ao Tong, Xiang Bai
Recently, large vision and language models have been successfully adapted to many downstream tasks.
1 code implementation • CVPR 2024 • Xin Zhou, Dingkang Liang, Wei Xu, Xingkui Zhu, Yihan Xu, Zhikang Zou, Xiang Bai
To achieve this goal, we freeze the parameters of the default pre-trained models and then propose the Dynamic Adapter, which generates a dynamic scale for each token, considering the token significance to the downstream task.
3D Parameter-Efficient Fine-Tuning for Classification • Transfer Learning
1 code implementation • 27 Feb 2024 • Hongcheng Yang, Dingkang Liang, Dingyuan Zhang, Zhe Liu, Zhikang Zou, Xingyu Jiang, Yingying Zhu
For this purpose, this paper presents an advanced sampler that achieves both high accuracy and efficiency.
Ranked #2 on 3D Part Segmentation on ShapeNet-Part
1 code implementation • 16 Feb 2024 • Dingkang Liang, Xin Zhou, Wei Xu, Xingkui Zhu, Zhikang Zou, Xiaoqing Ye, Xiao Tan, Xiang Bai
Unlike traditional Transformers, PointMamba employs a linear complexity algorithm, presenting global modeling capacity while significantly reducing computational costs.
no code implementations • 27 Jan 2024 • Kaixin Xiong, Dingyuan Zhang, Dingkang Liang, Zhe Liu, Hongcheng Yang, Wondimu Dikubab, Jianwei Cheng, Xiang Bai
Monocular 3D Object Detection is an essential task for autonomous driving.
1 code implementation • 11 Oct 2023 • Yuxuan Cai, Dingkang Liang, Dongliang Luo, Xinwei He, Xin Yang, Xiang Bai
To alleviate this issue, we present a Discrepancy Aware Framework (DAF), which demonstrates robust performance consistently with simple and cheap strategies across different anomaly detection benchmarks.
no code implementations • 5 Sep 2023 • Xin Zhou, Jinghua Hou, Tingting Yao, Dingkang Liang, Zhe Liu, Zhikang Zou, Xiaoqing Ye, Jianwei Cheng, Xiang Bai
3D object detection is an essential task for achieving autonomous driving.
1 code implementation • 4 Jun 2023 • Dingyuan Zhang, Dingkang Liang, Hongcheng Yang, Zhikang Zou, Xiaoqing Ye, Zhe Liu, Xiang Bai
In the spirit of unleashing the capability of foundation models on vision tasks, the Segment Anything Model (SAM), a vision foundation model for image segmentation, has been proposed recently and presents strong zero-shot ability on many downstream 2D tasks.
1 code implementation • 12 May 2023 • Jianfeng Kuang, Wei Hua, Dingkang Liang, Mingkun Yang, Deqiang Jiang, Bo Ren, Xiang Bai
We evaluate existing end-to-end methods for VIE on the proposed dataset and observe that their performance drops noticeably from SROIE (a widely used English dataset) to our proposed dataset due to the larger variance of layouts and entities.
1 code implementation • CVPR 2023 • Wei Hua, Dingkang Liang, Jingyu Li, Xiaolong Liu, Zhikang Zou, Xiaoqing Ye, Xiang Bai
Semi-Supervised Object Detection (SSOD), aiming to explore unlabeled data for boosting object detectors, has become an active task in recent years.
2 code implementations • CVPR 2023 • Dingkang Liang, Jiahao Xie, Zhikang Zou, Xiaoqing Ye, Wei Xu, Xiang Bai
To the best of our knowledge, CrowdCLIP is the first to investigate vision-language knowledge to solve the counting problem.
Ranked #1 on Cross-Part Crowd Counting on ShanghaiTech B
1 code implementation • 13 Mar 2023 • Jiahao Xie, Wei Xu, Dingkang Liang, Zhanyu Ma, Kongming Liang, Weidong Liu, Rui Wang, Ling Jin
As the proposed method requires SR labels, we further propose a Super-Resolution Crowd Counting dataset (SR-Crowd).
1 code implementation • 9 Mar 2023 • Jingyu Li, Zhe Liu, Jinghua Hou, Dingkang Liang
In this paper, we present a simple yet effective semi-supervised 3D object detector named DDS3D.
no code implementations • ICCV 2023 • Dingyuan Zhang, Dingkang Liang, Zhikang Zou, Jingyu Li, Xiaoqing Ye, Zhe Liu, Xiao Tan, Xiang Bai
Advanced 3D object detection methods usually rely on large-scale, elaborately labeled datasets to achieve good performance.
3 code implementations • 23 Jul 2022 • Bohan Li, Ye Yuan, Dingkang Liang, Xiao Liu, Zhilong Ji, Jinfeng Bai, Wenyu Liu, Xiang Bai
Recently, most handwritten mathematical expression recognition (HMER) methods adopt encoder-decoder networks, which directly predict markup sequences from formula images with an attention mechanism.
no code implementations • 23 Mar 2022 • Wondimu Dikubab, Dingkang Liang, Minghui Liao, Xiang Bai
Ethiopic/Amharic script is one of the oldest African writing systems, serving at least 23 languages (e.g., Amharic, Tigrinya) and more than 120 million people in East Africa.
1 code implementation • 26 Feb 2022 • Dingkang Liang, Wei Xu, Xiang Bai
Crowd localization, predicting head positions, is a more practical and high-level task than simply counting.
no code implementations • 3 Sep 2021 • Xinwei He, Silin Cheng, Dingkang Liang, Song Bai, Xi Wang, Yingying Zhu
To investigate this, we propose a novel Locality-Aware Point-View Fusion Transformer (LATFormer) for 3D shape retrieval and classification.
1 code implementation • 19 Jul 2021 • Dawei Du, Longyin Wen, Pengfei Zhu, Heng Fan, Qinghua Hu, Haibin Ling, Mubarak Shah, Junwen Pan, Ali Al-Ali, Amr Mohamed, Bakour Imene, Bin Dong, Binyu Zhang, Bouchali Hadia Nesma, Chenfeng Xu, Chenzhen Duan, Ciro Castiello, Corrado Mencar, Dingkang Liang, Florian Krüger, Gennaro Vessio, Giovanna Castellano, Jieru Wang, Junyu Gao, Khalid Abualsaud, Laihui Ding, Lei Zhao, Marco Cianciotta, Muhammad Saqib, Noor Almaadeed, Omar Elharrouss, Pei Lyu, Qi Wang, Shidong Liu, Shuang Qiu, Siyang Pan, Somaya Al-Maadeed, Sultan Daud Khan, Tamer Khattab, Tao Han, Thomas Golda, Wei Xu, Xiang Bai, Xiaoqing Xu, Xuelong Li, Yanyun Zhao, Ye Tian, Yingnan Lin, Yongchao Xu, Yuehan Yao, Zhenyu Xu, Zhijian Zhao, Zhipeng Luo, Zhiwei Wei, Zhiyuan Zhao
Crowd counting on the drone platform is an interesting topic in computer vision, which brings new challenges such as small object inference, background clutter and wide viewpoint.
1 code implementation • 19 Apr 2021 • Dingkang Liang, Xiwu Chen, Wei Xu, Yu Zhou, Xiang Bai
Current weakly-supervised counting methods use a CNN to regress the total crowd count in an image-to-count paradigm.
3 code implementations • 16 Feb 2021 • Dingkang Liang, Wei Xu, Yingying Zhu, Yu Zhou
Most regression-based methods use convolutional neural networks (CNNs) to regress a density map, which cannot accurately locate instances in extremely dense scenes for two crucial reasons: 1) the density map consists of a series of blurry Gaussian blobs, and 2) severe overlaps exist in the dense regions of the density map.
no code implementations • 15 Dec 2020 • Wei Xu, Dingkang Liang, Yixiao Zheng, Zhanyu Ma
In this paper, we propose a simple yet efficient counting network based on point-level annotations.
2 code implementations • 20 Dec 2019 • Chenfeng Xu, Dingkang Liang, Yongchao Xu, Song Bai, Wei Zhan, Xiang Bai, Masayoshi Tomizuka
A major issue is that the density map on dense regions usually accumulates density values from a number of nearby Gaussian blobs, yielding different large density values on a small set of pixels.
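The accumulation effect described in this abstract can be illustrated with a minimal sketch. It is not taken from any of the listed papers: the function names, the grid size, the point coordinates, and the fixed `sigma` are all assumptions chosen for illustration. Each annotated point contributes one unit-mass Gaussian blob, so the map still sums to the head count, but tightly packed points stack their blobs onto the same few pixels.

```python
import math


def gaussian_density_map(points, height, width, sigma=2.0):
    """Sum one unit-mass Gaussian per annotated point into a height x width grid.

    Illustrative only: real pipelines often use truncated or adaptive kernels.
    """
    density = [[0.0] * width for _ in range(height)]
    for py, px in points:
        # Evaluate an unnormalized Gaussian centered on the point over the grid.
        blob = [
            [math.exp(-((y - py) ** 2 + (x - px) ** 2) / (2 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        # Normalize so every annotated point adds a total mass of exactly 1.
        total = sum(map(sum, blob))
        for y in range(height):
            for x in range(width):
                density[y][x] += blob[y][x] / total
    return density


def peak(density):
    """Largest single-pixel value in the map."""
    return max(max(row) for row in density)


def total_count(density):
    """Integral of the map, which equals the number of annotated points."""
    return sum(map(sum, density))


# Hypothetical annotations: two isolated heads vs. four tightly packed heads.
sparse = gaussian_density_map([(8, 8), (24, 24)], 32, 32)
dense = gaussian_density_map([(15, 15), (15, 17), (17, 15), (17, 17)], 32, 32)

print(f"sparse: count={total_count(sparse):.2f}, peak={peak(sparse):.4f}")
print(f"dense:  count={total_count(dense):.2f}, peak={peak(dense):.4f}")
```

In this toy setup the dense map's peak is several times that of the sparse map even though each head contributes the same mass, which is the imbalance the abstract points to.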