Search Results for author: Dingkang Liang

Found 23 papers, 15 papers with code

Anomaly Detection by Adapting a pre-trained Vision Language Model

no code implementations • 14 Mar 2024 • Yuxuan Cai, Xinwei He, Dingkang Liang, Ao Tong, Xiang Bai

Recently, large vision and language models have proven successful when adapted to many downstream tasks.


Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis

1 code implementation • 3 Mar 2024 • Xin Zhou, Dingkang Liang, Wei Xu, Xingkui Zhu, Yihan Xu, Zhikang Zou, Xiang Bai

To achieve this goal, we freeze the parameters of the default pre-trained models and then propose the Dynamic Adapter, which generates a dynamic scale for each token, considering the token significance to the downstream task.

Transfer Learning



AVS-Net: Point Sampling with Adaptive Voxel Size for 3D Scene Understanding

no code implementations • 27 Feb 2024 • Hongcheng Yang, Dingkang Liang, Dingyuan Zhang, Zhe Liu, Zhikang Zou, Xingyu Jiang, Yingying Zhu

For this purpose, this paper presents an advanced sampler that achieves both high accuracy and efficiency.

3D Object Detection 3D Semantic Segmentation +2


PointMamba: A Simple State Space Model for Point Cloud Analysis

1 code implementation • 16 Feb 2024 • Dingkang Liang, Xin Zhou, Xinyu Wang, Xingkui Zhu, Wei Xu, Zhikang Zou, Xiaoqing Ye, Xiang Bai

Recently, state space models (SSM), a new family of deep sequence models, have presented great potential for sequence modeling in NLP tasks.



You Only Look Bottom-Up for Monocular 3D Object Detection

no code implementations • 27 Jan 2024 • Kaixin Xiong, Dingyuan Zhang, Dingkang Liang, Zhe Liu, Hongcheng Yang, Wondimu Dikubab, Jianwei Cheng, Xiang Bai

Monocular 3D Object Detection is an essential task for autonomous driving.

Autonomous Driving Monocular 3D Object Detection +2


A Discrepancy Aware Framework for Robust Anomaly Detection

1 code implementation • 11 Oct 2023 • Yuxuan Cai, Dingkang Liang, Dongliang Luo, Xinwei He, Xin Yang, Xiang Bai

To alleviate this issue, we present a Discrepancy Aware Framework (DAF), which demonstrates robust performance consistently with simple and cheap strategies across different anomaly detection benchmarks.

Anomaly Detection Defect Detection +2


Diffusion-based 3D Object Detection with Random Boxes

no code implementations • 5 Sep 2023 • Xin Zhou, Jinghua Hou, Tingting Yao, Dingkang Liang, Zhe Liu, Zhikang Zou, Xiaoqing Ye, Jianwei Cheng, Xiang Bai

3D object detection is an essential task for achieving autonomous driving.

3D Object Detection Autonomous Driving +2


SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model

1 code implementation • 4 Jun 2023 • Dingyuan Zhang, Dingkang Liang, Hongcheng Yang, Zhikang Zou, Xiaoqing Ye, Zhe Liu, Xiang Bai

In the spirit of unleashing the capability of foundation models on vision tasks, the Segment Anything Model (SAM), a vision foundation model for image segmentation, has been proposed recently and presents strong zero-shot ability on many downstream 2D tasks.

3D Object Detection Image Segmentation +3



Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution

1 code implementation • 12 May 2023 • Jianfeng Kuang, Wei Hua, Dingkang Liang, Mingkun Yang, Deqiang Jiang, Bo Ren, Xiang Bai

We evaluate the existing end-to-end methods for VIE on the proposed dataset and observe that the performance of these methods has a distinguishable drop from SROIE (a widely used English dataset) to our proposed dataset due to the larger variance of layout and entities.

Contrastive Learning Optical Character Recognition (OCR)


SOOD: Towards Semi-Supervised Oriented Object Detection

1 code implementation • CVPR 2023 • Wei Hua, Dingkang Liang, Jingyu Li, Xiaolong Liu, Zhikang Zou, Xiaoqing Ye, Xiang Bai

Semi-Supervised Object Detection (SSOD), aiming to explore unlabeled data for boosting object detectors, has become an active task in recent years.

Object object-detection +4


CrowdCLIP: Unsupervised Crowd Counting via Vision-Language Model

2 code implementations • CVPR 2023 • Dingkang Liang, Jiahao Xie, Zhikang Zou, Xiaoqing Ye, Wei Xu, Xiang Bai

To the best of our knowledge, CrowdCLIP is the first to investigate vision-language knowledge for solving the counting problem.

Ranked #1 on Cross-Part Crowd Counting on ShanghaiTech B

Cross-Part Crowd Counting Crowd Counting +1


Super-Resolution Information Enhancement For Crowd Counting

1 code implementation • 13 Mar 2023 • Jiahao Xie, Wei Xu, Dingkang Liang, Zhanyu Ma, Kongming Liang, Weidong Liu, Rui Wang, Ling Jin

As the proposed method requires SR labels, we further propose a Super-Resolution Crowd Counting dataset (SR-Crowd).

Crowd Counting Super-Resolution


DDS3D: Dense Pseudo-Labels with Dynamic Threshold for Semi-Supervised 3D Object Detection

1 code implementation • 9 Mar 2023 • Jingyu Li, Zhe Liu, Jinghua Hou, Dingkang Liang

In this paper, we present a simple yet effective semi-supervised 3D object detector named DDS3D.

3D Object Detection object-detection +1


A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection

no code implementations • ICCV 2023 • Dingyuan Zhang, Dingkang Liang, Zhikang Zou, Jingyu Li, Xiaoqing Ye, Zhe Liu, Xiao Tan, Xiang Bai

Advanced 3D object detection methods usually rely on large-scale, elaborately labeled datasets to achieve good performance.

3D Object Detection Object +1


When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition

2 code implementations • 23 Jul 2022 • Bohan Li, Ye Yuan, Dingkang Liang, Xiao Liu, Zhilong Ji, Jinfeng Bai, Wenyu Liu, Xiang Bai

Recently, most handwritten mathematical expression recognition (HMER) methods adopt the encoder-decoder networks, which directly predict the markup sequences from formula images with the attention mechanism.

Optical Character Recognition (OCR)



Comprehensive Benchmark Datasets for Amharic Scene Text Detection and Recognition

no code implementations • 23 Mar 2022 • Wondimu Dikubab, Dingkang Liang, Minghui Liao, Xiang Bai

Ethiopic/Amharic script is one of the oldest African writing systems, which serves at least 23 languages (e.g., Amharic, Tigrinya) in East Africa for more than 120 million people.

Benchmarking Scene Text Detection +1


An End-to-End Transformer Model for Crowd Localization

1 code implementation • 26 Feb 2022 • Dingkang Liang, Wei Xu, Xiang Bai

Crowd localization, predicting head positions, is a more practical and high-level task than simply counting.


LATFormer: Locality-Aware Point-View Fusion Transformer for 3D Shape Recognition

no code implementations • 3 Sep 2021 • Xinwei He, Silin Cheng, Dingkang Liang, Song Bai, Xi Wang, Yingying Zhu

To investigate this, we propose a novel Locality-Aware Point-View Fusion Transformer (LATFormer) for 3D shape retrieval and classification.

3D Object Classification 3D Object Retrieval +3


VisDrone-CC2020: The Vision Meets Drone Crowd Counting Challenge Results

1 code implementation • 19 Jul 2021 • Dawei Du, Longyin Wen, Pengfei Zhu, Heng Fan, Qinghua Hu, Haibin Ling, Mubarak Shah, Junwen Pan, Ali Al-Ali, Amr Mohamed, Bakour Imene, Bin Dong, Binyu Zhang, Bouchali Hadia Nesma, Chenfeng Xu, Chenzhen Duan, Ciro Castiello, Corrado Mencar, Dingkang Liang, Florian Krüger, Gennaro Vessio, Giovanna Castellano, Jieru Wang, Junyu Gao, Khalid Abualsaud, Laihui Ding, Lei Zhao, Marco Cianciotta, Muhammad Saqib, Noor Almaadeed, Omar Elharrouss, Pei Lyu, Qi Wang, Shidong Liu, Shuang Qiu, Siyang Pan, Somaya Al-Maadeed, Sultan Daud Khan, Tamer Khattab, Tao Han, Thomas Golda, Wei Xu, Xiang Bai, Xiaoqing Xu, Xuelong Li, Yanyun Zhao, Ye Tian, Yingnan Lin, Yongchao Xu, Yuehan Yao, Zhenyu Xu, Zhijian Zhao, Zhipeng Luo, Zhiwei Wei, Zhiyuan Zhao

Crowd counting on the drone platform is an interesting topic in computer vision, which brings new challenges such as small object inference, background clutter and wide viewpoint.

Crowd Counting


TransCrowd: weakly-supervised crowd counting with transformers

1 code implementation • 19 Apr 2021 • Dingkang Liang, Xiwu Chen, Wei Xu, Yu Zhou, Xiang Bai

Current weakly-supervised counting methods adopt the CNN to regress a total count of the crowd by an image-to-count paradigm.

Crowd Counting


Focal Inverse Distance Transform Maps for Crowd Localization

3 code implementations • 16 Feb 2021 • Dingkang Liang, Wei Xu, Yingying Zhu, Yu Zhou

Most regression-based methods utilize convolutional neural networks (CNNs) to regress a density map, which cannot accurately locate instances in extremely dense scenes for two crucial reasons: 1) the density map consists of a series of blurry Gaussian blobs, and 2) severe overlaps exist in the dense regions of the density map.
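
The density-map limitation described in this abstract can be illustrated with a small, self-contained sketch. This is not the authors' code: the function names `density_map` and `count_peaks`, the grid size, the value of `sigma`, and the peak threshold are all assumptions chosen for illustration. The sketch builds a density map by summing one Gaussian blob per annotated head, then counts local maxima as a naive way of locating heads.

```python
import math


def density_map(points, height, width, sigma=1.5):
    """Sum one Gaussian blob per annotated head; each blob sums to 1 over the grid."""
    grid = [[0.0] * width for _ in range(height)]
    for py, px in points:
        blob = [
            [math.exp(-((y - py) ** 2 + (x - px) ** 2) / (2 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        total = sum(map(sum, blob))  # normalise so the map sums to the head count
        for y in range(height):
            for x in range(width):
                grid[y][x] += blob[y][x] / total
    return grid


def count_peaks(grid, threshold=0.01):
    """Count strict local maxima (8-neighbourhood) above a threshold (illustrative value)."""
    height, width = len(grid), len(grid[0])
    peaks = 0
    for y in range(height):
        for x in range(width):
            value = grid[y][x]
            if value < threshold:
                continue
            neighbours = [
                grid[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            ]
            if all(value > n for n in neighbours):
                peaks += 1
    return peaks


far = density_map([(4, 4), (12, 12)], 16, 16)   # two well-separated heads
near = density_map([(8, 7), (8, 9)], 16, 16)    # two heads only 2 pixels apart

for name, grid in (("far", far), ("near", near)):
    print(name, "count =", round(sum(map(sum, grid)), 3),
          "peaks =", count_peaks(grid))
```

In both cases the map still sums to a count of two, but with these settings the two nearby blobs merge into a single peak, so counting survives the overlap while localization does not.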

Crowd Counting SSIM



Dilated-Scale-Aware Attention ConvNet For Multi-Class Object Counting

no code implementations • 15 Dec 2020 • Wei Xu, Dingkang Liang, Yixiao Zheng, Zhanyu Ma

In this paper, we propose a simple yet efficient counting network based on point-level annotations.

Object Object Counting


AutoScale: Learning to Scale for Crowd Counting and Localization

2 code implementations • 20 Dec 2019 • Chenfeng Xu, Dingkang Liang, Yongchao Xu, Song Bai, Wei Zhan, Xiang Bai, Masayoshi Tomizuka

A major issue is that the density map on dense regions usually accumulates density values from a number of nearby Gaussian blobs, yielding different large density values on a small set of pixels.

Crowd Counting Model Optimization

