Search Results for author: Dingkang Liang

Found 23 papers, 15 papers with code

Anomaly Detection by Adapting a pre-trained Vision Language Model

no code implementations • 14 Mar 2024 • Yuxuan Cai, Xinwei He, Dingkang Liang, Ao Tong, Xiang Bai

Recently, large vision and language models have shown their success when adapting them to many downstream tasks.

Anomaly Detection • Language Modelling • +1

Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis

1 code implementation • 3 Mar 2024 • Xin Zhou, Dingkang Liang, Wei Xu, Xingkui Zhu, Yihan Xu, Zhikang Zou, Xiang Bai

To achieve this goal, we freeze the parameters of the default pre-trained models and then propose the Dynamic Adapter, which generates a dynamic scale for each token, considering the token significance to the downstream task.

Transfer Learning

PointMamba: A Simple State Space Model for Point Cloud Analysis

1 code implementation • 16 Feb 2024 • Dingkang Liang, Xin Zhou, Xinyu Wang, Xingkui Zhu, Wei Xu, Zhikang Zou, Xiaoqing Ye, Xiang Bai

Recently, state space models (SSM), a new family of deep sequence models, have presented great potential for sequence modeling in NLP tasks.

A Discrepancy Aware Framework for Robust Anomaly Detection

1 code implementation • 11 Oct 2023 • Yuxuan Cai, Dingkang Liang, Dongliang Luo, Xinwei He, Xin Yang, Xiang Bai

To alleviate this issue, we present a Discrepancy Aware Framework (DAF), which demonstrates robust performance consistently with simple and cheap strategies across different anomaly detection benchmarks.

Anomaly Detection • Defect Detection • +2

SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model

1 code implementation • 4 Jun 2023 • Dingyuan Zhang, Dingkang Liang, Hongcheng Yang, Zhikang Zou, Xiaoqing Ye, Zhe Liu, Xiang Bai

In the spirit of unleashing the capability of foundation models on vision tasks, the Segment Anything Model (SAM), a vision foundation model for image segmentation, has been proposed recently and presents strong zero-shot ability on many downstream 2D tasks.

3D Object Detection • Image Segmentation • +3

Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution

1 code implementation • 12 May 2023 • Jianfeng Kuang, Wei Hua, Dingkang Liang, Mingkun Yang, Deqiang Jiang, Bo Ren, Xiang Bai

We evaluate the existing end-to-end methods for VIE on the proposed dataset and observe that the performance of these methods has a distinguishable drop from SROIE (a widely used English dataset) to our proposed dataset due to the larger variance of layout and entities.

Contrastive Learning • Optical Character Recognition (OCR)

SOOD: Towards Semi-Supervised Oriented Object Detection

1 code implementation • CVPR 2023 • Wei Hua, Dingkang Liang, Jingyu Li, Xiaolong Liu, Zhikang Zou, Xiaoqing Ye, Xiang Bai

Semi-Supervised Object Detection (SSOD), aiming to explore unlabeled data for boosting object detectors, has become an active task in recent years.

Object • object-detection • +4

Super-Resolution Information Enhancement For Crowd Counting

1 code implementation • 13 Mar 2023 • Jiahao Xie, Wei Xu, Dingkang Liang, Zhanyu Ma, Kongming Liang, Weidong Liu, Rui Wang, Ling Jin

As the proposed method requires SR labels, we further propose a Super-Resolution Crowd Counting dataset (SR-Crowd).

Crowd Counting • Super-Resolution

When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition

2 code implementations • 23 Jul 2022 • Bohan Li, Ye Yuan, Dingkang Liang, Xiao Liu, Zhilong Ji, Jinfeng Bai, Wenyu Liu, Xiang Bai

Recently, most handwritten mathematical expression recognition (HMER) methods adopt the encoder-decoder networks, which directly predict the markup sequences from formula images with the attention mechanism.

Optical Character Recognition (OCR)

Comprehensive Benchmark Datasets for Amharic Scene Text Detection and Recognition

no code implementations • 23 Mar 2022 • Wondimu Dikubab, Dingkang Liang, Minghui Liao, Xiang Bai

Ethiopic/Amharic script is one of the oldest African writing systems, which serves at least 23 languages (e.g., Amharic, Tigrinya) in East Africa for more than 120 million people.

Benchmarking • Scene Text Detection • +1

An End-to-End Transformer Model for Crowd Localization

1 code implementation • 26 Feb 2022 • Dingkang Liang, Wei Xu, Xiang Bai

Crowd localization, predicting head positions, is a more practical and high-level task than simply counting.

LATFormer: Locality-Aware Point-View Fusion Transformer for 3D Shape Recognition

no code implementations • 3 Sep 2021 • Xinwei He, Silin Cheng, Dingkang Liang, Song Bai, Xi Wang, Yingying Zhu

To investigate this, we propose a novel Locality-Aware Point-View Fusion Transformer (LATFormer) for 3D shape retrieval and classification.

3D Object Classification • 3D Object Retrieval • +3

TransCrowd: weakly-supervised crowd counting with transformers

1 code implementation • 19 Apr 2021 • Dingkang Liang, Xiwu Chen, Wei Xu, Yu Zhou, Xiang Bai

Current weakly-supervised counting methods adopt the CNN to regress a total count of the crowd by an image-to-count paradigm.

Crowd Counting

Focal Inverse Distance Transform Maps for Crowd Localization

3 code implementations • 16 Feb 2021 • Dingkang Liang, Wei Xu, Yingying Zhu, Yu Zhou

Most regression-based methods utilize convolutional neural networks (CNNs) to regress a density map, which cannot accurately locate instances in extremely dense scenes, for two crucial reasons: 1) the density map consists of a series of blurry Gaussian blobs, and 2) severe overlaps exist in the dense regions of the density map.

Crowd Counting • SSIM

Dilated-Scale-Aware Attention ConvNet For Multi-Class Object Counting

no code implementations • 15 Dec 2020 • Wei Xu, Dingkang Liang, Yixiao Zheng, Zhanyu Ma

In this paper, we propose a simple yet efficient counting network based on point-level annotations.

Object • Object Counting

AutoScale: Learning to Scale for Crowd Counting and Localization

2 code implementations • 20 Dec 2019 • Chenfeng Xu, Dingkang Liang, Yongchao Xu, Song Bai, Wei Zhan, Xiang Bai, Masayoshi Tomizuka

A major issue is that the density map on dense regions usually accumulates density values from a number of nearby Gaussian blobs, yielding different large density values on a small set of pixels.

Crowd Counting • Model Optimization
