Search Results for author: Haotong Qin

Found 51 papers, 34 papers with code

Low-bit Model Quantization for Deep Neural Networks: A Survey

no code implementations · 8 May 2025 · Kai Liu, Qian Zheng, Kaiwen Tao, Zhiteng Li, Haotong Qin, Wenbo Li, Yong Guo, Xianglong Liu, Linghe Kong, Guihai Chen, Yulun Zhang, Xiaokang Yang

Therefore, it has become increasingly popular and critical to investigate how to perform the conversion and how to compensate for the information loss.

Quantization
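
To ground the survey's topic, here is a minimal illustrative sketch, written for this listing rather than taken from the paper, of the float-to-low-bit conversion and the information loss it incurs, using plain min-max uniform quantization:

```python
import numpy as np

def quantize_uniform(x, num_bits=4):
    """Min-max uniform quantization: map floats to `num_bits` integers,
    then dequantize. The round-trip error is the information loss that
    low-bit methods must compensate for."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = np.round(-x.min() / scale)
    q = np.clip(np.round(x / scale + zero_point), qmin, qmax)
    return (q - zero_point) * scale

x = np.random.randn(1024).astype(np.float32)
x_hat = quantize_uniform(x, num_bits=4)
print("4-bit round-trip MSE:", float(np.mean((x - x_hat) ** 2)))
```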

RGB-Event Fusion with Self-Attention for Collision Prediction

1 code implementation · 7 May 2025 · Pietro Bonazzi, Christian Vogt, Michael Jost, Haotong Qin, Lyes Khacef, Federico Paredes-Valles, Michele Magno

Notably, the event-based model outperforms the RGB model by 4% for position and 26% for time error at a similar computational cost, making it a competitive alternative.

Benchmarking · Computational Efficiency +4

An Empirical Study of Qwen3 Quantization

1 code implementation · 4 May 2025 · Xingyu Zheng, Yuye Li, Haoran Chu, Yue Feng, Xudong Ma, Jie Luo, Jinyang Guo, Haotong Qin, Michele Magno, Xianglong Liu

The Qwen series has emerged as a leading family of open-source Large Language Models (LLMs), demonstrating remarkable capabilities in natural language understanding tasks.

Natural Language Understanding · Quantization

Enhancing Autonomous Driving Systems with On-Board Deployed Large Language Models

1 code implementation · 15 Apr 2025 · Nicolas Baumann, Cheng Hu, Paviththiren Sivasothilingam, Haotong Qin, Lei Xie, Michele Magno, Luca Benini

Neural Networks (NNs) trained through supervised learning struggle with the edge-case scenarios common in real-world driving, since exhaustive datasets covering every edge case are intractable; this makes knowledge-driven approaches, akin to how humans intuitively detect unexpected driving behavior, a suitable complement to data-driven methods.

Autonomous Driving · Computational Efficiency +3

Q-MambaIR: Accurate Quantized Mamba for Efficient Image Restoration

no code implementations · 27 Mar 2025 · Yujie Chen, Haotong Qin, Zhang Zhang, Michele Magno, Luca Benini, Yawei Li

While low-bit quantization is an efficient model compression strategy for reducing size and accelerating IR tasks, SSM suffers substantial performance drops at ultra-low bit-widths (2-4 bits), primarily due to outliers that exacerbate quantization error.

Computational Efficiency · Image Restoration +4
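
The outlier problem the abstract mentions is easy to demonstrate numerically: a single extreme weight stretches the min-max range and coarsens the quantization step for every other value. A small self-contained illustration (ours, not from the paper):

```python
import numpy as np

def quant_error(x, num_bits):
    # Min-max uniform quantization, then mean squared reconstruction error.
    qmax = 2 ** num_bits - 1
    scale = (x.max() - x.min()) / qmax
    q = np.round((x - x.min()) / scale)
    return float(np.mean((q * scale + x.min() - x) ** 2))

rng = np.random.default_rng(0)
w = rng.normal(size=4096)
w_out = np.append(w, 40.0)  # one outlier stretches the quantization range

for bits in (2, 3, 4):
    print(f"{bits}-bit MSE: {quant_error(w, bits):.4f} "
          f"vs with outlier: {quant_error(w_out, bits):.4f}")
```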

TR-DQ: Time-Rotation Diffusion Quantization

no code implementations · 9 Mar 2025 · Yihua Shao, Deyang Lin, Fanhu Zeng, Minxi Yan, Muyang Zhang, Siyu Chen, Yuxuan Fan, Ziyang Yan, Haozhe Wang, Jingcai Guo, Yan Wang, Haotong Qin, Hao Tang

TR-DQ achieves state-of-the-art (SOTA) performance on image generation and video generation tasks, with a 1.38-1.89x speedup and 1.97-2.58x memory reduction in inference compared to existing quantization methods.

Image Generation · Quantization +1

QArtSR: Quantization via Reverse-Module and Timestep-Retraining in One-Step Diffusion based Image Super-Resolution

1 code implementation · 7 Mar 2025 · Libo Zhu, Haotong Qin, Kaicheng Yang, Wenbo Li, Yong Guo, Yulun Zhang, Susanto Rahardja, Xiaokang Yang

To further explore the potential of quantized OSDSR, we propose an efficient method, Quantization via reverse-module and timestep-retraining for OSDSR, named QArtSR.

Denoising · Image Super-Resolution +1

Q&C: When Quantization Meets Cache in Efficient Image Generation

no code implementations · 4 Mar 2025 · Xin Ding, Xin Li, Haotong Qin, Zhibo Chen

In this work, we take advantage of these two acceleration mechanisms and propose a hybrid acceleration method by tackling the above challenges, aiming to further improve the efficiency of DiTs while maintaining excellent generation capability.

Image Generation · Quantization

MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models

1 code implementation · 16 Dec 2024 · Weilun Feng, Haotong Qin, Chuanguang Yang, Zhulin An, Libo Huang, Boyu Diao, Fei Wang, Renshuai Tao, Yongjun Xu, Michele Magno

However, existing quantization methods for diffusion models still cause severe performance degradation, especially at extremely low bit-widths (2-4 bits).

Quantization

DynamicPAE: Generating Scene-Aware Physical Adversarial Examples in Real-Time

no code implementations · 11 Dec 2024 · Jin Hu, Xianglong Liu, Jiakai Wang, Junkai Zhang, Xianqi Yang, Haotong Qin, Yuqing Ma, Ke Xu

The key challenges in generating dynamic PAEs are exploring their patterns under noisy gradient feedback and adapting the attack to the agnostic nature of scenarios.

BiDM: Pushing the Limit of Quantization for Diffusion Models

1 code implementation · 8 Dec 2024 · Xingyu Zheng, Xianglong Liu, Yichen Bian, Xudong Ma, Yulun Zhang, Jiakai Wang, Jinyang Guo, Haotong Qin

Diffusion models (DMs) have been significantly developed and widely used in various applications due to their excellent generative qualities.

Binarization · Image Generation +2

BiDense: Binarization for Dense Prediction

1 code implementation · 15 Nov 2024 · Rui Yin, Haotong Qin, Yulun Zhang, Wenbo Li, Yong Guo, Jianjun Zhu, Cheng Wang, Biao Jia

BiDense incorporates two key techniques: the Distribution-adaptive Binarizer (DAB) and the Channel-adaptive Full-precision Bypass (CFB).

Binarization · Prediction

ODDN: Addressing Unpaired Data Challenges in Open-World Deepfake Detection on Online Social Networks

1 code implementation · 24 Oct 2024 · Renshuai Tao, Manyi Le, Chuangchuang Tan, Huan Liu, Haotong Qin, Yao Zhao

To overcome this issue, we propose a novel approach named the open-world deepfake detection network (ODDN), which comprises two core modules: open-world data aggregation (ODA) and compression-discard gradient correction (CGC).

DeepFake Detection · Face Swapping

ARB-LLM: Alternating Refined Binarizations for Large Language Models

1 code implementation · 4 Oct 2024 · Zhiteng Li, Xianglong Yan, Tianao Zhang, Haotong Qin, Dong Xie, Jiang Tian, Zhongchao Shi, Linghe Kong, Yulun Zhang, Xiaokang Yang

However, current binarization methods struggle to narrow the distribution gap between binarized and full-precision weights, while also overlooking the column deviation in LLM weight distribution.

Binarization · Quantization
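
For context, standard scaled binarization approximates W by alpha * sign(W) with alpha = mean(|W|). Using one alpha per column is a simple illustration of why the column deviation mentioned above matters; this sketch is the baseline idea, not ARB-LLM's alternating refinement:

```python
import torch

def binarize(W, per_column=False):
    # Scaled binarization: W ~= alpha * sign(W), alpha = mean(|W|).
    if per_column:
        alpha = W.abs().mean(dim=0, keepdim=True)  # one scale per column
    else:
        alpha = W.abs().mean()                     # one scale for the whole matrix
    return alpha * W.sign()

# Columns with very different magnitudes, mimicking column deviation.
W = torch.randn(512, 512) * torch.linspace(0.1, 2.0, 512)
for per_column in (False, True):
    mse = (W - binarize(W, per_column)).pow(2).mean()
    print(f"per_column={per_column}: MSE {float(mse):.4f}")
```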

A Survey of Low-bit Large Language Models: Basics, Systems, and Algorithms

no code implementations · 25 Sep 2024 · Ruihao Gong, Yifu Ding, Zining Wang, Chengtao Lv, Xingyu Zheng, Jinyang Du, Haotong Qin, Jinyang Guo, Michele Magno, Xianglong Liu

Large language models (LLMs) have achieved remarkable advancements in natural language processing, showcasing exceptional performance across various tasks.

Quantization

2DQuant: Low-bit Post-Training Quantization for Image Super-Resolution

1 code implementation · 10 Jun 2024 · Kai Liu, Haotong Qin, Yong Guo, Xin Yuan, Linghe Kong, Guihai Chen, Yulun Zhang

Low-bit quantization has become widespread for compressing image super-resolution (SR) models for edge deployment, which allows advanced SR models to enjoy compact low-bit parameters and efficient integer/bitwise constructions for storage compression and inference acceleration, respectively.

Image Super-Resolution · Quantization

Binarized Diffusion Model for Image Super-Resolution

1 code implementation · 9 Jun 2024 · Zheng Chen, Haotong Qin, Yong Guo, Xiongfei Su, Xin Yuan, Linghe Kong, Yulun Zhang

Nonetheless, due to the model structure and the multi-step iterative attribute of DMs, existing binarization methods result in significant performance degradation.

Attribute · Binarization +3

SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models

1 code implementation · 23 May 2024 · Wei Huang, Haotong Qin, Yangdong Liu, Yawei Li, Xianglong Liu, Luca Benini, Michele Magno, Xiaojuan Qi

Specifically, the proposed SliM-LLM mainly relies on two novel techniques: (1) Salience-Determined Bit Allocation utilizes the clustering characteristics of salience distribution to allocate the bit-widths of each group, increasing the accuracy of quantized LLMs and maintaining the inference efficiency; (2) Salience-Weighted Quantizer Calibration optimizes the parameters of the quantizer by considering the element-wise salience within the group, balancing the maintenance of salient information and minimization of errors.

Natural Language Understanding · Quantization
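
A toy sketch of the salience-driven idea: score weight groups with a simple salience proxy (here |W| scaled by mean activation magnitude, our assumption) and move bits from low-salience to high-salience groups under a fixed average budget. SliM-LLM's actual allocation and calibration are more sophisticated:

```python
import torch

def allocate_bits(weight, acts, group_size=128, avg_bits=3):
    # Per-element salience proxy, averaged within each column group.
    salience = weight.abs() * acts.abs().mean(dim=0)
    group_sal = salience.reshape(weight.shape[0], -1, group_size).mean(dim=-1)
    flat = group_sal.flatten()
    order = flat.argsort(descending=True)
    bits = torch.full_like(flat, avg_bits, dtype=torch.long)
    k = flat.numel() // 4
    bits[order[:k]] += 1   # most salient quarter gets one extra bit
    bits[order[-k:]] -= 1  # least salient quarter gives one up (budget preserved)
    return bits.reshape(group_sal.shape)

W, X = torch.randn(256, 1024), torch.randn(32, 1024)
bits = allocate_bits(W, X)
print("average bit-width:", float(bits.float().mean()))  # stays at the 3-bit budget
```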

An empirical study of LLaMA3 quantization: from LLMs to MLLMs

2 code implementations · 22 Apr 2024 · Wei Huang, Xingyu Zheng, Xudong Ma, Haotong Qin, Chengtao Lv, Hong Chen, Jie Luo, Xiaojuan Qi, Xianglong Liu, Michele Magno

To uncover the capabilities of low-bit quantized MLLMs, we assessed the performance of the LLaMA3-based LLaVA-Next-8B model under ultra-low bit-widths (2-4 bits) with post-training quantization methods.

Language Modelling · Large Language Model +2

BinaryDM: Accurate Weight Binarization for Efficient Diffusion Models

1 code implementation · 8 Apr 2024 · Xingyu Zheng, Xianglong Liu, Haotong Qin, Xudong Ma, Mingyuan Zhang, Haojie Hao, Jiakai Wang, Zixiang Zhao, Jinyang Guo, Michele Magno

From the optimization perspective, a Low-rank Representation Mimicking (LRM) is applied to assist the optimization of binarized DMs.

Binarization · Quantization

Graph Construction with Flexible Nodes for Traffic Demand Prediction

no code implementations · 1 Mar 2024 · Jinyan Hou, Shan Liu, Ya Zhang, Haotong Qin

To tackle these challenges, this paper introduces a novel graph construction method tailored to the free-floating traffic mode.

Clustering · Computational Efficiency +3

DB-LLM: Accurate Dual-Binarization for Efficient LLMs

no code implementations · 19 Feb 2024 · Hong Chen, Chengtao Lv, Liang Ding, Haotong Qin, Xiabin Zhou, Yifu Ding, Xuebo Liu, Min Zhang, Jinyang Guo, Xianglong Liu, DaCheng Tao

Large language models (LLMs) have significantly advanced the field of natural language processing, while the expensive memory and computation consumption impede their practical deployment.

Binarization · Computational Efficiency +1

Accurate LoRA-Finetuning Quantization of LLMs via Information Retention

1 code implementation · 8 Feb 2024 · Haotong Qin, Xudong Ma, Xingyu Zheng, Xiaoyang Li, Yang Zhang, Shouda Liu, Jie Luo, Xianglong Liu, Michele Magno

This paper proposes IR-QLoRA, a novel method that pushes LoRA-finetuned quantized LLMs to high accuracy through information retention.

MMLU · Quantization

BiLLM: Pushing the Limit of Post-Training Quantization for LLMs

1 code implementation · 6 Feb 2024 · Wei Huang, Yangdong Liu, Haotong Qin, Ying Li, Shiming Zhang, Xianglong Liu, Michele Magno, Xiaojuan Qi

Pretrained large language models (LLMs) exhibit exceptional general language processing capabilities but come with significant demands on memory and computational resources.

Binarization · Quantization

Image Fusion via Vision-Language Model

3 code implementations · 3 Feb 2024 · Zixiang Zhao, Lilun Deng, Haowen Bai, Yukun Cui, Zhipeng Zhang, Yulun Zhang, Haotong Qin, Dongdong Chen, Jiangshe Zhang, Peng Wang, Luc van Gool

Therefore, we introduce a novel fusion paradigm named image Fusion via vIsion-Language Model (FILM), for the first time, utilizing explicit textual information from source images to guide the fusion process.

Decoder · Language Modeling +3

RdimKD: Generic Distillation Paradigm by Dimensionality Reduction

no code implementations · 14 Dec 2023 · Yi Guo, Yiqian He, Xiaoyang Li, Haotong Qin, Van Tung Pham, Yang Zhang, Shouda Liu

Knowledge Distillation (KD) emerges as one of the most promising compression technologies to run advanced deep neural networks on resource-limited devices.

Dimensionality Reduction · Knowledge Distillation

BinaryHPE: 3D Human Pose and Shape Estimation via Binarization

1 code implementation · 24 Nov 2023 · Zhiteng Li, Yulun Zhang, Jing Lin, Haotong Qin, Jinjin Gu, Xin Yuan, Linghe Kong, Xiaokang Yang

In this work, we propose BinaryHPE, a novel binarization method designed to estimate the 3D human body, face, and hands parameters efficiently.

3D human pose and shape estimation · Binarization +2

On-Chip Hardware-Aware Quantization for Mixed Precision Neural Networks

no code implementations · 5 Sep 2023 · Wei Huang, Haotong Qin, Yangdong Liu, Jingzhuo Liang, Yulun Zhang, Ying Li, Xianglong Liu

This leads to a non-negligible gap between estimated efficiency metrics and actual hardware performance, which leaves quantized models far from optimal accuracy and efficiency and forces the quantization process to rely on additional high-performance devices.

Quantization

RobustMQ: Benchmarking Robustness of Quantized Models

no code implementations · 4 Aug 2023 · Yisong Xiao, Aishan Liu, Tianyuan Zhang, Haotong Qin, Jinyang Guo, Xianglong Liu

Quantization has emerged as an essential technique for deploying deep neural networks (DNNs) on devices with limited resources.

Adversarial Robustness · Benchmarking +1

How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges

1 code implementation · 27 Jul 2023 · Haotong Qin, Ge-Peng Ji, Salman Khan, Deng-Ping Fan, Fahad Shahbaz Khan, Luc van Gool

Google's Bard has emerged as a formidable competitor to OpenAI's ChatGPT in the field of conversational AI.

Benchmarking the Robustness of Quantized Models

no code implementations · 8 Apr 2023 · Yisong Xiao, Tianyuan Zhang, Shunchang Liu, Haotong Qin

To address this gap, we thoroughly evaluated the robustness of quantized models against various noises (adversarial attacks, natural corruptions, and systematic noises) on ImageNet.

Benchmarking · Quantization

Towards Accurate Post-Training Quantization for Vision Transformer

no code implementations · 25 Mar 2023 · Yifu Ding, Haotong Qin, Qinghua Yan, Zhenhua Chai, Junjie Liu, Xiaolin Wei, Xianglong Liu

We find the main reasons are that (1) the existing calibration metric is inaccurate in measuring the quantization influence of extremely low-bit representation, and (2) the existing quantization paradigm is unfriendly to the power-law distribution of Softmax.

Model Compression · Quantization
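
The second observation is easy to reproduce: attention softmax values are mostly tiny with a few dominant entries, so a uniform quantizer collapses most of the probability mass to zero, while a log-domain quantizer keeps it distinguishable. The log2 quantizer below is purely illustrative, not the paper's proposed fix:

```python
import torch

torch.manual_seed(0)
attn = torch.softmax(torch.randn(64, 64) * 3, dim=-1).flatten()  # power-law-like

def uniform_q(x, bits=4):
    scale = x.max() / (2 ** bits - 1)
    return torch.round(x / scale) * scale  # values below scale/2 become 0

def log2_q(x, bits=4):
    # Quantize exponents instead: x ~= 2^(-k) with k in [0, 2^bits - 1].
    k = torch.round(-torch.log2(x.clamp(min=1e-12))).clamp(0, 2 ** bits - 1)
    return 2.0 ** (-k)

for name, q in (("uniform", uniform_q), ("log2", log2_q)):
    zeroed = attn[q(attn) == 0].sum()  # probability mass destroyed by quantization
    print(f"{name}: zeroed softmax mass = {float(zeroed):.3f}")
```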

BiBench: Benchmarking and Analyzing Network Binarization

1 code implementation · 26 Jan 2023 · Haotong Qin, Mingyuan Zhang, Yifu Ding, Aoyu Li, Zhongang Cai, Ziwei Liu, Fisher Yu, Xianglong Liu

Network binarization emerges as one of the most promising compression approaches offering extraordinary computation and memory savings by minimizing the bit-width.

Benchmarking · Binarization

BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance

1 code implementation · 13 Nov 2022 · Haotong Qin, Xudong Ma, Yifu Ding, Xiaoyang Li, Yang Zhang, Zejun Ma, Jiakai Wang, Jie Luo, Xianglong Liu

We highlight that, benefiting from the compact architecture and optimized hardware kernel, BiFSMNv2 can achieve an impressive 25.1x speedup and 20.2x storage saving on edge hardware.

Binarization · Keyword Spotting

Defensive Patches for Robust Recognition in the Physical World

1 code implementation · CVPR 2022 · Jiakai Wang, Zixin Yin, Pengfei Hu, Aishan Liu, Renshuai Tao, Haotong Qin, Xianglong Liu, DaCheng Tao

For the generalization against diverse noises, we inject class-specific identifiable patterns into a confined local patch prior, so that defensive patches could preserve more recognizable features towards specific classes, leading models for better recognition under noises.
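
A minimal sketch of the defensive-patch idea under stated assumptions: a frozen classifier `model`, inputs roughly in [-1, 1], the pattern confined to a corner patch, and Gaussian noise standing in for real-world corruptions. The paper's class-specific patch prior and training recipe differ:

```python
import torch
import torch.nn.functional as F

def train_defensive_patch(model, images, labels, steps=200, size=16, lr=0.05):
    patch = torch.zeros(1, 3, size, size, requires_grad=True)
    opt = torch.optim.Adam([patch], lr=lr)
    for _ in range(steps):
        x = images.clone()
        x[:, :, :size, :size] = patch.tanh()      # confine the pattern to a local patch
        x = x + 0.05 * torch.randn_like(x)        # noise proxy for physical corruptions
        loss = F.cross_entropy(model(x), labels)  # raise true-class recognizability
        opt.zero_grad()
        loss.backward()
        opt.step()
    return patch.detach().tanh()
```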

BiBERT: Accurate Fully Binarized BERT

1 code implementation · ICLR 2022 · Haotong Qin, Yifu Ding, Mingyuan Zhang, Qinghua Yan, Aishan Liu, Qingqing Dang, Ziwei Liu, Xianglong Liu

The large pre-trained BERT has achieved remarkable performance on Natural Language Processing (NLP) tasks but is expensive in both computation and memory.

Binarization

BiFSMN: Binary Neural Network for Keyword Spotting

1 code implementation · 14 Feb 2022 · Haotong Qin, Xudong Ma, Yifu Ding, Xiaoyang Li, Yang Zhang, Yao Tian, Zejun Ma, Jie Luo, Xianglong Liu

Then, to allow instant and adaptive accuracy-efficiency trade-offs at runtime, we also propose a Thinnable Binarization Architecture that further unlocks the acceleration potential of the binarized network from the topology perspective.

Binarization · Keyword Spotting
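
The runtime trade-off can be pictured as a depth-thinnable stack whose layers are subsampled at inference while sharing one set of weights. A structural sketch of that idea, not BiFSMN's exact binarized architecture:

```python
import torch
import torch.nn as nn

class ThinnableStack(nn.Module):
    def __init__(self, dim=64, depth=8):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(depth))

    def forward(self, x, stride: int = 1):
        # stride 1/2/4 executes full/half/quarter depth with the same weights.
        for block in self.blocks[::stride]:
            x = x + block(x)  # residual connections keep thinner paths stable
        return x

net = ThinnableStack()
x = torch.randn(2, 64)
full, half = net(x, stride=1), net(x, stride=2)  # instant accuracy-efficiency trade-off
```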

Hardware-friendly Deep Learning by Network Quantization and Binarization

no code implementations · 1 Dec 2021 · Haotong Qin

Quantization is emerging as an efficient approach to promote hardware-friendly deep learning and run deep neural networks on resource-limited hardware.

Binarization · Deep Learning +1

Distribution-sensitive Information Retention for Accurate Binary Neural Network

no code implementations · 25 Sep 2021 · Haotong Qin, Xiangguo Zhang, Ruihao Gong, Yifu Ding, Yi Xu, Xianglong Liu

We present a novel Distribution-sensitive Information Retention Network (DIR-Net) that retains the information in the forward and backward propagation by improving internal propagation and introducing external representations.

Binarization · Image Classification +1

Diverse Sample Generation: Pushing the Limit of Generative Data-free Quantization

1 code implementation · 1 Sep 2021 · Haotong Qin, Yifu Ding, Xiangguo Zhang, Jiakai Wang, Xianglong Liu, Jiwen Lu

We first give a theoretical analysis showing that the diversity of synthetic samples is crucial for data-free quantization, whereas in existing approaches the synthetic data, fully constrained by BN statistics, empirically exhibits severe homogenization at both the distribution and sample levels.

Data Free Quantization · Image Classification
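
The BN-statistics constraint referenced above is the standard data-free generation objective: align the batch statistics produced by synthetic inputs with each BatchNorm layer's running statistics. A minimal PyTorch sketch of that baseline objective; the paper's point is that using it alone homogenizes the samples:

```python
import torch
import torch.nn as nn

def bn_statistics_loss(model, x):
    losses, hooks = [], []

    def hook(module, inputs, output):
        inp = inputs[0]
        mean, var = inp.mean(dim=(0, 2, 3)), inp.var(dim=(0, 2, 3))
        losses.append((mean - module.running_mean).pow(2).mean()
                      + (var - module.running_var).pow(2).mean())

    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            hooks.append(m.register_forward_hook(hook))
    model(x)
    for h in hooks:
        h.remove()
    return sum(losses)

# Usage sketch: push random noise toward the model's stored BN statistics.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.BatchNorm2d(8), nn.ReLU()).eval()
x = torch.randn(16, 3, 32, 32, requires_grad=True)
bn_statistics_loss(model, x).backward()  # gradients on x drive the sample update
```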

Towards Real-world X-ray Security Inspection: A High-Quality Benchmark and Lateral Inhibition Module for Prohibited Items Detection

1 code implementation · ICCV 2021 · Renshuai Tao, Yanlu Wei, Xiangjian Jiang, Hainan Li, Haotong Qin, Jiakai Wang, Yuqing Ma, Libo Zhang, Xianglong Liu

In this work, we first present a High-quality X-ray (HiXray) security inspection image dataset, which contains 102,928 common prohibited items in 8 categories.

Diversifying Sample Generation for Accurate Data-Free Quantization

no code implementations · CVPR 2021 · Xiangguo Zhang, Haotong Qin, Yifu Ding, Ruihao Gong, Qinghua Yan, Renshuai Tao, Yuhang Li, Fengwei Yu, Xianglong Liu

Unfortunately, we find that in practice, the synthetic data identically constrained by BN statistics suffers serious homogenization at both the distribution and sample levels, which further causes a significant performance drop of the quantized model.

Data Free Quantization · Image Classification

Over-sampling De-occlusion Attention Network for Prohibited Items Detection in Noisy X-ray Images

1 code implementation · 1 Mar 2021 · Renshuai Tao, Yanlu Wei, Hainan Li, Aishan Liu, Yifu Ding, Haotong Qin, Xianglong Liu

The images are gathered from an airport and these prohibited items are annotated manually by professional inspectors, which can be used as a benchmark for model training and further facilitate future research.

object-detection · Object Detection

Towards Defending Multiple $\ell_p$-norm Bounded Adversarial Perturbations via Gated Batch Normalization

1 code implementation · 3 Dec 2020 · Aishan Liu, Shiyu Tang, Xinyun Chen, Lei Huang, Haotong Qin, Xianglong Liu, DaCheng Tao

In this paper, we observe that different $\ell_p$ bounded adversarial perturbations induce different statistical properties that can be separated and characterized by the statistics of Batch Normalization (BN).
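
A simplified sketch of that separation: keep one BatchNorm branch per perturbation domain so each accumulates its own statistics. The gate here is an oracle domain index for clarity, whereas the paper's gated sub-network learns to route inputs:

```python
import torch
import torch.nn as nn

class GatedBatchNorm2d(nn.Module):
    """One BN branch per domain (e.g., clean, l_inf, l_2, l_1 perturbations)."""
    def __init__(self, num_features, num_domains=4):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.BatchNorm2d(num_features) for _ in range(num_domains))

    def forward(self, x, domain: int):
        # Each branch normalizes with statistics from its own domain only.
        return self.branches[domain](x)

gbn = GatedBatchNorm2d(16, num_domains=2)
x_clean = torch.randn(8, 16, 32, 32)
x_adv = x_clean + 0.5 * torch.randn_like(x_clean)  # stand-in for an l_inf batch
y0, y1 = gbn(x_clean, domain=0), gbn(x_adv, domain=1)
```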

BiPointNet: Binary Neural Network for Point Clouds

1 code implementation · ICLR 2021 · Haotong Qin, Zhongang Cai, Mingyuan Zhang, Yifu Ding, Haiyu Zhao, Shuai Yi, Xianglong Liu, Hao Su

To alleviate the resource constraint for real-time point cloud applications that run on edge devices, in this paper we present BiPointNet, the first model binarization approach for efficient deep learning on point clouds.

Binarization

Binary Neural Networks: A Survey

2 code implementations · 31 Mar 2020 · Haotong Qin, Ruihao Gong, Xianglong Liu, Xiao Bai, Jingkuan Song, Nicu Sebe

The binary neural network, which greatly reduces storage and computation, serves as a promising technique for deploying deep models on resource-limited devices.

Binarization · Image Classification +5

Forward and Backward Information Retention for Accurate Binary Neural Networks

2 code implementations · CVPR 2020 · Haotong Qin, Ruihao Gong, Xianglong Liu, Mingzhu Shen, Ziran Wei, Fengwei Yu, Jingkuan Song

Our empirical study indicates that the quantization brings information loss in both forward and backward propagation, which is the bottleneck of training accurate binary neural networks.

Binarization · Neural Network Compression +1
