Search Results for author: Zhikai Li

Found 19 papers, 8 papers with code

TTAQ: Towards Stable Post-training Quantization in Continuous Domain Adaptation

no code implementations 13 Dec 2024 Junrui Xiao, Zhikai Li, Lianwei Yang, Yiduo Mei, Qingyi Gu

Post-training quantization (PTQ) reduces excessive hardware cost by quantizing full-precision models into lower bit representations on a tiny calibration set, without retraining.

Quantization Test-time Adaptation
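
As context for the calibration step mentioned in this abstract, a minimal post-training quantization sketch is shown below: a scale and zero-point are derived from a tiny calibration set with a simple min-max rule, and no retraining is involved. The min-max rule and function names are illustrative assumptions, not the TTAQ method itself.

```python
import numpy as np

def calibrate_minmax(calib_activations, n_bits=8):
    """Derive quantization parameters from a tiny calibration set (illustrative min-max rule)."""
    qmin, qmax = 0, 2 ** n_bits - 1
    x_min = min(a.min() for a in calib_activations)
    x_max = max(a.max() for a in calib_activations)
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    return scale, zero_point

def quantize(x, scale, zero_point, n_bits=8):
    """Map full-precision values to low-bit integers, then back to simulate quantization error."""
    q = np.clip(np.round(x / scale) + zero_point, 0, 2 ** n_bits - 1)
    return (q - zero_point) * scale  # de-quantized ("fake-quantized") tensor

# Usage: a handful of batches stands in for the tiny calibration set.
calib = [np.random.randn(32, 128) for _ in range(4)]
scale, zp = calibrate_minmax(calib)
x_q = quantize(np.random.randn(32, 128), scale, zp)
```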

A Stitch in Time Saves Nine: Small VLM is a Precise Guidance for Accelerating Large VLMs

1 code implementation 4 Dec 2024 Wangbo Zhao, Yizeng Han, Jiasheng Tang, Zhikai Li, Yibing Song, Kai Wang, Zhangyang Wang, Yang You

Vision-language models (VLMs) have shown remarkable success across various multi-modal tasks, yet large VLMs encounter significant efficiency challenges due to processing numerous visual tokens.

Visual Question Answering

Privacy-Preserving SAM Quantization for Efficient Edge Intelligence in Healthcare

no code implementations 14 Sep 2024 Zhikai Li, Jing Zhang, Qingyi Gu

In this paper, we propose a data-free quantization framework for SAM, called DFQ-SAM, which learns and calibrates quantization parameters without any original data, thus effectively preserving data privacy during model compression.

Data Free Quantization Image Segmentation +3

MGRQ: Post-Training Quantization For Vision Transformer With Mixed Granularity Reconstruction

no code implementations 13 Jun 2024 Lianwei Yang, Zhikai Li, Junrui Xiao, Haisong Gong, Qingyi Gu

Extra-Block Global Supervision considers the relationship between block outputs and the model's output, aiding block-wise reconstruction through global supervision.

Quantization
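
The Extra-Block Global Supervision described above can be pictured as a block-wise reconstruction loss augmented with a term tied to the model's final output. A minimal sketch follows, assuming an MSE local term, a KL-based global term, and a weighting factor alpha; these choices are illustrative, not MGRQ's exact formulation.

```python
import torch
import torch.nn.functional as F

def mixed_granularity_loss(q_block_out, fp_block_out, q_logits, fp_logits, alpha=0.5):
    """Block-wise reconstruction (local) plus supervision from the model output (global).

    alpha and the KL-based global term are illustrative choices, not the paper's exact loss.
    """
    local = F.mse_loss(q_block_out, fp_block_out)              # intra-block reconstruction
    global_ = F.kl_div(F.log_softmax(q_logits, dim=-1),
                       F.softmax(fp_logits, dim=-1),
                       reduction="batchmean")                   # extra-block global supervision
    return local + alpha * global_
```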

ML2SC: Deploying Machine Learning Models as Smart Contracts on the Blockchain

no code implementations 28 Mar 2024 Zhikai Li, Steve Vott, Bhaskar Krishnamachari

Finally, model inference can also be performed through a function call that provides the input.

Math

LLM Inference Unveiled: Survey and Roofline Model Insights

2 code implementations 26 Feb 2024 Zhihang Yuan, Yuzhang Shang, Yang Zhou, Zhen Dong, Zhe Zhou, Chenhao Xue, Bingzhe Wu, Zhikai Li, Qingyi Gu, Yong Jae Lee, Yan Yan, Beidi Chen, Guangyu Sun, Kurt Keutzer

Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on the roofline model for systematic analysis of LLM inference techniques.

Knowledge Distillation Language Modelling +5
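
The roofline model referenced here bounds attainable throughput by the lesser of peak compute and memory bandwidth times arithmetic intensity. A minimal sketch with placeholder hardware numbers (not figures from the paper):

```python
def roofline_flops(peak_flops, mem_bandwidth, arithmetic_intensity):
    """Attainable performance = min(peak compute, memory bandwidth * arithmetic intensity)."""
    return min(peak_flops, mem_bandwidth * arithmetic_intensity)

# Placeholder numbers (not measurements from the survey):
# ~300 TFLOPS peak compute, ~2 TB/s memory bandwidth, decode-phase intensity of ~1 FLOP/byte.
attainable = roofline_flops(peak_flops=300e12, mem_bandwidth=2e12, arithmetic_intensity=1.0)
print(f"attainable: {attainable / 1e12:.1f} TFLOPS (memory-bound at this intensity)")
```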

RepQuant: Towards Accurate Post-Training Quantization of Large Transformer Models via Scale Reparameterization

no code implementations 8 Feb 2024 Zhikai Li, Xuewen Liu, Jing Zhang, Qingyi Gu

In particular, for the former, we introduce a learnable per-channel dual clipping scheme, which is designed to efficiently identify outliers in the unbalanced activations with fine granularity.

Quantization
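
A learnable per-channel dual clipping scheme can be sketched as clamping each channel between learnable lower and upper bounds initialized from calibration statistics. The parameterization below is an assumption for illustration rather than RepQuant's exact design.

```python
import torch
import torch.nn as nn

class PerChannelDualClip(nn.Module):
    """Learnable per-channel lower/upper clipping bounds (illustrative sketch)."""

    def __init__(self, channel_min: torch.Tensor, channel_max: torch.Tensor):
        super().__init__()
        # Bounds start from per-channel calibration statistics and are refined by training.
        self.lower = nn.Parameter(channel_min.clone())
        self.upper = nn.Parameter(channel_max.clone())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (..., channels); clip channel-wise outliers before quantizing activations.
        return torch.minimum(torch.maximum(x, self.lower), self.upper)
```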

An Improved Grey Wolf Optimization Algorithm for Heart Disease Prediction

no code implementations 22 Jan 2024 Sihan Niu, Yifan Zhou, Zhikai Li, Shuyao Huang, Yujun Zhou

This paper presents a unique solution to challenges in medical image processing by incorporating an adaptive curve grey wolf optimization (ACGWO) algorithm into neural network backpropagation.

Disease Prediction Diversity
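
For context, a vanilla grey wolf optimization step moves each candidate toward the three current best wolves (alpha, beta, delta). The sketch below shows only the standard update; the paper's adaptive-curve modification (ACGWO) is not reproduced here.

```python
import numpy as np

def gwo_step(positions, fitness, a):
    """One vanilla grey wolf optimization step (not the paper's adaptive-curve variant).

    positions: (n_wolves, dim) candidate solutions; a: control scalar decreasing from 2 to 0.
    """
    order = np.argsort(fitness)                  # lower fitness = better
    alpha, beta, delta = positions[order[:3]]
    new_positions = np.empty_like(positions)
    for i, x in enumerate(positions):
        candidates = []
        for leader in (alpha, beta, delta):
            r1, r2 = np.random.rand(x.size), np.random.rand(x.size)
            A, C = 2 * a * r1 - a, 2 * r2
            d = np.abs(C * leader - x)
            candidates.append(leader - A * d)
        new_positions[i] = np.mean(candidates, axis=0)  # move toward the three leaders
    return new_positions
```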

RTA-Former: Reverse Transformer Attention for Polyp Segmentation

1 code implementation 22 Jan 2024 Zhikai Li, Murong Yi, Ali Uneri, Sihan Niu, Craig Jones

Polyp segmentation is a key aspect of colorectal cancer prevention, enabling early detection and guiding subsequent treatments.

Decoder Segmentation

EDA-DM: Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models

1 code implementation 9 Jan 2024 Xuewen Liu, Zhikai Li, Junrui Xiao, Qingyi Gu

Unfortunately, we find that due to the highly dynamic distribution of activations in different denoising steps, existing PTQ methods for diffusion models suffer from distribution mismatch issues at both calibration sample level and reconstruction output level, which makes the performance far from satisfactory, especially in low-bit cases.

Denoising Image Generation +2

QFT: Quantized Full-parameter Tuning of LLMs with Affordable Resources

no code implementations 11 Oct 2023 Zhikai Li, Xiaoxuan Liu, Banghua Zhu, Zhen Dong, Qingyi Gu, Kurt Keutzer

Large Language Models (LLMs) have showcased remarkable impacts across a wide spectrum of natural language processing tasks.

parameter-efficient fine-tuning Quantization

BinaryViT: Towards Efficient and Accurate Binary Vision Transformers

no code implementations 24 May 2023 Junrui Xiao, Zhikai Li, Lianwei Yang, Qingyi Gu

In this paper, we first argue empirically that the severe performance degradation is mainly caused by the weight oscillation in the binarization training and the information distortion in the activation of ViTs.

Binarization Quantization

Patch-wise Mixed-Precision Quantization of Vision Transformer

no code implementations 11 May 2023 Junrui Xiao, Zhikai Li, Lianwei Yang, Qingyi Gu

As emerging hardware begins to support mixed bit-width arithmetic computation, mixed-precision quantization is widely used to reduce the complexity of neural networks.

Quantization

RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers

1 code implementation ICCV 2023 Zhikai Li, Junrui Xiao, Lianwei Yang, Qingyi Gu

Post-training quantization (PTQ), which only requires a tiny dataset for calibration without end-to-end retraining, is a light and practical model compression technique.

Model Compression Quantization

PSAQ-ViT V2: Towards Accurate and General Data-Free Quantization for Vision Transformers

1 code implementation 13 Sep 2022 Zhikai Li, Mengjuan Chen, Junrui Xiao, Qingyi Gu

In this paper, we propose PSAQ-ViT V2, a more accurate and general data-free quantization framework for ViTs, built on top of PSAQ-ViT.

Data Free Quantization Image Classification +4

I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference

1 code implementation ICCV 2023 Zhikai Li, Qingyi Gu

In this paper, we propose I-ViT, an integer-only quantization scheme for ViTs, to enable ViTs to perform the entire computational graph of inference with integer arithmetic and bit-shifting, and without any floating-point arithmetic.

Quantization
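
Integer-only inference of this kind typically replaces floating-point rescaling with a fixed-point multiplier and a bit shift. The dyadic requantization below is a common pattern for such schemes and an illustrative stand-in, not I-ViT's actual kernels.

```python
import numpy as np

def requantize_int(acc_int32, scale_fp, shift_bits=16):
    """Rescale int32 accumulators with an integer multiplier and a bit shift (no floats at run time).

    scale_fp is folded offline into a fixed-point multiplier; only integer ops remain online.
    """
    multiplier = int(round(scale_fp * (1 << shift_bits)))          # offline: float -> fixed point
    out = (acc_int32.astype(np.int64) * multiplier) >> shift_bits  # online: multiply + shift
    return np.clip(out, -128, 127).astype(np.int8)

# Usage: int8 matmul accumulates in int32, then requantizes back to int8.
w8 = np.random.randint(-128, 128, size=(64, 64), dtype=np.int8)
x8 = np.random.randint(-128, 128, size=(16, 64), dtype=np.int8)
acc = x8.astype(np.int32) @ w8.astype(np.int32).T
y8 = requantize_int(acc, scale_fp=0.0173)  # scale value is a placeholder
```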

Patch Similarity Aware Data-Free Quantization for Vision Transformers

1 code implementation 4 Mar 2022 Zhikai Li, Liping Ma, Mengjuan Chen, Junrui Xiao, Qingyi Gu

The above insights guide us to design a relative value metric to optimize the Gaussian noise to approximate the real images, which are then utilized to calibrate the quantization parameters.

Data Free Quantization
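
The calibration-data synthesis described above can be sketched as gradient ascent on a surrogate metric starting from Gaussian noise, with the resulting images then used to calibrate quantization parameters. In the sketch, metric_fn is a hypothetical stand-in for the paper's relative value metric, and the optimizer settings are illustrative.

```python
import torch

def synthesize_calibration_images(model, metric_fn, steps=500, lr=0.05, shape=(8, 3, 224, 224)):
    """Optimize Gaussian noise toward 'real-looking' inputs under a surrogate metric (illustrative)."""
    images = torch.randn(shape, requires_grad=True)
    opt = torch.optim.Adam([images], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -metric_fn(model(images))  # ascend the metric by descending its negative
        loss.backward()
        opt.step()
    return images.detach()                # afterwards used to calibrate quantization parameters
```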
