Search Results for author: Yuzhang Shang

Found 18 papers, 11 papers with code

LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models

no code implementations · 22 Mar 2024 · Yuzhang Shang, Mu Cai, Bingxin Xu, Yong Jae Lee, Yan Yan

Based on this, we propose PruMerge, a novel adaptive visual token reduction approach that substantially reduces the number of visual tokens while maintaining comparable model performance.

Language Modelling Large Language Model +3
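
The excerpt leaves the mechanism implicit; below is a minimal sketch, assuming one common reading of prune-then-merge token reduction: rank visual tokens by the attention they receive from the [CLS] token, keep the top fraction, and fold each discarded token into its most similar kept token. The function names and the selection rule are illustrative, not the authors' exact algorithm.

```python
import torch
import torch.nn.functional as F

def prune_and_merge(tokens, cls_attn, keep_ratio=0.25):
    """Keep the visual tokens most attended by [CLS], then merge each
    pruned token into its most similar kept token (illustrative only).

    tokens:   (N, D) visual token embeddings
    cls_attn: (N,)   attention weight from [CLS] to each token
    """
    n_keep = max(1, int(tokens.size(0) * keep_ratio))
    keep_idx = cls_attn.topk(n_keep).indices
    keep_set = set(keep_idx.tolist())
    drop_idx = torch.tensor([i for i in range(tokens.size(0))
                             if i not in keep_set], dtype=torch.long)

    kept, dropped = tokens[keep_idx], tokens[drop_idx]
    # Assign each pruned token to its nearest kept token by cosine similarity.
    sim = F.normalize(dropped, dim=-1) @ F.normalize(kept, dim=-1).T
    assign = sim.argmax(dim=-1)                      # (n_drop,)
    # Merge: average each kept token with the pruned tokens assigned to it.
    merged = kept.clone()
    for k in range(n_keep):
        group = dropped[assign == k]
        if group.numel():
            merged[k] = torch.cat([kept[k:k + 1], group]).mean(dim=0)
    return merged                                    # (n_keep, D)
```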

FBPT: A Fully Binary Point Transformer

no code implementations · 15 Mar 2024 · Zhixing Hou, Yuzhang Shang, Yan Yan

This paper presents a novel Fully Binary Point Cloud Transformer (FBPT) model which has the potential to be widely applied and expanded in the fields of robotics and mobile devices.

Binarization Point Cloud Classification
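
A fully binary transformer rests on 1-bit linear layers. Below is a minimal sketch of the standard ingredient, sign binarization trained with a straight-through estimator (STE) and a mean-magnitude scaling factor; it is generic binary-network machinery, not this paper's specific architecture.

```python
import torch
import torch.nn as nn

class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through estimator: forward uses
    sign(x); backward passes gradients through where |x| <= 1 (clipped STE)."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).float()

class BinaryLinear(nn.Module):
    """Linear layer with 1-bit weights and activations plus a per-layer
    scaling factor alpha = mean(|W|) to reduce quantization error."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)

    def forward(self, x):
        bw = BinarizeSTE.apply(self.weight)
        bx = BinarizeSTE.apply(x)
        alpha = self.weight.abs().mean()
        return nn.functional.linear(bx, bw) * alpha
```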

Online Multi-spectral Neuron Tracing

no code implementations · 10 Mar 2024 · Bin Duan, Yuzhang Shang, Dawen Cai, Yan Yan

In this paper, we propose an online multi-spectral neuron tracing method with uniquely designed modules that requires no offline training.

LLM Inference Unveiled: Survey and Roofline Model Insights

2 code implementations · 26 Feb 2024 · Zhihang Yuan, Yuzhang Shang, Yang Zhou, Zhen Dong, Zhe Zhou, Chenhao Xue, Bingzhe Wu, Zhikai Li, Qingyi Gu, Yong Jae Lee, Yan Yan, Beidi Chen, Guangyu Sun, Kurt Keutzer

Our survey stands out from traditional literature reviews not only by summarizing the current state of research but also by introducing a framework based on the roofline model for the systematic analysis of LLM inference techniques.

Knowledge Distillation Language Modelling +3
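
The roofline model itself is simple: attainable throughput is capped by min(peak compute, memory bandwidth × arithmetic intensity). The sketch below applies it to a decode-phase GEMV; the hardware numbers are placeholder assumptions (roughly A100-class fp16 tensor-core throughput and HBM bandwidth), not figures from the survey.

```python
def roofline_time(flops, bytes_moved, peak_flops=312e12, peak_bw=2.0e12):
    """Roofline estimate: a kernel is compute-bound if its arithmetic
    intensity (FLOPs per byte) exceeds the machine balance point,
    otherwise memory-bound. Peak numbers are placeholder assumptions."""
    intensity = flops / bytes_moved                   # FLOPs per byte
    attainable = min(peak_flops, peak_bw * intensity) # roofline bound
    return flops / attainable, intensity

# Example: batch-1 decode through one (d, d) fp16 weight: ~2*d*d FLOPs,
# but the whole weight (2*d*d bytes) must be read from memory,
# so intensity ~ 1 FLOP/byte -> firmly memory-bound.
d = 4096
t, ai = roofline_time(flops=2 * d * d, bytes_moved=2 * d * d)
print(f"arithmetic intensity = {ai:.1f} FLOP/byte, time = {t * 1e6:.2f} us")
```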

QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning

1 code implementation · 6 Feb 2024 · Haoxuan Wang, Yuzhang Shang, Zhihang Yuan, Junyi Wu, Yan Yan

Diffusion models have achieved remarkable success in image generation tasks, yet their practical deployment is constrained by high memory and time consumption.

Image Generation Model Compression +1

MIM4DD: Mutual Information Maximization for Dataset Distillation

1 code implementation · NeurIPS 2023 · Yuzhang Shang, Zhihang Yuan, Yan Yan

Thus, we introduce mutual information (MI) as the metric to quantify the information shared between the synthetic and real datasets, and devise MIM4DD, which numerically maximizes this MI via a newly designed optimizable objective within a contrastive learning framework to update the synthetic dataset.

Contrastive Learning
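
The paper's exact objective is not reproduced in the excerpt; as a sketch of the general shape, the standard InfoNCE loss below gives a tractable lower bound on MI in a contrastive framework, here assuming that matched (synthetic, real) feature pairs act as positives and all other pairs in the batch as negatives.

```python
import torch
import torch.nn.functional as F

def info_nce(z_syn, z_real, tau=0.1):
    """InfoNCE: minimizing this cross-entropy maximizes a lower bound on
    the mutual information between the two views.
    z_syn, z_real: (B, D) feature embeddings; row i of each is a positive pair."""
    z_syn = F.normalize(z_syn, dim=-1)
    z_real = F.normalize(z_real, dim=-1)
    logits = z_syn @ z_real.T / tau                  # (B, B) similarity matrix
    labels = torch.arange(z_syn.size(0), device=z_syn.device)
    return F.cross_entropy(logits, labels)
```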

ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models

1 code implementation · 10 Dec 2023 · Zhihang Yuan, Yuzhang Shang, Yue Song, Qiang Wu, Yan Yan, Guangyu Sun

This paper explores a new post-hoc training-free compression paradigm for compressing Large Language Models (LLMs) to facilitate their wider adoption in various computing environments.
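
A minimal sketch of the activation-aware idea, assuming per-input-channel activation magnitudes from calibration data are available: scale the weight's input channels by those magnitudes, take a truncated SVD, and fold the scaling back, so the low-rank factors preserve the directions that matter for typical inputs. Details such as rank selection and the exact scaling statistics differ from the paper.

```python
import torch

def activation_aware_lowrank(W, act_scale, rank):
    """Activation-aware low-rank compression sketch.
    W:         (out, in) weight matrix
    act_scale: (in,) typical activation magnitude per input channel
               (e.g. mean |x_i| over calibration data); must be positive."""
    S = torch.diag(act_scale)
    U, sigma, Vh = torch.linalg.svd(W @ S, full_matrices=False)
    U_r = U[:, :rank] * sigma[:rank]                 # (out, rank)
    V_r = Vh[:rank] @ torch.diag(1.0 / act_scale)    # (rank, in)
    # y = W @ x is approximated by two small matmuls: U_r @ (V_r @ x)
    return U_r, V_r
```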

PB-LLM: Partially Binarized Large Language Models

2 code implementations · 29 Sep 2023 · Yuzhang Shang, Zhihang Yuan, Qiang Wu, Zhen Dong

This paper explores network binarization, a radical form of quantization that compresses model weights to a single bit, specifically for compressing Large Language Models (LLMs).

Binarization Quantization
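
Partial binarization splits the weights into a small salient set kept at full precision and a majority binarized to a scaled sign. The sketch below uses weight magnitude as the saliency proxy and the L2-optimal scale alpha = mean(|w|) on the binarized subset; the paper's actual saliency criterion may differ.

```python
import torch

def partially_binarize(W, salient_frac=0.1):
    """Keep the top salient_frac of weights (by magnitude) at full
    precision; binarize the rest to alpha * sign(w), where alpha minimizes
    the L2 quantization error on the binarized subset."""
    k = max(1, int(W.numel() * salient_frac))
    thresh = W.abs().flatten().topk(k).values.min()
    salient = W.abs() >= thresh                      # mask of kept weights
    alpha = W[~salient].abs().mean()                 # L2-optimal 1-bit scale
    W_q = torch.where(salient, W, alpha * torch.sign(W))
    return W_q, salient
```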

Causal-DFQ: Causality Guided Data-free Network Quantization

1 code implementation · ICCV 2023 · Yuzhang Shang, Bingxin Xu, Gaowen Liu, Ramana Kompella, Yan Yan

Inspired by this causal understanding, we propose the Causality-guided Data-free Network Quantization method, Causal-DFQ, which eliminates the reliance on data by approaching an equilibrium of causality-driven intervened distributions.

Data Free Quantization Neural Network Compression

RPTQ: Reorder-based Post-training Quantization for Large Language Models

1 code implementation · 3 Apr 2023 · Zhihang Yuan, Lin Niu, Jiawei Liu, Wenyu Liu, Xinggang Wang, Yuzhang Shang, Guangyu Sun, Qiang Wu, Jiaxiang Wu, Bingzhe Wu

In this paper, we identify that the challenge in quantizing activations in LLMs arises from varying ranges across channels, rather than solely the presence of outliers.

Quantization
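
When channel ranges vary widely, a single (scale, zero-point) pair per tensor wastes precision on narrow channels; grouping channels of similar range and quantizing each group separately addresses this. The sketch below uses a simple sort-and-split grouping as a stand-in for the paper's reorder strategy.

```python
import torch

def cluster_quantize_activations(X, n_groups=4, n_bits=8):
    """Range-aware grouped quantization sketch: sort channels by value
    range, split into groups of similar range, and quantize each group
    with its own (scale, zero-point). Returns the dequantized tensor so
    the rounding error is easy to inspect.
    X: (tokens, channels) activation matrix."""
    lo, hi = X.min(dim=0).values, X.max(dim=0).values
    order = (hi - lo).argsort()                      # channels sorted by range
    X_q = torch.empty_like(X)
    for group in order.chunk(n_groups):
        g_lo, g_hi = X[:, group].min(), X[:, group].max()
        scale = (g_hi - g_lo).clamp(min=1e-8) / (2 ** n_bits - 1)
        q = ((X[:, group] - g_lo) / scale).round().clamp(0, 2 ** n_bits - 1)
        X_q[:, group] = q * scale + g_lo             # dequantize
    return X_q
```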

BPT: Binary Point Cloud Transformer for Place Recognition

no code implementations · 2 Mar 2023 · Zhixing Hou, Yuzhang Shang, Tian Gao, Yan Yan

To solve this issue, we propose a binary point cloud transformer for place recognition.

Post-training Quantization on Diffusion Models

1 code implementation · CVPR 2023 · Yuzhang Shang, Zhihang Yuan, Bin Xie, Bingzhe Wu, Yan Yan

These approaches define a forward diffusion process for transforming data into noise and a backward denoising process for sampling data from noise.

Denoising Noise Estimation +1
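
For concreteness, the forward process admits a closed form for sampling x_t directly from x_0, which is what makes calibration at arbitrary timesteps cheap; the sketch below is standard DDPM algebra, not specific to this paper.

```python
import torch

def diffuse(x0, t, alphas_cumprod):
    """Closed-form forward diffusion:
        x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * eps,  eps ~ N(0, I).
    A network trained to predict eps from (x_t, t) then defines the
    backward denoising process.
    alphas_cumprod: (T,) cumulative products of the noise schedule."""
    a_bar = alphas_cumprod[t]
    eps = torch.randn_like(x0)
    xt = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * eps
    return xt, eps
```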

Lipschitz Continuity Retained Binary Neural Network

1 code implementation · 13 Jul 2022 · Yuzhang Shang, Dan Xu, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan

Relying on the premise that the performance of a binary neural network can be largely restored by eliminating the quantization error between full-precision weight vectors and their binary counterparts, existing work on network binarization frequently adopts the idea of model robustness to reach this objective.

Binarization Quantization
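
The quantization error in that premise is concrete: for a weight vector w binarized to alpha * sign(w), the L2 error expands to ||w||^2 - 2*alpha*sum(|w|) + alpha^2 * n and is minimized at alpha = mean(|w|). A sketch of that standard calculation, independent of the paper's Lipschitz argument:

```python
import torch

def binarization_error(w):
    """L2 quantization error between a full-precision weight vector and its
    scaled binary counterpart, using the error-minimizing scale
    alpha = mean(|w|)."""
    alpha = w.abs().mean()                 # optimal scaling factor
    w_bin = alpha * torch.sign(w)
    return (w - w_bin).pow(2).sum(), alpha
```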

Network Binarization via Contrastive Learning

1 code implementation · 6 Jul 2022 · Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan

Neural network binarization accelerates deep models by quantizing their weights and activations into 1-bit.

Binarization Contrastive Learning +2
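
The acceleration comes from arithmetic: once weights and activations are ±1 and bit-packed, a dot product reduces to XNOR plus popcount instead of multiply-accumulate. A generic sketch of that 1-bit arithmetic (not this paper's contribution):

```python
def binary_dot(a_bits, b_bits, n):
    """Dot product of two {-1, +1} vectors packed as n-bit integers
    (bit = 1 encodes +1). XNOR marks agreeing positions, popcount counts
    them, and dot = matches - (n - matches) = 2 * matches - n."""
    mask = (1 << n) - 1
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")
    return 2 * matches - n

# Example: a = [+1, -1, +1], b = [+1, +1, -1] -> 1 - 1 - 1 = -1
print(binary_dot(0b101, 0b110, 3))   # -> -1
```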

Supplementing Missing Visions via Dialog for Scene Graph Generations

1 code implementation · 23 Apr 2022 · Zhenghao Zhao, Ye Zhu, Xiaoguang Zhu, Yuzhang Shang, Yan Yan

Most current AI systems rely on the premise that the input visual data are sufficient to achieve competitive performance in various computer vision tasks.

Graph Generation Scene Graph Generation

Win the Lottery Ticket via Fourier Analysis: Frequencies Guided Network Pruning

no code implementations · 30 Jan 2022 · Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan

Extensive experiments on CIFAR-10 and CIFAR-100 demonstrate the superiority of our novel Fourier-analysis-based magnitude-based pruning (MBP) over traditional MBP algorithms.

Knowledge Distillation Network Pruning
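
The traditional MBP baseline being improved on is simple: zero out the smallest-magnitude weights globally. A sketch of that baseline follows; the Fourier-guided criterion itself is not shown in the excerpt.

```python
import torch

def magnitude_prune_mask(weights, sparsity=0.9):
    """Classic magnitude-based pruning (MBP): find the global magnitude
    threshold covering the target sparsity and return binary masks to
    multiply into the weight tensors.
    weights: list of weight tensors from a model."""
    all_w = torch.cat([w.abs().flatten() for w in weights])
    k = int(all_w.numel() * sparsity)
    thresh = all_w.kthvalue(k).values if k > 0 else all_w.min() - 1
    return [(w.abs() > thresh).float() for w in weights]
```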

Contrastive Mutual Information Maximization for Binary Neural Networks

no code implementations · 29 Sep 2021 · Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan

Neural network binarization accelerates deep models by quantizing their weights and activations into 1-bit.

Binarization Contrastive Learning +2

Lipschitz Continuity Guided Knowledge Distillation

no code implementations · ICCV 2021 · Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan

Knowledge distillation has become one of the most important model compression techniques by distilling knowledge from larger teacher networks to smaller student ones.

Knowledge Distillation Model Compression +2
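
The excerpt describes the vanilla setup; a minimal sketch of classic distillation (Hinton et al.) follows, matching temperature-softened teacher and student distributions via KL divergence. The paper's Lipschitz-guided variant builds on this and is not shown here.

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Classic knowledge distillation: KL divergence between
    temperature-softened distributions (scaled by T^2 to keep gradient
    magnitudes comparable), mixed with cross-entropy on the hard labels."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```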
