no code implementations • 22 Mar 2024 • Yuzhang Shang, Mu Cai, Bingxin Xu, Yong Jae Lee, Yan Yan
Based on this, we propose PruMerge, a novel adaptive visual token reduction approach that substantially reduces the number of visual tokens while maintaining comparable model performance.
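The excerpt doesn't spell out PruMerge's selection rule; as a minimal sketch, the following assumes tokens are ranked by an externally supplied importance score (e.g. [CLS] attention) and that each pruned token is merged into its most similar kept token. The fixed keep ratio stands in for the paper's adaptive criterion; all names are hypothetical.

```python
import torch

def reduce_tokens(tokens, scores, keep_ratio=0.25):
    """Keep the top-scoring visual tokens; merge each dropped token
    into its most similar kept token (hypothetical merge rule)."""
    n = tokens.shape[0]
    k = max(1, int(n * keep_ratio))
    keep_idx = scores.topk(k).indices
    dropped = [i for i in range(n) if i not in set(keep_idx.tolist())]
    kept = tokens[keep_idx].clone()
    if dropped:
        drop_idx = torch.tensor(dropped)
        sim = tokens[drop_idx] @ kept.T            # similarity to kept tokens
        assign = sim.argmax(dim=1)                 # nearest kept token
        for j, t in zip(assign.tolist(), tokens[drop_idx]):
            kept[j] = (kept[j] + t) / 2            # simple average merge
    return kept
```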
no code implementations • 15 Mar 2024 • Zhixing Hou, Yuzhang Shang, Yan Yan
This paper presents a novel Fully Binary Point Cloud Transformer (FBPT) model with the potential for wide application and extension in robotics and mobile devices.
no code implementations • 10 Mar 2024 • Bin Duan, Yuzhang Shang, Dawen Cai, Yan Yan
In this paper, we propose an online multi-spectral neuron tracing method with uniquely designed modules, where no offline training is required.
2 code implementations • 26 Feb 2024 • Zhihang Yuan, Yuzhang Shang, Yang Zhou, Zhen Dong, Zhe Zhou, Chenhao Xue, Bingzhe Wu, Zhikai Li, Qingyi Gu, Yong Jae Lee, Yan Yan, Beidi Chen, Guangyu Sun, Kurt Keutzer
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on the roofline model for the systematic analysis of LLM inference techniques.
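The roofline model referenced here is standard: attainable throughput is the minimum of peak compute and memory bandwidth times arithmetic intensity. A minimal sketch (the hardware numbers are placeholders roughly resembling an A100, not values from the survey):

```python
def roofline(flops, bytes_moved, peak_flops=312e12, peak_bw=2.0e12):
    """Attainable FLOP/s = min(peak compute, bandwidth * arithmetic intensity)."""
    intensity = flops / bytes_moved                # FLOPs per byte moved
    return min(peak_flops, peak_bw * intensity)

# A decode-time GEMV moves roughly as many bytes as it does FLOPs,
# so it lands on the memory-bound side of the roofline.
print(roofline(flops=2 * 4096 * 4096, bytes_moved=2 * 4096 * 4096))
```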
1 code implementation • 6 Feb 2024 • Haoxuan Wang, Yuzhang Shang, Zhihang Yuan, Junyi Wu, Yan Yan
Diffusion models have achieved remarkable success in image generation tasks, yet their practical deployment is hindered by high memory and time consumption.
1 code implementation • NeurIPS 2023 • Yuzhang Shang, Zhihang Yuan, Yan Yan
Thus, we introduce mutual information (MI) as the metric to quantify the information shared between the synthetic and real datasets, and devise MIM4DD, which numerically maximizes this MI via a newly designed optimizable objective within a contrastive learning framework to update the synthetic dataset.
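MIM4DD's exact objective isn't given in this excerpt; since MI is commonly lower-bounded contrastively, here is a minimal InfoNCE-style sketch, assuming matching (synthetic, real) feature pairs serve as positives:

```python
import torch
import torch.nn.functional as F

def info_nce(syn_feats, real_feats, tau=0.1):
    """Contrastive lower bound on MI: the i-th synthetic feature is paired
    with the i-th real feature; all other pairings act as negatives."""
    syn = F.normalize(syn_feats, dim=1)
    real = F.normalize(real_feats, dim=1)
    logits = syn @ real.T / tau                    # pairwise similarities
    labels = torch.arange(syn.shape[0])            # positives on the diagonal
    return F.cross_entropy(logits, labels)         # minimize to raise the MI bound
```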
1 code implementation • 10 Dec 2023 • Zhihang Yuan, Yuzhang Shang, Yue Song, Qiang Wu, Yan Yan, Guangyu Sun
This paper explores a new post-hoc training-free compression paradigm for compressing Large Language Models (LLMs) to facilitate their wider adoption in various computing environments.
2 code implementations • 29 Sep 2023 • Yuzhang Shang, Zhihang Yuan, Qiang Wu, Zhen Dong
This paper explores network binarization, a radical form of quantization that compresses model weights to a single bit, specifically for compressing Large Language Models (LLMs).
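The excerpt doesn't state the paper's exact binarization scheme; the classic XNOR-Net-style baseline, sketched below, maps each weight row to ±1 with a per-row scale alpha = mean(|w|), the L2-optimal scale for sign binarization:

```python
import torch

def binarize_weights(w):
    """Binarize to {-1, +1} with per-row scale alpha = mean(|w|)
    (XNOR-Net-style baseline; not necessarily this paper's scheme)."""
    alpha = w.abs().mean(dim=1, keepdim=True)      # per-output-channel scale
    return alpha * torch.sign(w)

w = torch.randn(4, 8)
print((w - binarize_weights(w)).pow(2).mean())     # residual quantization error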
1 code implementation • ICCV 2023 • Yuzhang Shang, Bingxin Xu, Gaowen Liu, Ramana Kompella, Yan Yan
Inspired by this causal understanding, we propose the Causality-guided Data-free Network Quantization method, Causal-DFQ, to eliminate the reliance on data by approaching an equilibrium of causality-driven intervened distributions.
1 code implementation • 3 Apr 2023 • Zhihang Yuan, Lin Niu, Jiawei Liu, Wenyu Liu, Xinggang Wang, Yuzhang Shang, Guangyu Sun, Qiang Wu, Jiaxiang Wu, Bingzhe Wu
In this paper, we identify that the challenge in quantizing activations in LLMs arises from varying ranges across channels, rather than solely the presence of outliers.
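The paper's reordering strategy isn't detailed in this excerpt, but the observation itself motivates per-channel quantization parameters; a minimal sketch of asymmetric quantization with one (scale, zero-point) per channel:

```python
import torch

def quantize_per_channel(x, n_bits=8):
    """Fake-quantize activations of shape (tokens, channels) with a separate
    (scale, zero-point) per channel, so narrow channels keep their resolution."""
    qmax = 2 ** n_bits - 1
    xmin = x.min(dim=0, keepdim=True).values       # per-channel range
    xmax = x.max(dim=0, keepdim=True).values
    scale = (xmax - xmin).clamp(min=1e-8) / qmax
    zp = (-xmin / scale).round()
    q = (x / scale + zp).round().clamp(0, qmax)
    return (q - zp) * scale                        # dequantized for comparison
```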
no code implementations • 2 Mar 2023 • Zhixing Hou, Yuzhang Shang, Tian Gao, Yan Yan
To solve this issue, we propose a binary point cloud transformer for place recognition.
1 code implementation • CVPR 2023 • Yuzhang Shang, Zhihang Yuan, Bin Xie, Bingzhe Wu, Yan Yan
These approaches define a forward diffusion process for transforming data into noise and a backward denoising process for sampling data from noise.
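The forward process described here is the standard DDPM one, which admits a closed form: x_t = sqrt(ᾱ_t) x_0 + sqrt(1 − ᾱ_t) ε. A minimal sketch:

```python
import torch

def forward_diffuse(x0, t, alphas_cumprod):
    """Sample x_t ~ q(x_t | x_0) in closed form (standard DDPM forward process)."""
    a_bar = alphas_cumprod[t]
    noise = torch.randn_like(x0)
    return a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * noise, noise

betas = torch.linspace(1e-4, 0.02, 1000)           # standard linear schedule
alphas_cumprod = torch.cumprod(1 - betas, dim=0)
xt, eps = forward_diffuse(torch.randn(3, 32, 32), t=500, alphas_cumprod=alphas_cumprod)
```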
1 code implementation • 13 Jul 2022 • Yuzhang Shang, Dan Xu, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan
Relying on the premise that the performance of a binary neural network can be largely restored by eliminating the quantization error between full-precision weight vectors and their corresponding binary vectors, existing works on network binarization frequently adopt the idea of model robustness to reach this objective.
1 code implementation • 6 Jul 2022 • Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan
Neural network binarization accelerates deep models by quantizing their weights and activations to 1 bit.
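Training through a 1-bit quantizer requires a gradient surrogate; the standard trick (common practice, not necessarily this paper's contribution) is the straight-through estimator, sketched below:

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """sign() in the forward pass; identity gradient, clipped to |x| <= 1,
    in the backward pass -- the straight-through estimator."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).float()   # pass gradients inside [-1, 1]

x = torch.randn(8, requires_grad=True)
BinarizeSTE.apply(x).sum().backward()              # gradients flow despite sign()
```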
1 code implementation • 23 Apr 2022 • Zhenghao Zhao, Ye Zhu, Xiaoguang Zhu, Yuzhang Shang, Yan Yan
Most current AI systems rely on the premise that the input visual data are sufficient to achieve competitive performance in various computer vision tasks.
no code implementations • 30 Jan 2022 • Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan
Extensive experiments on CIFAR-10 and CIFAR-100 demonstrate the superiority of our novel Fourier-analysis-based MBP compared to traditional MBP algorithms.
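The Fourier-analysis view is the paper's contribution; the baseline it is compared against, plain magnitude-based pruning (MBP), is simple enough to sketch:

```python
import torch

def magnitude_prune(w, sparsity=0.9):
    """Classic MBP baseline: zero out the smallest-magnitude weights."""
    k = int(w.numel() * sparsity)
    if k == 0:
        return w
    threshold = w.abs().flatten().kthvalue(k).values
    return w * (w.abs() > threshold).float()
```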
no code implementations • 29 Sep 2021 • Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan
Neural network binarization accelerates deep models by quantizing their weights and activations to 1 bit.
no code implementations • ICCV 2021 • Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan
Knowledge distillation has become one of the most important model compression techniques by distilling knowledge from larger teacher networks to smaller student ones.
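The specific method of this ICCV 2021 paper isn't described in the excerpt; the canonical objective that distillation methods build on, temperature-scaled KL divergence between teacher and student logits, as a sketch:

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T=4.0):
    """Hinton-style distillation: KL on temperature-softened logits,
    scaled by T^2 to keep gradient magnitudes comparable."""
    p_teacher = F.softmax(teacher_logits / T, dim=1)
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T
```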