no code implementations • 2 Jun 2024 • Aozhong Zhang, Naigang Wang, Yanxia Deng, Xin Li, Zi Yang, Penghang Yin
For example, we achieve a Wikitext2 perplexity of 5.95 on the LLaMA2-70B model for per-channel INT2 weight quantization without incurring any inference overhead.
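The excerpt does not spell out the quantizer itself; purely as an illustration of what per-channel INT2 weight quantization means (not this paper's method), the sketch below gives each output channel its own scale and rounds its weights onto the four-level signed grid {-2, -1, 0, 1}. The max-abs scale rule is an assumption made for the sketch.

```python
import numpy as np

def quantize_per_channel_int2(W, axis=0):
    """Illustrative symmetric per-channel INT2 quantization.

    Each output channel (row by default) gets its own scale; weights are
    rounded onto the 4-level grid {-2, -1, 0, 1} and then dequantized.
    """
    W = np.asarray(W, dtype=np.float64)
    Wc = np.moveaxis(W, axis, 0)                 # channel axis to the front
    flat = Wc.reshape(Wc.shape[0], -1)

    qmin, qmax = -2, 1                           # signed 2-bit integer range
    max_abs = np.abs(flat).max(axis=1, keepdims=True)
    scale = np.where(max_abs > 0, max_abs / 2.0, 1.0)   # one scale per channel

    q = np.clip(np.round(flat / scale), qmin, qmax)     # INT2 codes
    deq = (q * scale).reshape(Wc.shape)
    return np.moveaxis(deq, 0, axis), q.astype(np.int8), scale

W = np.random.randn(8, 16)
W_deq, codes, scales = quantize_per_channel_int2(W)
print("max per-channel error:", np.abs(W - W_deq).max())
```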
1 code implementation • 11 Mar 2024 • Aozhong Zhang, Zi Yang, Naigang Wang, Yingyong Qi, Jack Xin, Xin Li, Penghang Yin
Within a fixed layer, COMQ treats all the scaling factors and bit-codes as the variables over which the reconstruction error is minimized.
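As a rough sketch of the coordinate-wise idea (assuming a greedy per-coordinate sweep with the scaling factor held fixed, which simplifies the actual COMQ update), the bit-code of one weight at a time can be re-chosen to best reduce the layer reconstruction error on calibration activations:

```python
import numpy as np

def comq_style_sweep(X, w, bits=2, n_sweeps=3):
    """Simplified coordinate-wise minimization of the layer reconstruction
    error ||X w - s * X q||^2 over the integer codes q.  The scale s is kept
    fixed here; COMQ also updates the scaling factors, which this sketch omits."""
    levels = np.arange(-(2 ** (bits - 1)), 2 ** (bits - 1))   # e.g. {-2,-1,0,1}
    s = np.abs(w).max() / max(abs(levels[0]), abs(levels[-1]))
    q = np.clip(np.round(w / s), levels[0], levels[-1])

    target = X @ w
    residual = target - s * (X @ q)              # current reconstruction error
    for _ in range(n_sweeps):
        for j in range(len(w)):                  # one coordinate at a time
            residual += s * q[j] * X[:, j]       # remove coordinate j's contribution
            # pick the integer level that best explains the residual
            errs = [np.sum((residual - s * l * X[:, j]) ** 2) for l in levels]
            q[j] = levels[int(np.argmin(errs))]
            residual -= s * q[j] * X[:, j]
    return s, q

X = np.random.randn(64, 10)   # calibration activations
w = np.random.randn(10)       # float weights of one output channel
s, q = comq_style_sweep(X, w)
print("reconstruction error:", np.linalg.norm(X @ w - s * (X @ q)))
```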
no code implementations • 10 Feb 2023 • Zhijian Li, Biao Yang, Penghang Yin, Yingyong Qi, Jack Xin
In this paper, we propose a feature affinity (FA) assisted knowledge distillation (KD) method to improve quantization-aware training of deep neural networks (DNNs).
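One way to read "feature affinity" is as matching the pairwise similarity structure of intermediate features between teacher and student; the sketch below combines a standard soft-label distillation term with such an affinity-matching term. The cosine-affinity form, the temperature, and the weighting are illustrative assumptions, not the paper's exact loss.

```python
import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def feature_affinity_loss(F_s, F_t):
    """Match the pairwise (cosine) affinity matrices of student and teacher
    features over a batch: A[i, j] = <f_i, f_j> after row normalization."""
    F_s = F_s / (np.linalg.norm(F_s, axis=1, keepdims=True) + 1e-8)
    F_t = F_t / (np.linalg.norm(F_t, axis=1, keepdims=True) + 1e-8)
    A_s, A_t = F_s @ F_s.T, F_t @ F_t.T
    return np.mean((A_s - A_t) ** 2)

def kd_loss(logits_s, logits_t, T=4.0):
    """Soft-label distillation term: cross-entropy of the student softmax
    against the teacher's temperature-softened distribution."""
    p_t = softmax(logits_t, T)
    log_p_s = np.log(softmax(logits_s, T) + 1e-8)
    return -np.mean(np.sum(p_t * log_p_s, axis=1))

batch, dim, classes = 32, 128, 10
F_t, F_s = np.random.randn(batch, dim), np.random.randn(batch, dim)
logits_t, logits_s = np.random.randn(batch, classes), np.random.randn(batch, classes)
total = kd_loss(logits_s, logits_t) + 0.5 * feature_affinity_loss(F_s, F_t)
print("combined distillation loss:", total)
```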
no code implementations • 10 Dec 2020 • Ziang Long, Penghang Yin, Jack Xin
Deep neural networks (DNNs) are quantized for efficient inference on resource-constrained platforms.
no code implementations • 23 Nov 2020 • Ziang Long, Penghang Yin, Jack Xin
In this paper, we propose a class of STEs with certain monotonicity, and consider their applications to the training of a two-linear-layer network with quantized activation functions for non-linear multi-category classification.
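To make the setting concrete, the sketch below trains a two-linear-layer network whose hidden activation is a 0/1 quantizer: the forward pass uses the quantized activation, while the backward pass substitutes a monotone surrogate derivative (here a clipped-ReLU proxy) for the almost-everywhere-zero true derivative, yielding a coarse gradient. The particular quantizer and surrogate are illustrative choices, not the paper's.

```python
import numpy as np

def quant_act(z):
    """Binary (0/1) activation used in the forward pass."""
    return (z > 0).astype(z.dtype)

def ste_derivative(z):
    """Monotone surrogate derivative used in place of the a.e. zero true
    derivative; here a clipped-ReLU (hard-sigmoid style) proxy."""
    return ((z > 0) & (z < 1)).astype(z.dtype)

def coarse_gradients(X, y_onehot, W1, W2):
    """One forward/backward pass of a two-linear-layer network with a
    quantized hidden activation; the backward pass uses the STE surrogate."""
    Z = X @ W1
    H = quant_act(Z)
    logits = H @ W2
    P = np.exp(logits - logits.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)
    dlogits = (P - y_onehot) / len(X)        # softmax cross-entropy gradient
    gW2 = H.T @ dlogits
    dH = dlogits @ W2.T
    gW1 = X.T @ (dH * ste_derivative(Z))     # coarse gradient via the STE
    return gW1, gW2

X = np.random.randn(64, 20)
y = np.eye(5)[np.random.randint(0, 5, 64)]
W1, W2 = 0.1 * np.random.randn(20, 32), 0.1 * np.random.randn(32, 5)
for _ in range(100):
    g1, g2 = coarse_gradients(X, y, W1, W2)
    W1 -= 0.5 * g1
    W2 -= 0.5 * g2
```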
no code implementations • 28 Feb 2020 • Ziang Long, Penghang Yin, Jack Xin
In this paper, we study the dynamics of gradient descent in learning neural networks for classification problems.
no code implementations • ICLR 2019 • Penghang Yin, Jiancheng Lyu, Shuai Zhang, Stanley Osher, Yingyong Qi, Jack Xin
We prove that if the STE is properly chosen, the expected coarse gradient correlates positively with the population gradient (which is not available for training), and its negation is a descent direction for minimizing the population loss.
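In symbols (assuming the population loss f is smooth enough for a first-order Taylor expansion, and writing g~(w) for the coarse gradient), the claim can be read as follows:

```latex
\[
  f\big(w - \eta\,\mathbb{E}[\tilde g(w)]\big)
  = f(w) - \eta\,\mathbb{E}[\tilde g(w)]^{\top}\nabla f(w) + \mathcal{O}(\eta^{2}),
\]
```

so a positive correlation \(\mathbb{E}[\tilde g(w)]^{\top}\nabla f(w) > 0\) makes \(-\mathbb{E}[\tilde g(w)]\) a descent direction for every sufficiently small step size \(\eta\).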
no code implementations • 5 Nov 2018 • Tao Sun, Penghang Yin, Dongsheng Li, Chun Huang, Lei Guan, Hao Jiang
For objective functions satisfying a relaxed strongly convex condition, linear convergence is established under weaker assumptions on the step size and inertial parameter than those made in the existing literature.
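The step-size/inertial-parameter language refers to an inertial (heavy-ball type) iteration; as a reminder of the standard form only (the paper's exact scheme and parameter conditions are not reproduced in this excerpt), such an iteration and the meaning of linear convergence read:

```latex
\[
  x^{k+1} = x^{k} - \alpha\,\nabla f(x^{k}) + \beta\,\bigl(x^{k} - x^{k-1}\bigr),
  \qquad \alpha > 0,\; \beta \in [0,1),
\]
\[
  \text{linear convergence:}\quad
  f(x^{k}) - f^{\star} \le C\,\rho^{k}
  \quad\text{for some } C > 0,\ \rho \in (0,1).
\]
```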
1 code implementation • 23 Sep 2018 • Bao Wang, Alex T. Lin, Wei Zhu, Penghang Yin, Andrea L. Bertozzi, Stanley J. Osher
We improve the robustness of deep neural networks (DNNs) to adversarial attacks by using an interpolating function as the output activation.
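The toy sketch below only conveys the general idea of replacing a softmax head by an interpolation of the labels of stored "template" training points in feature space, using a simple Gaussian kernel; the kernel, bandwidth, and template set are assumptions for illustration and not the paper's construction.

```python
import numpy as np

def interpolating_output(feat_test, feat_templates, labels_templates,
                         n_classes, bandwidth=1.0):
    """Toy interpolating output activation: instead of a softmax over logits,
    predict by kernel-weighted interpolation of the labels of 'template'
    training points in feature space (a simple stand-in for the paper's
    interpolating function)."""
    d2 = ((feat_test[:, None, :] - feat_templates[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * bandwidth ** 2))        # kernel weights
    W /= W.sum(axis=1, keepdims=True)
    Y = np.eye(n_classes)[labels_templates]         # one-hot template labels
    return W @ Y                                    # class probabilities

feat_templates = np.random.randn(200, 64)           # features of labeled templates
labels_templates = np.random.randint(0, 10, 200)
feat_test = np.random.randn(5, 64)
probs = interpolating_output(feat_test, feat_templates, labels_templates, 10)
print(probs.sum(axis=1))                            # each row sums to 1
```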
no code implementations • 15 Aug 2018 • Penghang Yin, Shuai Zhang, Jiancheng Lyu, Stanley Osher, Yingyong Qi, Jack Xin
We introduce the notion of coarse gradient and propose the blended coarse gradient descent (BCGD) algorithm, for training fully quantized neural networks.
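A minimal sketch of the blended update follows; the blending form and the identity straight-through pass are assumptions made to keep the toy self-contained (see the paper for the exact scheme). Each step mixes the float weights with their quantized projection and then subtracts a coarse-gradient step evaluated at the quantized weights:

```python
import numpy as np

def quantize(w, scale=0.5):
    """Toy weight quantizer: round onto a uniform grid with step `scale`."""
    return scale * np.round(w / scale)

def bcgd_step(w, coarse_grad, lr=0.1, rho=0.2):
    """One blended coarse gradient descent step (sketch): blend the float
    weights with their quantized projection, then take a coarse-gradient step."""
    return (1.0 - rho) * w + rho * quantize(w) - lr * coarse_grad

# Toy objective evaluated at the quantized weights, as in fully quantized training.
target = np.array([1.3, -0.7, 0.4])
w = np.zeros(3)
for _ in range(50):
    wq = quantize(w)
    coarse_grad = wq - target     # gradient of 0.5||wq - target||^2, passed straight through
    w = bcgd_step(w, coarse_grad)
print("float weights:", w, " quantized weights:", quantize(w))
```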
1 code implementation • 17 Jun 2018 • Stanley Osher, Bao Wang, Penghang Yin, Xiyang Luo, Farzin Barekat, Minh Pham, Alex Lin
We propose a class of very simple modifications of gradient descent and stochastic gradient descent.
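Assuming the modification amounts to premultiplying each (stochastic) gradient by the inverse of the identity plus a scaled 1-D discrete Laplacian, the smoothing can be applied with a single FFT/inverse-FFT pair because the operator is circulant. The sigma value and the periodic stencil below are illustrative choices.

```python
import numpy as np

def laplacian_smooth(grad, sigma=1.0):
    """Smooth a gradient vector by solving (I + sigma * A) g_s = grad, where A
    is the 1-D discrete Laplacian with periodic boundary; since the matrix is
    circulant, the solve reduces to one FFT / inverse FFT pair."""
    n = grad.size
    # First column of I + sigma * A for the periodic second-difference stencil.
    c = np.zeros(n)
    c[0] = 1.0 + 2.0 * sigma
    c[1] -= sigma
    c[-1] -= sigma
    return np.real(np.fft.ifft(np.fft.fft(grad) / np.fft.fft(c)))

# Plain gradient descent vs. its smoothed variant on a toy quadratic.
target = np.random.randn(16)
w = np.zeros(16)
for _ in range(100):
    grad = w - target + 0.1 * np.random.randn(16)   # noisy gradient
    w -= 0.5 * laplacian_smooth(grad, sigma=2.0)
print("distance to target:", np.linalg.norm(w - target))
```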
2 code implementations • 19 Jan 2018 • Penghang Yin, Shuai Zhang, Jiancheng Lyu, Stanley Osher, Yingyong Qi, Jack Xin
We propose BinaryRelax, a simple two-phase algorithm, for training deep neural networks with quantized weights.
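A minimal sketch of the two phases, assuming the relaxed step takes a Moreau-envelope-style form (a convex combination of the float weights and their hard binary projection, tightened over time) before switching to exact quantization:

```python
import numpy as np

def binary_quantize(w):
    """Hard projection onto binary weights {-s, +s} with scale s = mean |w|."""
    s = np.mean(np.abs(w))
    return s * np.sign(w)

def relaxed_projection(w, lam):
    """Relaxed projection used in phase one: a convex combination of the float
    weights and their hard binary projection that tightens as lam grows."""
    return (w + lam * binary_quantize(w)) / (1.0 + lam)

# Two-phase schedule on a toy quadratic: relaxed quantization first, exact later.
target = np.array([0.9, -1.1, 0.3, -0.2])
w = np.zeros(4)
lam, growth, phase_two_start = 1.0, 1.02, 150
for t in range(300):
    wq = relaxed_projection(w, lam) if t < phase_two_start else binary_quantize(w)
    grad = wq - target            # gradient of 0.5||wq - target||^2 (pass-through)
    w -= 0.1 * grad
    lam *= growth                 # gradually tighten the relaxation
print("final binary weights:", binary_quantize(w))
```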
no code implementations • 23 Nov 2017 • Bao Wang, Penghang Yin, Andrea L. Bertozzi, P. Jeffrey Brantingham, Stanley J. Osher, Jack Xin
In this work, we first present a proper representation of crime data.
no code implementations • 21 Oct 2017 • Penghang Yin, Minh Pham, Adam Oberman, Stanley Osher
In this paper, we propose an implicit gradient descent algorithm for the classic $k$-means problem.
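One natural reading of an implicit (backward-Euler) gradient step for k-means: with the cluster assignments frozen at the current iterate, the implicit update for each centroid solves in closed form to a blend of the old centroid and the cluster mean, controlled by the step size. The sketch below implements that reading; the paper's algorithm may differ in its details.

```python
import numpy as np

def implicit_kmeans(X, k, eta=0.5, n_iter=50, seed=0):
    """Sketch of an implicit (backward-Euler) gradient step for k-means.
    With assignments frozen, the implicit update
        c_j^+ = c_j - eta * sum_{i in S_j} (c_j^+ - x_i)
    solves in closed form to a blend of the old centroid and the cluster mean."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)
        for j in range(k):
            S = X[labels == j]
            if len(S) == 0:
                continue
            n_j = len(S)
            # Closed-form implicit step: (1 + eta*n_j) c^+ = c + eta * sum(S)
            C[j] = (C[j] + eta * S.sum(axis=0)) / (1.0 + eta * n_j)
    return C, labels

X = np.vstack([np.random.randn(100, 2) + m for m in ([0, 0], [5, 5], [0, 5])])
C, labels = implicit_kmeans(X, k=3)
print("centroids:\n", C)
```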
no code implementations • 19 Dec 2016 • Penghang Yin, Shuai Zhang, Yingyong Qi, Jack Xin
We present LBW-Net, an efficient optimization-based method for quantization and training of low bit-width convolutional neural networks (CNNs).
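The quantizer itself is not described in this excerpt; as a generic illustration of an optimization-based fit (not the LBW-Net formula), one can alternately minimize the squared fit error over a shared scale and the low bit-width integer codes:

```python
import numpy as np

def fit_low_bitwidth(w, bits=4, n_alt=10):
    """Illustrative optimization-based fit of low bit-width weights: alternately
    minimize ||w - s*q||^2 over the scale s and the integer codes q on a signed
    uniform grid."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    s = np.abs(w).max() / max(abs(qmin), qmax) + 1e-12
    for _ in range(n_alt):
        q = np.clip(np.round(w / s), qmin, qmax)   # best codes for a fixed scale
        s = (w @ q) / (q @ q + 1e-12)              # least-squares scale for fixed codes
    return s, q.astype(np.int32)

w = np.random.randn(256)
s, q = fit_low_bitwidth(w, bits=4)
print("relative error:", np.linalg.norm(w - s * q) / np.linalg.norm(w))
```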