1 code implementation • 8 Dec 2024 • Pu Zhao, Xuan Shen, Zhenglun Kong, Yixin Shen, Sung-En Chang, Timothy Rupprecht, Lei Lu, Enfu Nan, Changdi Yang, Yumei He, Xingchen Xu, Yu Huang, Wei Wang, Yue Chen, Yong He, Yanzhi Wang
Recently, Large Language Models (LLMs) have undergone a significant transformation, marked by a rapid rise in both their popularity and capabilities.
no code implementations • 7 Aug 2024 • Timothy Rupprecht, Sung-En Chang, Yushu Wu, Lei Lu, Enfu Nan, Chih-hsiang Li, Caiyue Lai, Zhimin Li, Zhijun Hu, Yumei He, David Kaeli, Yanzhi Wang
To better quantify how our prompting strategy affects anthropomorphic features like humor, authenticity, and favorability we present Crowd Vote - an adaptation of Crowd Score that allows for judges to elect a large language model (LLM) candidate over competitors answering the same or similar prompts.
no code implementations • 30 Oct 2021 • Sung-En Chang, Yanyu Li, Mengshu Sun, Yanzhi Wang, Xue Lin
Our proposed ILMPQ DNN quantization framework achieves 70. 73 Top1 accuracy in ResNet-18 on the ImageNet dataset.
no code implementations • ICCV 2021 • Sung-En Chang, Yanyu Li, Mengshu Sun, Weiwen Jiang, Sijia Liu, Yanzhi Wang, Xue Lin
Specifically, this is the first effort to assign mixed quantization schemes and multiple precisions within layers -- among rows of the DNN weight matrix, for simplified operations in hardware inference, while preserving accuracy.
no code implementations • ACL 2022 • Shaoyi Huang, Dongkuan Xu, Ian E. H. Yen, Yijue Wang, Sung-En Chang, Bingbing Li, Shiyang Chen, Mimi Xie, Sanguthevar Rajasekaran, Hang Liu, Caiwen Ding
Conventional wisdom in pruning Transformer-based language models is that pruning reduces the model expressiveness and thus is more likely to underfit rather than overfit.
no code implementations • 8 Dec 2020 • Sung-En Chang, Yanyu Li, Mengshu Sun, Runbin Shi, Hayden K. -H. So, Xuehai Qian, Yanzhi Wang, Xue Lin
Unlike existing methods that use the same quantization scheme for all weights, we propose the first solution that applies different quantization schemes for different rows of the weight matrix.
no code implementations • 16 Sep 2020 • Sung-En Chang, Yanyu Li, Mengshu Sun, Weiwen Jiang, Runbin Shi, Xue Lin, Yanzhi Wang
To tackle the limited computing and storage resources in edge devices, model compression techniques have been widely used to trim deep neural network (DNN) models for on-device inference execution.
no code implementations • NeurIPS 2018 • Ian En-Hsu Yen, Wei-Cheng Lee, Kai Zhong, Sung-En Chang, Pradeep K. Ravikumar, Shou-De Lin
We consider a generalization of mixed regression where the response is an additive combination of several mixture components.
no code implementations • 10 Oct 2018 • Sung-En Chang, Xun Zheng, Ian E. H. Yen, Pradeep Ravikumar, Rose Yu
Tensor decomposition has been extensively used as a tool for exploratory analysis.
no code implementations • ICML 2017 • Ian En-Hsu Yen, Wei-Cheng Lee, Sung-En Chang, Arun Sai Suggala, Shou-De Lin, Pradeep Ravikumar
The latent feature model (LFM), proposed in (Griffiths \& Ghahramani, 2005), but possibly with earlier origins, is a generalization of a mixture model, where each instance is generated not from a single latent class but from a combination of latent features.