no code implementations • 28 Oct 2024 • Shih-Yang Liu, Huck Yang, Chien-Yi Wang, Nai Chit Fung, Hongxu Yin, Charbel Sakr, Saurav Muralidharan, Kwang-Ting Cheng, Jan Kautz, Yu-Chiang Frank Wang, Pavlo Molchanov, Min-Hung Chen
In this work, we re-formulate the model compression problem as a customized compensation problem: given a compressed model, we aim to introduce residual low-rank paths to compensate for compression errors under customized requirements from users (e.g., tasks, compression ratios), resulting in greater flexibility in adjusting overall capacity without being constrained by specific compression formats.
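A minimal sketch of the general idea, assuming a toy rounding-based compression scheme, an SVD-based fit of the compression error, and a user-chosen rank `r`; none of these choices are taken from the paper:

```python
# Sketch only: compensate a compressed weight with a residual low-rank path
# fitted to the compression error. Shapes, the compression stand-in, and the
# rank `r` are illustrative assumptions, not the paper's algorithm.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))          # original weight
W_c = np.round(W * 4) / 4                    # stand-in for any compression format

E = W - W_c                                  # compression error to compensate
U, S, Vt = np.linalg.svd(E, full_matrices=False)
r = 16                                       # user-chosen rank (capacity knob)
A = U[:, :r] * S[:r]                         # residual low-rank path: A @ B ~= E
B = Vt[:r, :]

x = rng.standard_normal(256)
y_compensated = x @ (W_c + A @ B)            # compressed path + low-rank correction
print(np.linalg.norm(x @ W - y_compensated) / np.linalg.norm(x @ W))
```

Raising or lowering `r` here is the "capacity knob": it trades extra parameters for smaller residual error without touching the compression format itself.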
no code implementations • 7 Oct 2024 • Charbel Sakr, Brucek Khailany
The activation-centrality of the approach enables retraining LLMs with no loss of expressivity, while at inference, weight decomposition is obtained as a byproduct of matrix-multiplication associativity.
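An illustrative sketch of the associativity observation, under assumed notation (an orthonormal activation-projection basis `P` and a dense layer weight `W`); this is not the paper's training procedure:

```python
# If activations are projected through P @ P.T before a weight W, associativity
# lets the same computation be regrouped as (x @ P) @ (P.T @ W), so W is
# effectively decomposed into two smaller factors at inference time.
import numpy as np

rng = np.random.default_rng(1)
d, k = 512, 128                                   # hidden dim, projection rank (assumed)
x = rng.standard_normal((4, d))                   # a batch of activations
P, _ = np.linalg.qr(rng.standard_normal((d, k)))  # orthonormal projection basis
W = rng.standard_normal((d, d))                   # dense weight of the retrained layer

y_full = x @ (P @ P.T @ W)                        # activation-centric view: project, then apply W
W1, W2 = P, P.T @ W                               # weight decomposition obtained "for free"
y_decomposed = (x @ W1) @ W2                      # identical output by associativity
print(np.allclose(y_full, y_decomposed))
```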
1 code implementation • 13 Jun 2022 • Charbel Sakr, Steve Dai, Rangharajan Venkatesan, Brian Zimmer, William J. Dally, Brucek Khailany
Data clipping is crucial in reducing noise in quantization operations and improving the achievable accuracy of quantization-aware training (QAT).
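A generic sketch of the trade-off that clipping controls, using an assumed uniform quantizer and hand-picked clipping values rather than the paper's method:

```python
# Clipping trades clipping error on outliers against rounding noise on the
# values that remain in range; the sweep below only illustrates that tension.
import numpy as np

def clipped_uniform_quantize(x, clip, bits):
    """Clip to [-clip, clip], then round onto a (2**bits - 1)-step uniform grid."""
    step = 2 * clip / (2**bits - 1)
    x_clipped = np.clip(x, -clip, clip)
    return np.round(x_clipped / step) * step

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)              # stand-in for a tensor of weights/activations

for clip in (1.0, 2.0, 4.0, x.max()):         # larger clip => coarser grid, less clipping
    mse = np.mean((x - clipped_uniform_quantize(x, clip, bits=4)) ** 2)
    print(f"clip={clip:5.2f}  MSE={mse:.5f}")
```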
no code implementations • 22 Feb 2020 • Abdulrahman Mahmoud, Siva Kumar Sastry Hari, Christopher W. Fletcher, Sarita V. Adve, Charbel Sakr, Naresh Shanbhag, Pavlo Molchanov, Michael B. Sullivan, Timothy Tsai, Stephen W. Keckler
As Convolutional Neural Networks (CNNs) are increasingly being employed in safety-critical applications, it is important that they behave reliably in the face of hardware errors.
no code implementations • ICLR 2019 • Charbel Sakr, Naigang Wang, Chia-Yu Chen, Jungwook Choi, Ankur Agrawal, Naresh Shanbhag, Kailash Gopalakrishnan
Observing that a bad choice of accumulation precision results in a loss of information that manifests itself as a reduction in the variance of an ensemble of partial sums, we derive a set of equations that relate this variance to the length of accumulation and the minimum number of bits needed for accumulation.
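A minimal simulation of that symptom, with an assumed accumulation length, ensemble size, and mantissa-rounding model (not the paper's analysis): with too few accumulator bits, small addends are swamped by the growing partial sum, which shows up as a variance ratio below one.

```python
# Emulate a short accumulator by rounding the running sum to `bits` mantissa
# bits after every addition, then compare ensemble variance to exact sums.
import numpy as np

def low_precision_sums(data, bits):
    """Accumulate each row of `data` left to right in a `bits`-mantissa accumulator."""
    acc = np.zeros(data.shape[0])
    for t in data.T:                      # one addend per step, vectorized over the ensemble
        m, e = np.frexp(acc + t)
        acc = np.ldexp(np.round(m * 2.0**bits) / 2.0**bits, e)
    return acc

rng = np.random.default_rng(0)
n, trials = 2048, 200                     # accumulation length, ensemble size (assumed)
data = rng.normal(loc=1.0, scale=1.0, size=(trials, n))
exact = data.sum(axis=1)
for bits in (8, 12, 16, 23):
    lp = low_precision_sums(data, bits)
    print(f"{bits:2d} mantissa bits: variance ratio vs. exact = {np.var(lp) / np.var(exact):.3f}")
```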
no code implementations • ICLR 2019 • Charbel Sakr, Naresh Shanbhag
The high computational and parameter complexity of neural networks makes them very slow to train and difficult to deploy on energy- and storage-constrained computing systems.
no code implementations • ICML 2017 • Charbel Sakr, Yongjune Kim, Naresh Shanbhag
We focus on numerical precision – a key parameter defining the complexity of neural networks.
no code implementations • 3 Jul 2016 • Charbel Sakr, Ameya Patil, Sai Zhang, Yongjune Kim, Naresh Shanbhag
Lower bounds on the data precision are derived in terms of the desired classification accuracy and the precision of the hyperparameters used in the classifier.
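A hedged illustration of the kind of trade-off such bounds characterize; the synthetic data, fixed linear classifier, and uniform data quantizer below are all chosen for the example and are not taken from the paper:

```python
# Sweep the number of bits used to represent the input data for a fixed linear
# classifier and observe how classification accuracy degrades at low precision.
import numpy as np

rng = np.random.default_rng(0)
n, d = 5000, 32
w_true = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = np.sign(X @ w_true)                            # labels from a linear rule

w_hat = w_true + 0.05 * rng.standard_normal(d)     # stand-in for a trained classifier

def quantize(x, bits, x_max=4.0):
    """Uniformly quantize x to `bits` bits over [-x_max, x_max]."""
    step = 2 * x_max / (2**bits - 1)
    return np.round(np.clip(x, -x_max, x_max) / step) * step

for bits in (2, 3, 4, 6, 8):
    acc = np.mean(np.sign(quantize(X, bits) @ w_hat) == y)
    print(f"data precision = {bits} bits -> accuracy = {acc:.3f}")
```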