Search Results for author: Charbel Sakr

Found 8 papers, 1 paper with code

EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation

no code implementations 28 Oct 2024 Shih-Yang Liu, Huck Yang, Chien-Yi Wang, Nai Chit Fung, Hongxu Yin, Charbel Sakr, Saurav Muralidharan, Kwang-Ting Cheng, Jan Kautz, Yu-Chiang Frank Wang, Pavlo Molchanov, Min-Hung Chen

In this work, we reformulate the model compression problem as a customized compensation problem: given a compressed model, we introduce residual low-rank paths that compensate for compression errors under user-specific requirements (e.g., tasks, compression ratios). This gives greater flexibility in adjusting overall capacity without being constrained by specific compression formats.

ARC, Math +3
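To make the compensation idea concrete, the sketch below shows one plausible way to build a residual low-rank path in NumPy: take the residual between the original and compressed weights and keep its top singular directions. This is only an illustration under simplifying assumptions; it uses a plain SVD rather than the eigenspace (activation-aware) projection the paper's title refers to, and the helper name low_rank_compensation and the coarse rounding used as a stand-in for compression are hypothetical.

```python
import numpy as np

def low_rank_compensation(W, W_c, rank):
    """Rank-r SVD approximation of the compression residual W - W_c.

    Returns factors (B, A) such that W_c + B @ A approximates W.
    """
    residual = W - W_c
    U, S, Vt = np.linalg.svd(residual, full_matrices=False)
    B = U[:, :rank] * S[:rank]     # (out_dim, rank)
    A = Vt[:rank, :]               # (rank, in_dim)
    return B, A

# Toy example: a random weight matrix and a crudely "compressed" copy.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512))
W_c = np.round(W * 4) / 4          # coarse rounding as a stand-in for compression
B, A = low_rank_compensation(W, W_c, rank=32)

err_before = np.linalg.norm(W - W_c)
err_after = np.linalg.norm(W - (W_c + B @ A))
print(err_before, err_after)       # the low-rank path shrinks the residual
```

The factors B and A can then ride alongside the compressed weight as a residual path, in the spirit of a LoRA-style adapter, which is what allows the compensation to be adjusted without touching the underlying compression format.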

ESPACE: Dimensionality Reduction of Activations for Model Compression

no code implementations 7 Oct 2024 Charbel Sakr, Brucek Khailany

The activation-centrality of the approach enables retraining LLMs with no loss of expressivity, while at inference the weight decomposition is obtained as a byproduct of matrix multiplication associativity.

Dimensionality Reduction, Model Compression +1
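The associativity remark can be illustrated in a few lines. In the sketch below, P is a random orthonormal projector standing in for whatever activation-calibrated projection the method actually derives; the point is only that X P Pᵀ W (projected activations with a full-rank weight, as during retraining) equals (X P)(Pᵀ W) (reduced activations times a smaller, pre-folded weight, as at inference).

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, k = 1024, 1024, 256          # hidden dim, output dim, reduced dim

X = rng.standard_normal((8, d))    # a batch of activations
W = rng.standard_normal((d, n))    # original layer weight

# Hypothetical projector: a random orthonormal basis. The actual method would
# calibrate this from activation statistics.
P, _ = np.linalg.qr(rng.standard_normal((d, k)))

# Training-time view: only the activations are projected; W stays full-rank,
# so the layer keeps its expressivity.
Y_train = (X @ P @ P.T) @ W

# Inference-time view: associativity folds the projector into the weight once,
# leaving a (k x n) matrix in place of the (d x n) original.
W_reduced = P.T @ W
Y_infer = (X @ P) @ W_reduced

print(np.allclose(Y_train, Y_infer))   # True: same computation, smaller weight
```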

Optimal Clipping and Magnitude-aware Differentiation for Improved Quantization-aware Training

1 code implementation 13 Jun 2022 Charbel Sakr, Steve Dai, Rangharajan Venkatesan, Brian Zimmer, William J. Dally, Brucek Khailany

Data clipping is crucial in reducing noise in quantization operations and improving the achievable accuracy of quantization-aware training (QAT).

Quantization
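The trade-off behind clipping can be sketched quickly: a tighter clipping threshold shrinks the quantization step (less rounding noise) but distorts the tails (more clipping noise), so there is an MSE-minimizing threshold in between. The brute-force sweep below only illustrates that optimum; the paper derives it analytically rather than by search, and the quantize helper is an assumed symmetric uniform quantizer, not the paper's implementation.

```python
import numpy as np

def quantize(x, clip, bits):
    """Symmetric uniform quantization of x restricted to [-clip, clip]."""
    levels = 2 ** (bits - 1) - 1
    step = clip / levels
    return np.round(np.clip(x, -clip, clip) / step) * step

# Sweep candidate clipping thresholds for a Gaussian-like tensor and pick the
# one with minimum quantization MSE. Small clip -> clipping noise dominates;
# large clip -> rounding noise dominates.
rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)       # stand-in for a tensor of activations
bits = 4

candidates = np.linspace(0.5, 5.0, 50)
mse = [np.mean((x - quantize(x, c, bits)) ** 2) for c in candidates]
best = candidates[int(np.argmin(mse))]
print(f"MSE-optimal clipping threshold ~ {best:.2f}")
```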

HarDNN: Feature Map Vulnerability Evaluation in CNNs

no code implementations 22 Feb 2020 Abdulrahman Mahmoud, Siva Kumar Sastry Hari, Christopher W. Fletcher, Sarita V. Adve, Charbel Sakr, Naresh Shanbhag, Pavlo Molchanov, Michael B. Sullivan, Timothy Tsai, Stephen W. Keckler

As Convolutional Neural Networks (CNNs) are increasingly being employed in safety-critical applications, it is important that they behave reliably in the face of hardware errors.

Decision Making

Accumulation Bit-Width Scaling For Ultra-Low Precision Training Of Deep Networks

no code implementations ICLR 2019 Charbel Sakr, Naigang Wang, Chia-Yu Chen, Jungwook Choi, Ankur Agrawal, Naresh Shanbhag, Kailash Gopalakrishnan

Observing that a bad choice of accumulation precision results in loss of information that manifests as reduced variance across an ensemble of partial sums, we derive a set of equations relating this variance to the accumulation length and the minimum number of bits needed for accumulation.
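That observation can be reproduced with a small simulation, shown below as an illustrative sketch rather than the paper's analysis: every partial sum is rounded to a deliberately narrow significand, so once the accumulator grows, small terms are swamped and the ensemble variance of the final sums drops well below that of full-precision accumulation. The chop rounding model and all sizes are assumptions chosen to make the effect visible.

```python
import numpy as np

def chop(x, mantissa_bits):
    """Round a float to a significand of `mantissa_bits` bits (same exponent)."""
    m, e = np.frexp(x)                 # x = m * 2**e with 0.5 <= |m| < 1
    scale = 2.0 ** mantissa_bits
    return np.ldexp(np.round(m * scale) / scale, e)

def accumulate(terms, mantissa_bits):
    """Running sum in which every partial sum is rounded to low precision."""
    acc = 0.0
    for t in terms:
        acc = chop(acc + t, mantissa_bits)   # small terms get swamped once acc is large
    return acc

# Ensemble of long dot-product-style accumulations, full precision vs. a very
# narrow accumulator. The reduced spread of the narrow sums is the variance
# loss the paper ties to accumulation length and accumulator bit-width.
rng = np.random.default_rng(0)
ensemble = rng.standard_normal((100, 8192))

full = ensemble.sum(axis=1)
narrow = np.array([accumulate(row, mantissa_bits=3) for row in ensemble])
print(np.var(full), np.var(narrow))    # narrow accumulation shows much less variance
```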

Per-Tensor Fixed-Point Quantization of the Back-Propagation Algorithm

no code implementations ICLR 2019 Charbel Sakr, Naresh Shanbhag

The high computational and parameter complexity of neural networks makes their training very slow and makes them difficult to deploy on energy- and storage-constrained computing systems.

Quantization

Analytical Guarantees on Numerical Precision of Deep Neural Networks

no code implementations ICML 2017 Charbel Sakr, Yongjune Kim, Naresh Shanbhag

We focus on numerical precision – a key parameter defining the complexity of neural networks.

Understanding the Energy and Precision Requirements for Online Learning

no code implementations 3 Jul 2016 Charbel Sakr, Ameya Patil, Sai Zhang, Yongjune Kim, Naresh Shanbhag

Lower bounds on the data precision are derived in terms of the desired classification accuracy and the precision of the hyperparameters used in the classifier.

General Classification
