Search Results for author: Brian Zimmer

Found 4 papers, 1 paper with code

Optimal Clipping and Magnitude-aware Differentiation for Improved Quantization-aware Training

no code implementations • 13 Jun 2022 • Charbel Sakr, Steve Dai, Rangharajan Venkatesan, Brian Zimmer, William J. Dally, Brucek Khailany

Data clipping is crucial in reducing noise in quantization operations and improving the achievable accuracy of quantization-aware training (QAT).

Quantization

VS-Quant: Per-vector Scaled Quantization for Accurate Low-Precision Neural Network Inference

no code implementations • 8 Feb 2021 • Steve Dai, Rangharajan Venkatesan, Haoxing Ren, Brian Zimmer, William J. Dally, Brucek Khailany

4-bit weights and 8-bit activations achieve near-full-precision accuracy for both BERT-base and BERT-large on SQuAD while reducing area by 26% compared to an 8-bit baseline.

Math, Quantization

Analog/Mixed-Signal Hardware Error Modeling for Deep Learning Inference

1 code implementation • Design Automation Conference (DAC) 2019 • Angad S. Rekhi, Brian Zimmer, Nikola Nedovic, Ningxi Liu, Rangharajan Venkatesan, Miaorong Wang, Brucek Khailany, William J. Dally, C. Thomas Gray

We also introduce an energy model to predict the requirements of high-accuracy AMS hardware running large networks and use it to show that for ADC-dominated designs, there is a direct tradeoff between energy efficiency and network accuracy.
