Search Results for author: Jeffrey L. McKinstry

Found 5 papers, 1 paper with code

Efficient and Effective Methods for Mixed Precision Neural Network Quantization for Faster, Energy-efficient Inference

no code implementations • 30 Jan 2023 • Deepika Bablani, Jeffrey L. McKinstry, Steven K. Esser, Rathinakumar Appuswamy, Dharmendra S. Modha

Using EAGL and ALPS for layer precision selection, full-precision accuracy is recovered with a mix of 4-bit and 2-bit layers for ResNet-50, ResNet-101 and BERT-base transformer networks, demonstrating enhanced performance across the entire accuracy-throughput frontier.

Efficient Neural Network • Quantization
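The snippet above describes selecting a per-layer bit width (4-bit vs. 2-bit) so that accuracy is preserved under a compute budget. As a rough illustration of that idea, here is a greedy layer-precision selection sketch; the layer names, sensitivity scores, and costs are made-up stand-ins, and this is not the paper's EAGL/ALPS algorithm:

```python
# Hypothetical per-layer data: sensitivity scores and costs below are
# invented stand-ins for metrics like the paper's EAGL and ALPS.
layers = ["conv1", "conv2", "conv3", "fc"]
# loss_increase[layer][bits]: estimated accuracy loss at that precision
loss_increase = {
    "conv1": {2: 0.90, 4: 0.10},
    "conv2": {2: 0.20, 4: 0.05},
    "conv3": {2: 0.15, 4: 0.04},
    "fc":    {2: 0.30, 4: 0.08},
}
cost = {"conv1": 4.0, "conv2": 2.0, "conv3": 2.0, "fc": 1.0}  # relative compute

def select_precisions(budget):
    """Greedy mixed-precision selection (sketch, not the paper's method).

    Start every layer at 4-bit; demote layers to 2-bit in order of least
    added loss per unit of cost saved, until the budget is met. A 2-bit
    layer is assumed to cost half as much as a 4-bit one.
    """
    bits = {name: 4 for name in layers}
    total = sum(cost[n] for n in layers)  # all layers at 4-bit
    candidates = sorted(
        layers,
        key=lambda n: (loss_increase[n][2] - loss_increase[n][4]) / (cost[n] / 2),
    )
    for name in candidates:
        if total <= budget:
            break
        bits[name] = 2
        total -= cost[name] / 2  # halve this layer's cost
    return bits, total

print(select_precisions(budget=7.0))
# prints ({'conv1': 4, 'conv2': 2, 'conv3': 2, 'fc': 4}, 7.0)
```

The greedy ordering by loss-per-cost-saved keeps the most sensitive, most expensive layer (here `conv1`) at 4-bit while demoting the cheapest-to-quantize layers first.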

Learned Step Size Quantization

8 code implementations • ICLR 2020 • Steven K. Esser, Jeffrey L. McKinstry, Deepika Bablani, Rathinakumar Appuswamy, Dharmendra S. Modha

Deep networks run with low precision operations at inference time offer power and space advantages over high precision alternatives, but need to overcome the challenge of maintaining high accuracy as precision decreases.

Model Compression • Quantization
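Learned Step Size Quantization (LSQ) treats each quantizer's step size (scale) as a trainable parameter. A minimal numpy sketch of the LSQ forward pass is below; the full method also learns `s` by backpropagation, with a straight-through estimator for the rounding step, which is omitted here:

```python
import numpy as np

def lsq_quantize(x, s, bits=4, signed=True):
    """LSQ-style forward pass (sketch): quantize x with step size s.

    Signed b-bit range: Qn = -2**(b-1), Qp = 2**(b-1) - 1.
    In full LSQ, s is a learned parameter; its gradient flows through
    the round() via a straight-through estimator (not shown here).
    """
    if signed:
        qn, qp = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    else:
        qn, qp = 0, 2 ** bits - 1
    v = np.clip(np.round(x / s), qn, qp)  # snap to the integer grid
    return v * s                          # dequantize back to real values

x = np.array([-1.3, -0.2, 0.05, 0.7, 2.4])
print(lsq_quantize(x, s=0.5, bits=2))
# prints [-1.   0.   0.   0.5  0.5]
```

With 2 signed bits the grid is {-2, -1, 0, 1} scaled by `s`, so values beyond the range (like 2.4 here) saturate at the largest level rather than overflowing.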

Discovering Low-Precision Networks Close to Full-Precision Networks for Efficient Embedded Inference

no code implementations • ICLR 2019 • Jeffrey L. McKinstry, Steven K. Esser, Rathinakumar Appuswamy, Deepika Bablani, John V. Arthur, Izzet B. Yildiz, Dharmendra S. Modha

Therefore, we (a) reduce solution distance by starting with pretrained fp32 precision baseline networks and fine-tuning, and (b) combat gradient noise introduced by quantization by training longer and reducing learning rates.

General Classification • Quantization
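The two-part recipe in the snippet, (a) fine-tune from a pretrained fp32 baseline and (b) train longer with a reduced learning rate to combat quantization gradient noise, can be illustrated on a toy one-parameter problem. The loss, grid scale, and hyperparameters below are invented for the example; the straight-through estimator (gradient of the quantizer taken as 1) is a standard choice, not necessarily the paper's exact formulation:

```python
import numpy as np

def quantize(w, s=0.25, qn=-8, qp=7):
    # Fixed-grid 4-bit signed weight quantizer with scale s (illustrative).
    return np.clip(np.round(w / s), qn, qp) * s

# Toy loss L(w) = (w - 0.75)**2, whose optimum lies on the quantized grid.
w = 0.6          # (a) start near the optimum, as a pretrained fp32 weight would
lr = 0.01        # (b) reduced learning rate to damp quantization noise
for step in range(2000):       # (b) train longer than usual
    wq = quantize(w)           # quantize in the forward pass
    grad = 2 * (wq - 0.75)     # dL/dwq for the toy loss
    w -= lr * grad             # straight-through estimator: dwq/dw ≈ 1
print(quantize(w))
# prints 0.75
```

Because the latent fp32 weight starts close to the solution, only a few small straight-through updates are needed before the quantized weight lands on the full-precision optimum and stays there.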
