Search Results for author: Paulius Micikevicius

Found 9 papers, 8 papers with code

Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation

2 code implementations • 20 Apr 2020 • Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev, Paulius Micikevicius

Quantization techniques can reduce the size of Deep Neural Networks and improve inference latency and throughput by taking advantage of high-throughput integer instructions.

Math, Quantization

Accelerating Sparse Deep Neural Networks

2 code implementations • 16 Apr 2021 • Asit Mishra, Jorge Albericio Latorre, Jeff Pool, Darko Stosic, Dusan Stosic, Ganesh Venkatesh, Chong Yu, Paulius Micikevicius

We present the design and behavior of Sparse Tensor Cores, which exploit a 2:4 (50%) sparsity pattern that leads to twice the math throughput of dense matrix units.

Math
