26 Feb 2020 • Maximilian Lam, Zachary Yedidia, Colby Banbury, Vijay Janapa Reddi
We present PrecisionBatching, a quantized inference algorithm that speeds up low-bitwidth neural network execution on traditional hardware platforms without requiring retraining or recalibration.