no code implementations • 11 Oct 2023 • Ravit Sharma, Wojciech Romaszkan, Feiqian Zhu, Puneet Gupta, Ankur Mehta
We perform this hardware/software co-design from the cost, latency, and user-experience perspectives, and develop a set of guidelines for optimal system design and model deployment on the most cost-constrained platforms.
no code implementations • 1 Mar 2021 • Shurui Li, Wojciech Romaszkan, Alexander Graening, Puneet Gupta
Quantization is spearheading the increase in performance and efficiency of neural network computing systems, making headway into commodity hardware.