no code implementations • 7 Jan 2019 • Lingchuan Meng, John Brothers
Quantized neural networks effectively reduce model size and improve inference speed, which has led to a wide variety of kernels and hardware accelerators that operate on integer data.
no code implementations • 27 Jul 2018 • Jin Hee Kim, Brett Grady, Ruolong Lian, John Brothers, Jason H. Anderson
A deep-learning inference accelerator is synthesized from a C-language software program parallelized with Pthreads.