no code implementations • 4 Apr 2024 • Andrew Lavin
Hence researchers switched their search objective from arithmetic complexity to latency, producing a new wave of models that performed better.
5 code implementations • CVPR 2016 • Andrew Lavin, Scott Gray
The algorithms compute minimal-complexity convolution over small tiles, which makes them fast with small filters and small batch sizes.
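
As an illustration of what "minimal complexity over a small tile" can look like, here is a minimal sketch. The snippet above does not name the algorithm, so this assumes a Winograd-style minimal filtering scheme, F(2,3), in one dimension: two outputs of a 3-tap filter computed from a 4-element input tile with 4 multiplications instead of 6. The function names `direct_f23` and `winograd_f23` are hypothetical and chosen only for this example.

```python
def direct_f23(d, g):
    """Reference: two outputs of a 3-tap filter over a 4-element tile.

    Uses 6 multiplications (3 per output).
    """
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Same two outputs using 4 multiplications (assumed Winograd F(2,3))."""
    # Filter transform: depends only on g, so it can be computed once
    # and reused for every input tile.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # The four element-wise products on the transformed input tile.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3

    # Output transform: additions and subtractions only.
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    tile = [1.0, 2.0, 3.0, 4.0]
    taps = [1.0, 0.0, -1.0]
    print(direct_f23(tile, taps), winograd_f23(tile, taps))
```

The saving comes from trading multiplications for additions; in a convolution layer the same idea would be applied per tile, with the filter transform amortized across all tiles that share the filter.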
1 code implementation • 27 Jan 2015 • Andrew Lavin
This paper describes maxDNN, a computationally efficient convolution kernel for deep learning with the NVIDIA Maxwell GPU.