Communication Lower Bound in Convolution Accelerators

8 Nov 2019 · Xiaoming Chen, Yinhe Han, Yu Wang

In current convolutional neural network (CNN) accelerators, communication (i.e., memory access) dominates the energy consumption. This work provides a comprehensive analysis and methodologies to minimize communication in CNN accelerators. For off-chip communication, we derive the theoretical lower bound for any convolutional layer and propose a dataflow that reaches this bound, a fundamental problem left unsolved by prior studies. On-chip communication is minimized through an elaborate workload and storage mapping scheme. In addition, we design a communication-optimal CNN accelerator architecture. Evaluations based on a 65 nm technology demonstrate that the proposed architecture nearly reaches the theoretical minimum communication in a three-level memory hierarchy and that it is computation dominant. The gap between the energy efficiency of our accelerator and the theoretical best value is only 37-87%.
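As a rough illustration of how such a bound behaves (this is not the paper's derivation, which is convolution-specific and tighter), the sketch below counts the multiply-accumulates (MACs) of a convolutional layer and evaluates a generic Hong-Kung-style lower bound of the form #MACs / sqrt(S) on off-chip traffic, where S is the on-chip buffer capacity in words. The layer shape and buffer sizes are hypothetical examples.

```python
# Illustrative sketch only: a generic red-blue-pebble-style communication
# bound, not the exact lower bound derived in the paper.
import math

def conv_macs(n, c, k, h, w, r, s):
    """Total MACs for a conv layer: batch n, input channels c,
    output channels k, output map h x w, kernel r x s."""
    return n * c * k * h * w * r * s

def hong_kung_style_bound(macs, buffer_words):
    """Generic Omega(#MACs / sqrt(S)) estimate of off-chip traffic
    for on-chip buffer capacity S (in words)."""
    return macs / math.sqrt(buffer_words)

# Hypothetical ResNet-like layer.
macs = conv_macs(n=1, c=256, k=256, h=14, w=14, r=3, s=3)
for buf in (16 * 1024, 64 * 1024, 256 * 1024):  # buffer sizes in words
    bound = hong_kung_style_bound(macs, buf)
    print(f"buffer = {buf // 1024:3d} Kwords -> traffic >= ~{bound / 1e6:.2f} Mwords")
```

The point of the sketch is the scaling: quadrupling the on-chip buffer only halves the unavoidable off-chip traffic, which is why a dataflow that actually attains the lower bound for a given buffer size matters.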


Categories

Distributed, Parallel, and Cluster Computing · Hardware Architecture
