A Transistor Operations Model for Deep Learning Energy Consumption Scaling Law

30 May 2022 · Chen Li, Antonios Tsourdos, Weisi Guo

Deep Learning (DL) has transformed automation across a wide range of industries and is increasingly ubiquitous in society. The high complexity of DL models and their widespread adoption have led to their global energy consumption doubling every 3-4 months. Currently, the relationship between DL model configuration and energy consumption is not well established. At the level of a general computational energy model, there is a strong dependency on both the hardware architecture (e.g., generic processors with different configurations of inner components, such as CPUs and GPUs, and programmable integrated circuits, such as FPGAs) and the different interacting aspects of energy consumption (e.g., data movement, calculation, control). At the DL model level, we need to translate non-linear activation functions and their interaction with data into calculation tasks. Current methods mainly linearize non-linear DL models to approximate their theoretical FLOPs and MACs as a proxy for energy consumption. Yet this is inaccurate (est. 93% accuracy) due to the highly non-linear nature of many models, for example convolutional neural networks (CNNs). In this paper, we develop a bottom-level Transistor Operations (TOs) method to expose the role of non-linear activation functions and neural network structure in energy consumption. We translate a range of feedforward and CNN models into ALU calculation tasks and then into TO steps. These are then statistically linked to real energy consumption values via a regression model for different hardware configurations and data sets. We show that our proposed TOs method can achieve 93.61% to 99.51% precision in predicting energy consumption.
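The pipeline described in the abstract (model structure, then calculation tasks, then an operation count, then a regression against measured energy) can be illustrated with a minimal sketch. It is not the authors' implementation. The per-operation weights `COST_MUL`, `COST_ADD` and `COST_ACT`, the helper names, and the choice of a simple linear least-squares fit are all assumptions made for illustration. The abstract does not state the actual transistor-level costs or the form of the regression model.

```python
import numpy as np

# Hypothetical relative costs per operation (NOT taken from the paper):
# one multiply, one add, one activation evaluation, in arbitrary units.
COST_MUL, COST_ADD, COST_ACT = 3.0, 1.0, 5.0


def dense_layer_ops(n_in, n_out):
    """Weighted operation count for one fully connected layer with bias."""
    muls = n_in * n_out          # one multiply per weight
    adds = n_in * n_out          # (n_in - 1) accumulations + 1 bias add per output
    acts = n_out                 # one activation evaluation per output
    return COST_MUL * muls + COST_ADD * adds + COST_ACT * acts


def network_ops(layer_sizes):
    """Sum the weighted count over consecutive layers, e.g. [784, 128, 10]."""
    return sum(dense_layer_ops(a, b)
               for a, b in zip(layer_sizes, layer_sizes[1:]))


def fit_energy_model(op_counts, energies):
    """Least-squares fit of energy = slope * ops + intercept."""
    ops = np.asarray(op_counts, dtype=float)
    design = np.column_stack([ops, np.ones(len(ops))])
    (slope, intercept), *_ = np.linalg.lstsq(
        design, np.asarray(energies, dtype=float), rcond=None)
    return float(slope), float(intercept)
```

In use, one would compute `network_ops` for several model configurations, measure the energy of each on a given hardware platform, and pass both lists to `fit_energy_model`. The fitted coefficients would apply only to that hardware configuration and data set, in line with the abstract's note that the regression is performed per configuration.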
