A Co-design view of Compute in-Memory with Non-Volatile Elements for Neural Networks

Deep Learning neural networks are pervasive, but traditional computer architectures are reaching the limits of being able to efficiently execute them for the large workloads of today. They are limited by the von Neumann bottleneck: the high cost in energy and latency incurred in moving data between memory and the compute engine. Today, special CMOS designs address this bottleneck. The next generation of computing hardware will need to eliminate or dramatically mitigate this bottleneck. We discuss how compute-in-memory can play an important part in this development. Here, a non-volatile memory based cross-bar architecture forms the heart of an engine that uses an analog process to parallelize the matrix vector multiplication operation, repeatedly used in all neural network workloads. The cross-bar architecture, at times referred to as a neuromorphic approach, can be a key hardware element in future computing machines. In the first part of this review we take a co-design view of the design constraints and the demands it places on the new materials and memory devices that anchor the cross-bar architecture. In the second part, we review what is knows about the different new non-volatile memory materials and devices suited for compute in-memory, and discuss the outlook and challenges.

PDF Abstract
No code implementations yet. Submit your code now

Tasks


Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here