no code implementations • 2 Apr 2019 • Sangkug Lym, Donghyuk Lee, Mike O'Connor, Niladrish Chatterjee, Mattan Erez
Training convolutional neural networks (CNNs) requires intense compute throughput and high memory bandwidth.
no code implementations • 6 Mar 2019 • Esha Choukse, Michael Sullivan, Mike O'Connor, Mattan Erez, Jeff Pool, David Nellans, Steve Keckler
However, GPU device memory tends to be relatively small and the memory capacity can not be increased by the user.
Hardware Architecture
no code implementations • 3 May 2017 • Minsoo Rhu, Mike O'Connor, Niladrish Chatterjee, Jeff Pool, Stephen W. Keckler
Popular deep learning frameworks require users to fine-tune their memory usage so that the training data of a deep neural network (DNN) fits within the GPU physical memory.