1 code implementation • 5 Apr 2020 • Sicong Zhuang, Cristiano Malossi, Marc Casas
This paper reduces the cost of DNN training by decreasing the amount of data movement across heterogeneous architectures composed of several GPUs and multicore CPU devices.
Distributed, Parallel, and Cluster Computing