no code implementations • 27 May 2021 • Kaixin Zhang, Hongzhi Wang, Han Hu, Songling Zou, Jiye Qiu, Tongxin Li, Zhishun Wang
In this paper, we present TENSILE, a method that manages GPU memory at tensor granularity to reduce peak GPU memory usage while accounting for multiple dynamic workloads.