1 code implementation • 19 Oct 2022 • Muralidhar Andoorveedu, Zhanda Zhu, Bojian Zheng, Gennady Pekhimenko
We implement Tempo and evaluate the throughput, memory usage, and accuracy/loss on the BERT Large pre-training task.
2 code implementations • 18 Oct 2022 • Yaoyao Ding, Cody Hao Yu, Bojian Zheng, Yizhi Liu, Yida Wang, Gennady Pekhimenko
With the proposed paradigm, we implement a deep learning compiler, Hidet.
no code implementations • 22 May 2018 • Bojian Zheng, Abhishek Tiwari, Nandita Vijaykumar, Gennady Pekhimenko
For each feature map recomputation to be effective and efficient, its effect on (1) the total memory footprint and (2) the total execution time has to be carefully estimated.
no code implementations • 16 Mar 2018 • Hongyu Zhu, Mohamed Akrout, Bojian Zheng, Andrew Pelegris, Amar Phanishayee, Bianca Schroeder, Gennady Pekhimenko
Our primary goal in this work is to break this myopic view by (i) proposing a new benchmark for DNN training, called TBD (short for Training Benchmark for DNNs), that uses a representative set of DNN models covering a wide range of machine learning applications (image classification, machine translation, speech recognition, object detection, adversarial networks, and reinforcement learning), and (ii) performing an extensive performance analysis of training these different applications on three major deep learning frameworks (TensorFlow, MXNet, CNTK) across different hardware configurations (single-GPU, multi-GPU, and multi-machine).