8 Mar 2022 • Seung-Hun Chung, Tarek S. Abdelrahman
We improve the quality of the generated hardware with optimizations applied to the base OpenCL kernels generated by TVM.
29 Dec 2019 • Lifu Zhang, Tarek S. Abdelrahman
We use 4 CNNs (LeNet-5, AlexNet, VGG and ResNet) and show that when pipelining is limited to early layers in a network, training with stale weights converges and results in models with inference accuracies comparable to those of non-pipelined training on the MNIST and CIFAR-10 datasets: a drop in accuracy of 0.4%, 4%, 0.83% and 1.45% for the 4 networks, respectively.
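The core idea above — letting early pipeline stages run their forward pass with weights that lag a few update steps behind — can be sketched on a toy two-stage model. This is an illustrative reconstruction, not the authors' implementation: the scalar "network", the `staleness` buffer, and all hyperparameters are assumptions chosen only to show that training can still converge under bounded staleness.

```python
import numpy as np

# Toy two-stage "network": early-stage weight w1, later-stage weight w2
# (scalars for clarity). In pipelined training with stale weights, the
# early stage computes its forward pass with a copy of w1 that is
# `staleness` update steps old, while the later stage uses fresh weights.
def train(staleness, steps=200, lr=0.1):
    rng = np.random.default_rng(0)
    w1, w2 = 0.5, 0.5                   # current (fresh) weights
    stale_w1 = [w1] * (staleness + 1)   # FIFO buffer of past w1 values
    target = 2.0                        # fit y = target * x via y_hat = w2*(w1*x)
    for _ in range(steps):
        x = rng.uniform(0.5, 1.5)
        y = target * x
        w1_used = stale_w1[0]           # early stage sees stale weights
        h = w1_used * x
        y_hat = w2 * h
        err = y_hat - y
        # Gradients of 0.5*err**2 w.r.t. each stage's weights
        g2 = err * h
        g1 = err * w2 * x
        w2 -= lr * g2
        w1 -= lr * g1
        stale_w1 = stale_w1[1:] + [w1]  # advance the staleness window
    return w1 * w2                      # effective end-to-end gain

fresh = train(staleness=0)   # baseline: non-pipelined (no staleness)
stale = train(staleness=3)   # bounded staleness in the early stage
```

With modest staleness both runs converge near the target gain of 2.0, mirroring the paper's observation that confining staleness to early layers costs little accuracy; large staleness or learning rates can destabilize this scheme.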
NIPS Workshop CDNNRIA 2018 • Amir H. Ashouri, Tarek S. Abdelrahman, Alwyn Dos Remedios
Our methods are applied on-the-fly and require no retraining.