Search Results for author: Omar Mohamed Awad

Found 6 papers, 0 papers with code

SkipViT: Speeding Up Vision Transformers with a Token-Level Skip Connection

no code implementations • 27 Jan 2024 • Foozhan Ataiefard, Walid Ahmed, Habib Hajimolahoseini, Saina Asani, Farnoosh Javadi, Mohammad Hassanpour, Omar Mohamed Awad, Austin Wen, Kangling Liu, Yang Liu

Our method does not add any parameters to the ViT model and aims to find the best trade-off between improving training throughput and incurring 0% loss in the Top-1 accuracy of the final model.
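As a rough illustration of the token-skipping idea, here is a minimal PyTorch sketch that routes low-importance tokens around a transformer block and passes the rest through unchanged. The class name, the token-norm importance heuristic, and the keep ratio are all assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SkipTokenBlock(nn.Module):
    """Route low-importance tokens around an inner transformer block."""

    def __init__(self, block: nn.Module, keep_ratio: float = 0.75):
        super().__init__()
        self.block = block          # any module mapping (B, N, D) -> (B, N, D)
        self.keep_ratio = keep_ratio

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, N, D = x.shape
        k = max(1, int(N * self.keep_ratio))
        scores = x.norm(dim=-1)                  # assumed importance: token L2 norm
        keep = scores.topk(k, dim=1).indices     # (B, k) indices of kept tokens
        idx = keep.unsqueeze(-1).expand(-1, -1, D)
        kept = torch.gather(x, 1, idx)           # gather the tokens worth computing
        out = x.clone()                          # skipped tokens take the shortcut
        out.scatter_(1, idx, self.block(kept))   # merge processed tokens back in
        return out
```

Wrapping, say, nn.TransformerEncoderLayer(d_model=D, nhead=8, batch_first=True) in this module leaves the parameter count unchanged, consistent with the paper's claim of adding no parameters.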

SwiftLearn: A Data-Efficient Training Method of Deep Learning Models using Importance Sampling

no code implementations • 25 Nov 2023 • Habib Hajimolahoseini, Omar Mohamed Awad, Walid Ahmed, Austin Wen, Saina Asani, Mohammad Hassanpour, Farnoosh Javadi, Mehdi Ahmadi, Foozhan Ataiefard, Kangling Liu, Yang Liu

In this paper, we present SwiftLearn, a data-efficient approach to accelerate training of deep learning models using a subset of data samples selected during the warm-up stages of training.
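A minimal sketch of the warm-up-then-subset idea, assuming a loss-based importance score and a fixed keep fraction (both are illustrative choices, not necessarily the paper's measure):

```python
import numpy as np

def select_subset(per_sample_losses: np.ndarray, keep_frac: float = 0.5) -> np.ndarray:
    """Return indices of the highest-loss samples recorded during warm-up."""
    k = max(1, int(len(per_sample_losses) * keep_frac))
    return np.argsort(per_sample_losses)[-k:]

# Record each sample's loss over the warm-up epochs, then train the remaining
# epochs only on the selected subset.
warmup_losses = np.random.rand(10_000)          # stand-in for recorded losses
subset = select_subset(warmup_losses, keep_frac=0.3)
```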

GQKVA: Efficient Pre-training of Transformers by Grouping Queries, Keys, and Values

no code implementations • 6 Nov 2023 • Farnoosh Javadi, Walid Ahmed, Habib Hajimolahoseini, Foozhan Ataiefard, Mohammad Hassanpour, Saina Asani, Austin Wen, Omar Mohamed Awad, Kangling Liu, Yang Liu

We tested our method on ViT, which achieved an approximate 0.3% increase in accuracy while reducing the model size by about 4% in the task of image classification.

Image Classification
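The grouping idea can be sketched as attention in which several query heads share each key/value head; the head counts and the repeat-based sharing below are illustrative assumptions, not the paper's exact grouping scheme:

```python
import torch

def grouped_attention(q, k, v):
    """q: (B, Hq, N, d); k, v: (B, Hkv, N, d), with Hq a multiple of Hkv."""
    B, Hq, N, d = q.shape
    Hkv = k.shape[1]
    # Share each key/value head across a group of Hq // Hkv query heads.
    k = k.repeat_interleave(Hq // Hkv, dim=1)
    v = v.repeat_interleave(Hq // Hkv, dim=1)
    attn = torch.softmax(q @ k.transpose(-2, -1) / d ** 0.5, dim=-1)
    return attn @ v

q = torch.randn(2, 8, 16, 32)     # 8 query heads
k = torch.randn(2, 2, 16, 32)     # 2 shared key heads
v = torch.randn(2, 2, 16, 32)     # 2 shared value heads
out = grouped_attention(q, k, v)  # (2, 8, 16, 32)
```

Sharing key/value heads across query groups is where a model-size reduction like the one reported above can come from: fewer K/V projection parameters than one-head-per-query attention.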

Improving Resnet-9 Generalization Trained on Small Datasets

no code implementations • 7 Sep 2023 • Omar Mohamed Awad, Habib Hajimolahoseini, Michael Lim, Gurpreet Gosal, Walid Ahmed, Yang Liu, Gordon Deng

This paper presents our proposed approach that won the first prize at the ICLR competition on Hardware Aware Efficient Training.

Image Classification

FPRaker: A Processing Element For Accelerating Neural Network Training

no code implementations • 15 Oct 2020 • Omar Mohamed Awad, Mostafa Mahmoud, Isak Edo, Ali Hadi Zadeh, Ciaran Bannon, Anand Jayarajan, Gennady Pekhimenko, Andreas Moshovos

We demonstrate that FPRaker can be used to compose an accelerator for training and that it can improve performance and energy efficiency compared to using conventional floating-point units under ISO-compute area constraints.

Quantization
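The toy software model below conveys the flavor of term-serial processing: a significand is decomposed into power-of-two terms that are accumulated one at a time, and terms too small to affect the accumulator are skipped. This is a simplified assumption-laden sketch (the hardware operates on signed terms and fixed accumulator widths), not the FPRaker design.

```python
def pow2_terms(x: int):
    """Decompose a non-negative integer significand into power-of-two exponents."""
    terms, bit = [], 0
    while x:
        if x & 1:
            terms.append(bit)
        x >>= 1
        bit += 1
    return terms

def term_serial_mac(acc: float, a: int, b: float, cutoff: float = 2 ** -20) -> float:
    """Accumulate a * b one power-of-two term of `a` at a time, eliding terms
    whose contribution falls below the accumulator's precision cutoff."""
    for t in pow2_terms(a):
        contrib = b * (2 ** t)
        if abs(contrib) >= cutoff:   # eliding these terms is where the savings come from
            acc += contrib
    return acc

print(term_serial_mac(0.0, 0b1011, 0.5))   # 11 * 0.5 = 5.5
```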

TensorDash: Exploiting Sparsity to Accelerate Deep Neural Network Training and Inference

no code implementations • 1 Sep 2020 • Mostafa Mahmoud, Isak Edo, Ali Hadi Zadeh, Omar Mohamed Awad, Gennady Pekhimenko, Jorge Albericio, Andreas Moshovos

TensorDash is a hardware-level technique for enabling data-parallel MAC units to take advantage of sparsity in their input operand streams.
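A toy model of the idea: only operand pairs in which both values are non-zero produce effectual MAC work, so a sparsity-aware unit can skip the rest. This sketch is purely illustrative; the actual design reschedules operands in hardware rather than branching per element.

```python
def sparse_dot(a, b):
    """Dot product that skips ineffectual (zero-operand) MACs."""
    acc, macs = 0.0, 0
    for x, y in zip(a, b):
        if x != 0 and y != 0:       # a dense unit would spend a cycle here regardless
            acc += x * y
            macs += 1
    return acc, macs

acc, macs = sparse_dot([0, 2, 0, 3], [5, 0, 7, 1])
print(acc, macs)                    # 3.0 and 1 effectual MAC out of 4 positions
```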
