Search Results for author: Mohamed S. Abdelfattah

Found 17 papers, 6 papers with code

Encodings for Prediction-based Neural Architecture Search

1 code implementation • 4 Mar 2024 • Yash Akhauri, Mohamed S. Abdelfattah

Building on our study, we present our predictor FLAN: Flow Attention for NAS.

Neural Architecture Search • Transfer Learning

On Latency Predictors for Neural Architecture Search

1 code implementation • 4 Mar 2024 • Yash Akhauri, Mohamed S. Abdelfattah

We then design a general latency predictor to comprehensively study (1) the predictor architecture, (2) NN sample selection methods, (3) hardware device representations, and (4) NN operation encoding schemes.

Hardware Aware Neural Architecture Search • Meta-Learning • +2

Beyond Inference: Performance Analysis of DNN Server Overheads for Computer Vision

no code implementations • 2 Mar 2024 • Ahmed F. AbouElhamayed, Susanne Balle, Deshanand Singh, Mohamed S. Abdelfattah

Our results consistently demonstrate that end-to-end application performance can easily be dominated by data processing and data movement functions (up to 56% of end-to-end latency in a medium-sized image, and ~80% impact on system throughput in a large image), even though these functions have been conventionally overlooked in deep learning system design.

Depth Estimation • Image Classification

FLIQS: One-Shot Mixed-Precision Floating-Point and Integer Quantization Search

no code implementations • 7 Aug 2023 • Jordan Dotzel, Gang Wu, Andrew Li, Muhammad Umar, Yun Ni, Mohamed S. Abdelfattah, Zhiru Zhang, Liqun Cheng, Martin G. Dixon, Norman P. Jouppi, Quoc V. Le, Sheng Li

With the proposed integer quantization search, we increase the accuracy of ResNet-18 on ImageNet by 1.31% points and ResNet-50 by 0.90% points with equivalent model cost over previous methods.

Quantization

DiviML: A Module-based Heuristic for Mapping Neural Networks onto Heterogeneous Platforms

no code implementations • 31 Jul 2023 • Yassine Ghannane, Mohamed S. Abdelfattah

We evaluate our scheduler in optimizing both conventional DNNs and randomly-wired neural networks, subject to latency and throughput constraints, on a heterogeneous system comprised of a CPU and two distinct GPUs.

Multi-Predict: Few Shot Predictors For Efficient Neural Architecture Search

no code implementations • 4 Jun 2023 • Yash Akhauri, Mohamed S. Abdelfattah

Many hardware-aware neural architecture search (NAS) methods have been developed to optimize the topology of neural networks (NN) with the joint objectives of higher accuracy and lower latency.

Hardware Aware Neural Architecture Search • Meta-Learning • +1

PQA: Exploring the Potential of Product Quantization in DNN Hardware Acceleration

1 code implementation • 25 May 2023 • Ahmed F. AbouElhamayed, Angela Cui, Javier Fernandez-Marques, Nicholas D. Lane, Mohamed S. Abdelfattah

We identify PQ configurations that improve performance-per-area for ResNet20 by up to 3.1×, even when compared to a highly optimized conventional DNN accelerator, with similar improvements on two additional compact DNNs.

Quantization

Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design

no code implementations • 20 Sep 2022 • Hongxiang Fan, Thomas Chau, Stylianos I. Venieris, Royson Lee, Alexandros Kouris, Wayne Luk, Nicholas D. Lane, Mohamed S. Abdelfattah

By jointly optimizing the algorithm and hardware, our FPGA-based butterfly accelerator achieves 14.2 to 23.2 times speedup over state-of-the-art accelerators normalized to the same computational budget.

Logic Shrinkage: Learned FPGA Netlist Sparsity for Efficient Neural Network Inference

1 code implementation • 4 Dec 2021 • Erwei Wang, James J. Davis, Georgios-Ilias Stavrou, Peter Y. K. Cheung, George A. Constantinides, Mohamed S. Abdelfattah

To address these issues, we propose logic shrinkage, a fine-grained netlist pruning methodology enabling K to be automatically learned for every LUT in a neural network targeted for FPGA inference.

Efficient Neural Network

Temporal Kernel Consistency for Blind Video Super-Resolution

no code implementations • 18 Aug 2021 • Lichuan Xiang, Royson Lee, Mohamed S. Abdelfattah, Nicholas D. Lane, Hongkai Wen

Deep learning-based blind super-resolution (SR) methods have recently achieved unprecedented performance in upscaling frames with unknown degradation.

Blind Super-Resolution • Video Super-Resolution

Zero-Cost Operation Scoring in Differentiable Architecture Search

no code implementations • 12 Jun 2021 • Lichuan Xiang, Łukasz Dudziak, Mohamed S. Abdelfattah, Thomas Chau, Nicholas D. Lane, Hongkai Wen

From this perspective, we introduce a novel perturbation-based zero-cost operation scoring (Zero-Cost-PT) approach, which utilizes zero-cost proxies that were recently studied in multi-trial NAS but degrade significantly on larger search spaces, typical for differentiable NAS.

Neural Architecture Search

Zero-Cost Proxies for Lightweight NAS

2 code implementations • ICLR 2021 • Mohamed S. Abdelfattah, Abhinav Mehrotra, Łukasz Dudziak, Nicholas D. Lane

For example, Spearman's rank correlation coefficient between final validation accuracy and our best zero-cost proxy on NAS-Bench-201 is 0.82, compared to 0.61 for EcoNAS (a recently proposed reduced-training proxy).

Neural Architecture Search

Iterative Compression of End-to-End ASR Model using AutoML

no code implementations • 6 Aug 2020 • Abhinav Mehrotra, Łukasz Dudziak, Jinsu Yeo, Young-Yoon Lee, Ravichander Vipperla, Mohamed S. Abdelfattah, Sourav Bhattacharya, Samin Ishtiaq, Alberto Gil C. P. Ramos, SangJeong Lee, Daehyun Kim, Nicholas D. Lane

Increasing demand for on-device Automatic Speech Recognition (ASR) systems has resulted in renewed interest in developing automatic model compression techniques.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +3

BRP-NAS: Prediction-based NAS using GCNs

2 code implementations • NeurIPS 2020 • Łukasz Dudziak, Thomas Chau, Mohamed S. Abdelfattah, Royson Lee, Hyeji Kim, Nicholas D. Lane

What is more, we investigate prediction quality on different metrics and show that sample efficiency of the predictor-based NAS can be improved by considering binary relations of models and an iterative data selection strategy.

Neural Architecture Search

Best of Both Worlds: AutoML Codesign of a CNN and its Hardware Accelerator

no code implementations • 11 Feb 2020 • Mohamed S. Abdelfattah, Łukasz Dudziak, Thomas Chau, Royson Lee, Hyeji Kim, Nicholas D. Lane

We automate HW-CNN codesign using NAS by including parameters from both the CNN model and the HW accelerator, and we jointly search for the best model-accelerator pair that boosts accuracy and efficiency.

General Classification • Image Classification • +2

ShrinkML: End-to-End ASR Model Compression Using Reinforcement Learning

no code implementations • 8 Jul 2019 • Łukasz Dudziak, Mohamed S. Abdelfattah, Ravichander Vipperla, Stefanos Laskaridis, Nicholas D. Lane

Our results show that in the absence of retraining our RL-based search is an effective and practical method to compress a production-grade ASR system.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +6

DLA: Compiler and FPGA Overlay for Neural Network Inference Acceleration

no code implementations • 13 Jul 2018 • Mohamed S. Abdelfattah, David Han, Andrew Bitar, Roberto DiCecco, Shane OConnell, Nitika Shanker, Joseph Chu, Ian Prins, Joshua Fender, Andrew C. Ling, Gordon R. Chiu

Overlays have shown significant promise for field-programmable gate-arrays (FPGAs) as they allow for fast development cycles and remove many of the challenges of the traditional FPGA hardware design flow.

Distributed, Parallel, and Cluster Computing • Hardware Architecture • Signal Processing
