Search Results for author: Amir Yazdanbakhsh

Found 25 papers, 8 papers with code

Data-Driven Offline Optimization For Architecting Hardware Accelerators

1 code implementation ICLR 2022 Aviral Kumar, Amir Yazdanbakhsh, Milad Hashemi, Kevin Swersky, Sergey Levine

An alternative paradigm is to use a "data-driven", offline approach that utilizes logged simulation data to architect hardware accelerators, without needing any form of simulation.
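
A minimal sketch of the offline, data-driven idea: fit a surrogate model on logged (configuration, latency) pairs and rank unseen candidate designs with the surrogate alone, so no new simulations are run during the search. The feature set, data, and model below are illustrative assumptions, not the paper's actual method, which additionally guards against the surrogate being exploited on out-of-distribution designs.

```python
# Toy offline accelerator search: learn a surrogate from logged simulation
# data, then score new candidate configs without running more simulations.
# Config features (PEs, buffer KB, bandwidth GB/s) and latencies are synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Logged dataset: design points and the latencies observed when they were simulated.
configs = rng.integers(low=[8, 64, 16], high=[256, 2048, 256], size=(500, 3))
latency = 1e6 / (configs[:, 0] * np.sqrt(configs[:, 1])) + rng.normal(0, 0.5, 500)

surrogate = RandomForestRegressor(n_estimators=200, random_state=0)
surrogate.fit(configs, latency)

# Offline optimization: rank unseen candidates using the surrogate only.
candidates = rng.integers(low=[8, 64, 16], high=[256, 2048, 256], size=(10_000, 3))
best = candidates[np.argmin(surrogate.predict(candidates))]
print("best candidate (PEs, buffer KB, BW GB/s):", best)
```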

Computer Architecture and Systems

An Evaluation of Edge TPU Accelerators for Convolutional Neural Networks

1 code implementation 20 Feb 2021 Kiran Seshadri, Berkin Akin, James Laudon, Ravi Narayanaswami, Amir Yazdanbakhsh

Then, we extensively evaluate three classes of Edge TPUs, covering different computing ecosystems, that are either currently deployed in Google products or are in the product pipeline, across 423K unique convolutional neural networks.

Learning Performance-Improving Code Edits

2 code implementations 15 Feb 2023 Alexander Shypula, Aman Madaan, Yimeng Zeng, Uri Alon, Jacob Gardner, Milad Hashemi, Graham Neubig, Parthasarathy Ranganathan, Osbert Bastani, Amir Yazdanbakhsh

Next, we propose a broad range of adaptation strategies for code optimization; for prompting, these include retrieval-based few-shot prompting and chain-of-thought, and for finetuning, these include performance-conditioned generation and synthetic data augmentation based on self-play.
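
As a concrete, simplified illustration of the retrieval-based few-shot prompting strategy mentioned above, the sketch below retrieves the most similar (slower, optimized) training pairs for a query program and assembles them into a prompt. The similarity metric and prompt wording are placeholder assumptions, not the paper's exact setup.

```python
# Illustrative retrieval-based few-shot prompt for code optimization:
# retrieve the training pairs most similar to the query program and
# prepend them as (slower -> optimized) demonstrations.
from difflib import SequenceMatcher

def retrieve(query: str, pairs: list[tuple[str, str]], k: int = 2) -> list[tuple[str, str]]:
    """Return the k (slow, fast) pairs whose slow version is most similar to the query."""
    scored = sorted(pairs, key=lambda p: SequenceMatcher(None, query, p[0]).ratio(), reverse=True)
    return scored[:k]

def build_prompt(query: str, pairs: list[tuple[str, str]]) -> str:
    shots = retrieve(query, pairs)
    demo = "\n\n".join(f"# slower version:\n{s}\n# optimized version:\n{f}" for s, f in shots)
    return f"{demo}\n\n# slower version:\n{query}\n# optimized version:\n"

train_pairs = [
    ("total = 0\nfor x in data:\n    total += x", "total = sum(data)"),
    ("out = []\nfor x in data:\n    out.append(x * 2)", "out = [x * 2 for x in data]"),
]
print(build_prompt("s = 0\nfor v in values:\n    s += v", train_pairs))
```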

Code Generation Code Repair +2

GRANITE: A Graph Neural Network Model for Basic Block Throughput Estimation

1 code implementation 8 Oct 2022 Ondrej Sykora, Phitchaya Mangpo Phothilimthana, Charith Mendis, Amir Yazdanbakhsh

In this paper, we introduce GRANITE, a new machine learning model that estimates the throughput of basic blocks across different microarchitectures.
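
To make the graph formulation concrete, here is a generic sketch that turns a small basic block into a data-dependence graph (instructions as nodes, register dependences as edges) and runs one mean-aggregation message-passing step. It is an assumed, simplified illustration, not GRANITE's actual graph schema or network.

```python
# Generic sketch: represent a basic block as a dependence graph and run one
# round of mean-aggregation message passing (not GRANITE's actual model).
import numpy as np

# Basic block: each instruction lists the registers it reads and writes.
block = [
    {"op": "mov",  "reads": [],             "writes": ["rax"]},
    {"op": "add",  "reads": ["rax", "rbx"], "writes": ["rcx"]},
    {"op": "imul", "reads": ["rcx"],        "writes": ["rdx"]},
]

# Edge (i -> j) whenever instruction j reads a register last written by i.
edges, last_writer = [], {}
for j, ins in enumerate(block):
    for reg in ins["reads"]:
        if reg in last_writer:
            edges.append((last_writer[reg], j))
    for reg in ins["writes"]:
        last_writer[reg] = j

# One message-passing step: each node averages its own embedding with its predecessors'.
emb = np.eye(len(block))                      # toy initial node features
agg = emb.copy()
for src, dst in edges:
    agg[dst] += emb[src]
deg = np.array([1 + sum(1 for _, d in edges if d == i) for i in range(len(block))])
updated = agg / deg[:, None]

# A throughput estimate would then be decoded from the pooled node embeddings.
print("pooled embedding:", updated.mean(axis=0))
```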

Multi-Task Learning

Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers

1 code implementation 7 Feb 2024 Abhimanyu Rajeshkumar Bambhaniya, Amir Yazdanbakhsh, Suvinay Subramanian, Sheng-Chun Kao, Shivani Agrawal, Utku Evci, Tushar Krishna

In this work, we study the effectiveness of existing sparse training recipes at high-sparsity regions and argue that these methods fail to sustain the model quality on par with low-sparsity regions.
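
For context, the target sparsity pattern itself is easy to state: an N:M mask keeps the N largest-magnitude weights in every group of M. The helper below builds a standard 2:4 mask; it illustrates the pattern being trained for, not the paper's progressive gradient-flow recipe.

```python
# Standard N:M structured sparsity mask: keep the N largest-magnitude weights
# in every contiguous group of M (here 2:4). Illustrates the target pattern,
# not the paper's training method.
import numpy as np

def nm_mask(weights: np.ndarray, n: int = 2, m: int = 4) -> np.ndarray:
    flat = weights.reshape(-1, m)                       # group along the last axis
    keep = np.argsort(np.abs(flat), axis=1)[:, -n:]     # indices of the n largest per group
    mask = np.zeros_like(flat)
    np.put_along_axis(mask, keep, 1.0, axis=1)
    return mask.reshape(weights.shape)

w = np.random.randn(4, 8)
mask = nm_mask(w)
assert mask.reshape(-1, 4).sum(axis=1).tolist() == [2.0] * 8   # exactly 2 kept per group of 4
sparse_w = w * mask                                     # masked weights used in the forward pass
```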

GANAX: A Unified MIMD-SIMD Acceleration for Generative Adversarial Networks

no code implementations 10 May 2018 Amir Yazdanbakhsh, Hajar Falahati, Philip J. Wolfe, Kambiz Samadi, Nam Sung Kim, Hadi Esmaeilzadeh

Even though there is a convolution stage in this operator, the inserted zeros lead to underutilization of the compute resources when a conventional convolution accelerator is employed.
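
The underutilization is easy to quantify in a toy setting: when a transposed convolution is lowered to a standard convolution over a zero-upsampled input, a large share of multiply-accumulates touch inserted zeros. The 1-D example below just counts that wasted work; sizes and strides are arbitrary.

```python
# Toy illustration of why zero-insertion wastes work on a conventional
# convolution accelerator: upsample a 1-D signal by inserting zeros, then
# count how many multiply-accumulates touch an inserted zero.
import numpy as np

x = np.arange(1, 9, dtype=float)        # 8-element input
stride = 2
up = np.zeros(len(x) * stride)
up[::stride] = x                        # zero-inserted (upsampled) input

k = 3                                   # kernel width
total_macs = (len(up) - k + 1) * k
zero_macs = sum(int(up[i + j] == 0.0) for i in range(len(up) - k + 1) for j in range(k))
print(f"{zero_macs}/{total_macs} MACs multiply by an inserted zero")
# About half the MACs are wasted at stride 2; GANAX-style designs aim to skip them.
```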

ReLeQ: A Reinforcement Learning Approach for Deep Quantization of Neural Networks

no code implementations 5 Nov 2018 Ahmed T. Elthakeb, Prannoy Pilligundla, FatemehSadat Mireshghallah, Amir Yazdanbakhsh, Hadi Esmaeilzadeh

We show how ReLeQ can balance speed and quality, and provide an asymmetric general solution for quantization of a large variety of deep networks (AlexNet, CIFAR-10, LeNet, MobileNet-V1, ResNet-20, SVHN, and VGG-11) that virtually preserves the accuracy (≤ 0.3% loss) while minimizing the computation and storage cost.
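
For context on what the agent is selecting, the helper below performs plain symmetric uniform quantization of a weight tensor at a given bit-width. ReLeQ's contribution is learning the per-layer bit-width; here it is simply a parameter, and the quantizer is a generic sketch rather than the paper's exact scheme.

```python
# Symmetric uniform quantization of a weight tensor to `bits` bits.
# ReLeQ learns which bit-width to assign per layer; here it is just a parameter.
import numpy as np

def quantize(w: np.ndarray, bits: int) -> np.ndarray:
    levels = 2 ** (bits - 1) - 1                        # e.g. 127 for 8 bits, 7 for 4 bits
    scale = np.abs(w).max() / levels                    # assumes w is not all zeros
    return np.clip(np.round(w / scale), -levels, levels) * scale

w = np.random.randn(64, 64).astype(np.float32)
for bits in (8, 4, 2):
    err = np.mean((w - quantize(w, bits)) ** 2)
    print(f"{bits}-bit MSE: {err:.6f}")                 # error grows as bit-width shrinks
```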

Quantization reinforcement-learning +1

Mixed-Signal Charge-Domain Acceleration of Deep Neural networks through Interleaved Bit-Partitioned Arithmetic

no code implementations 27 Jun 2019 Soroush Ghodrati, Hardik Sharma, Sean Kinzer, Amir Yazdanbakhsh, Kambiz Samadi, Nam Sung Kim, Doug Burger, Hadi Esmaeilzadeh

The low-power potential of mixed-signal design makes it an alluring option to accelerate Deep Neural Networks (DNNs).

Hardware Architecture

Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation

1 code implementation ICLR 2020 Byung Hoon Ahn, Prannoy Pilligundla, Amir Yazdanbakhsh, Hadi Esmaeilzadeh

This solution, dubbed Chameleon, leverages reinforcement learning whose solution takes fewer steps to converge, and develops an adaptive sampling algorithm that not only focuses the costly samples (real hardware measurements) on representative points but also uses domain-knowledge-inspired logic to improve the samples themselves.
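
The adaptive-sampling idea, spending costly hardware measurements only on representative points, can be sketched with plain clustering: propose many candidate configurations, cluster them, and measure only one representative per cluster. The search space, cost function, and clustering choice below are illustrative assumptions, not Chameleon's algorithm.

```python
# Generic sketch of adaptive sampling for autotuning: cluster candidate
# configurations and spend expensive measurements only on cluster
# representatives (made-up knobs and cost, not Chameleon itself).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

def measure_on_hardware(cfg: np.ndarray) -> float:
    """Stand-in for a costly real-hardware measurement."""
    return float(np.sum((cfg - np.array([32, 8, 4])) ** 2))

# Candidate knob settings proposed by some search policy (e.g. an RL agent).
candidates = rng.integers(1, 64, size=(512, 3)).astype(float)

# Measure only k representative points instead of all 512 candidates.
k = 8
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(candidates)
reps = km.cluster_centers_.round()
costs = [measure_on_hardware(r) for r in reps]
print("best representative config:", reps[int(np.argmin(costs))])
```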

Apollo: Transferable Architecture Exploration

no code implementations 2 Feb 2021 Amir Yazdanbakhsh, Christof Angermueller, Berkin Akin, Yanqi Zhou, Albin Jones, Milad Hashemi, Kevin Swersky, Satrajit Chatterjee, Ravi Narayanaswami, James Laudon

We further show that by transferring knowledge between target architectures with different design constraints, Apollo is able to find optimal configurations faster and often with better objective value (up to 25% improvements).

Policy Optimization by Local Improvement through Search

no code implementations 25 Sep 2019 Jialin Song, Joe Wenjie Jiang, Amir Yazdanbakhsh, Ebrahim Songhori, Anna Goldie, Navdeep Jaitly, Azalia Mirhoseini

On the other end of the spectrum, approaches rooted in Policy Iteration, such as Dual Policy Iteration, do not choose next-step actions based on an expert, but instead use planning or search over the policy to choose an action distribution to train towards.

Imitation Learning reinforcement-learning +1

Accelerating Attention through Gradient-Based Learned Runtime Pruning

no code implementations 7 Apr 2022 Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang

To best utilize this mathematical innovation, we devise a bit-serial architecture, dubbed LeOPArd, for transformer language models with a bit-level early-termination microarchitectural mechanism.

Sentence

Sparse Attention Acceleration with Synergistic In-Memory Pruning and On-Chip Recomputation

no code implementations 1 Sep 2022 Amir Yazdanbakhsh, Ashkan Moradifirouzabadi, Zheng Li, Mingu Kang

The combined in-memory pruning and on-chip recompute of the relevant attention scores enables SPRINT to transform quadratic complexity to a merely linear one.
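
The prune-then-recompute idea can be mimicked in software: estimate attention scores, drop keys that fall below a threshold, and compute exact attention only over the survivors. The sketch below thresholds exact scores for simplicity; SPRINT's in-memory approximation and hardware mechanisms are not modeled.

```python
# Software sketch of prune-then-recompute attention: keep only keys whose
# score clears a threshold, then attend over the surviving subset.
import numpy as np

def pruned_attention(q, K, V, threshold=0.0):
    scores = K @ q / np.sqrt(q.shape[0])        # one query against all keys
    keep = np.flatnonzero(scores > threshold)   # pruned candidate set
    probs = np.exp(scores[keep] - scores[keep].max())
    probs /= probs.sum()
    return probs @ V[keep], keep.size

rng = np.random.default_rng(0)
d, seq = 64, 1024
q, K, V = rng.standard_normal(d), rng.standard_normal((seq, d)), rng.standard_normal((seq, d))
out, kept = pruned_attention(q, K, V, threshold=0.5)
print(f"attended over {kept}/{seq} keys")
```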

Text and Patterns: For Effective Chain of Thought, It Takes Two to Tango

no code implementations 16 Sep 2022 Aman Madaan, Amir Yazdanbakhsh

Our empirical and qualitative analysis reveals that a symbiotic relationship between text and patterns explains the success of few-shot prompting: text helps extract commonsense from the question to help patterns, and patterns enforce task understanding and direct text generation.

Code Completion counterfactual +1

Training Recipe for N:M Structured Sparsity with Decaying Pruning Mask

no code implementations 15 Sep 2022 Sheng-Chun Kao, Amir Yazdanbakhsh, Suvinay Subramanian, Shivani Agrawal, Utku Evci, Tushar Krishna

In this work, we focus on N:M sparsity and extensively study and evaluate various training recipes for N:M sparsity in terms of the trade-off between model accuracy and compute cost (FLOPs).
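
One plausible reading of a "decaying pruning mask", shown below as a hypothetical sketch rather than the paper's recipes: instead of hard-zeroing pruned weights from the first step, scale them by a factor that decays from 1 to 0 over training so gradients can still flow early on.

```python
# Hypothetical illustration of a decaying pruning mask: pruned weights are
# attenuated by a factor that decays from 1 to 0 over training, rather than
# being hard-zeroed immediately (not the paper's specific recipes).
import numpy as np

def decay_factor(step: int, total_steps: int) -> float:
    """Linearly decay the pruned weights' scale from 1.0 to 0.0."""
    return max(0.0, 1.0 - step / total_steps)

def soft_masked(weights: np.ndarray, mask: np.ndarray, step: int, total_steps: int) -> np.ndarray:
    d = decay_factor(step, total_steps)
    return weights * (mask + (1.0 - mask) * d)   # kept weights intact, pruned ones scaled by d

w = np.random.randn(8)
mask = (np.abs(w) >= np.sort(np.abs(w))[len(w) // 2]).astype(float)  # keep top half by magnitude
for step in (0, 500, 1000):
    print(step, soft_masked(w, mask, step, total_steps=1000).round(2))
```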

STEP: Learning N:M Structured Sparsity Masks from Scratch with Precondition

no code implementations 2 Feb 2023 Yucheng Lu, Shivani Agrawal, Suvinay Subramanian, Oleg Rybakov, Christopher De Sa, Amir Yazdanbakhsh

Recent innovations on hardware (e.g., Nvidia A100) have motivated learning N:M structured sparsity masks from scratch for fast model inference.

Machine Translation

DaCapo: Accelerating Continuous Learning in Autonomous Systems for Video Analytics

no code implementations 21 Mar 2024 Yoonsung Kim, Changhun Oh, Jinwoo Hwang, Wonung Kim, Seongryong Oh, Yubin Lee, Hardik Sharma, Amir Yazdanbakhsh, Jongse Park

Deep neural network (DNN) video analytics is crucial for autonomous systems such as self-driving vehicles, unmanned aerial vehicles (UAVs), and security robots.

Tao: Re-Thinking DL-based Microarchitecture Simulation

no code implementations 16 Apr 2024 Santosh Pandey, Amir Yazdanbakhsh, Hang Liu

Microarchitecture simulators are indispensable tools for microarchitecture designers to validate, estimate, and optimize new hardware that meets specific design requirements.

Transfer Learning
