1 code implementation • ICLR 2022 • Aviral Kumar, Amir Yazdanbakhsh, Milad Hashemi, Kevin Swersky, Sergey Levine
An alternative paradigm is a "data-driven", offline approach that uses logged simulation data to architect hardware accelerators, without requiring any further simulation.
1 code implementation • 20 Feb 2021 • Kiran Seshadri, Berkin Akin, James Laudon, Ravi Narayanaswami, Amir Yazdanbakhsh
Then, we extensively evaluate three classes of Edge TPUs, covering different computing ecosystems, that are either currently deployed in Google products or are in the product pipeline, across 423K unique convolutional neural networks.
2 code implementations • NeurIPS 2023 • Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Shashank Gupta, Bodhisattwa Prasad Majumder, Katherine Hermann, Sean Welleck, Amir Yazdanbakhsh, Peter Clark
Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback and refinement.
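The iterative feedback-and-refinement loop described above can be sketched as follows. This is an illustrative toy, not the paper's implementation; the `model` callable is a hypothetical stand-in for an LLM API call, and the `DONE` stop token is an assumption for the sketch.

```python
# Toy sketch of a Self-Refine-style loop: the same model generates an output,
# critiques it, and refines it using its own feedback, until the critique
# signals the answer is good enough or an iteration budget is exhausted.

def self_refine(model, task_prompt, max_iters=3, stop_token="DONE"):
    """Iteratively improve an initial output using the model's own feedback."""
    output = model(f"Task: {task_prompt}\nAnswer:")
    for _ in range(max_iters):
        feedback = model(
            f"Task: {task_prompt}\nAnswer: {output}\nCritique this answer:"
        )
        if stop_token in feedback:  # model judges the answer good enough
            break
        output = model(
            f"Task: {task_prompt}\nAnswer: {output}\n"
            f"Feedback: {feedback}\nImproved answer:"
        )
    return output
```

The key design point is that generator, critic, and refiner are one and the same model, distinguished only by prompting.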
1 code implementation • 27 Apr 2023 • Joo Hyung Lee, Wonpyo Park, Nicole Mitchell, Jonathan Pilault, Johan Obando-Ceron, Han-Byul Kim, Namhoon Lee, Elias Frantar, Yun Long, Amir Yazdanbakhsh, Shivani Agrawal, Suvinay Subramanian, Xin Wang, Sheng-Chun Kao, Xingyao Zhang, Trevor Gale, Aart Bik, Woohyun Han, Milen Ferev, Zhonglin Han, Hong-Seok Kim, Yann Dauphin, Gintare Karolina Dziugaite, Pablo Samuel Castro, Utku Evci
This paper introduces JaxPruner, an open-source JAX-based pruning and sparse training library for machine learning research.
2 code implementations • 15 Feb 2023 • Alexander Shypula, Aman Madaan, Yimeng Zeng, Uri Alon, Jacob Gardner, Milad Hashemi, Graham Neubig, Parthasarathy Ranganathan, Osbert Bastani, Amir Yazdanbakhsh
Next, we propose a broad range of adaptation strategies for code optimization; for prompting, these include retrieval-based few-shot prompting and chain-of-thought, and for finetuning, these include performance-conditioned generation and synthetic data augmentation based on self-play.
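Retrieval-based few-shot prompting for code optimization can be sketched as below. This is a hypothetical illustration, not the paper's pipeline: the token-overlap similarity and the prompt layout are assumptions made for the sketch.

```python
# Toy sketch of retrieval-based few-shot prompting for code optimization:
# retrieve the (slow, fast) program pairs most similar to the query program
# and prepend them as in-context examples. Similarity is naive token overlap.

def retrieve_examples(query, bank, k=2):
    def overlap(a, b):
        return len(set(a.split()) & set(b.split()))
    return sorted(bank, key=lambda pair: overlap(query, pair[0]), reverse=True)[:k]

def build_prompt(query, bank):
    shots = retrieve_examples(query, bank)
    demo = "\n".join(f"# slow:\n{s}\n# fast:\n{f}" for s, f in shots)
    return f"{demo}\n# slow:\n{query}\n# fast:\n"
```

In practice the retriever would use embeddings rather than token overlap; the structure of the prompt is what matters here.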
1 code implementation • 8 Oct 2022 • Ondrej Sykora, Phitchaya Mangpo Phothilimthana, Charith Mendis, Amir Yazdanbakhsh
In this paper, we introduce GRANITE, a new machine learning model that estimates the throughput of basic blocks across different microarchitectures.
1 code implementation • 7 Feb 2024 • Abhimanyu Rajeshkumar Bambhaniya, Amir Yazdanbakhsh, Suvinay Subramanian, Sheng-Chun Kao, Shivani Agrawal, Utku Evci, Tushar Krishna
In this work, we study the effectiveness of existing sparse training recipes at high-sparsity regions and argue that these methods fail to sustain model quality on par with low-sparsity regions.
no code implementations • 10 May 2018 • Amir Yazdanbakhsh, Hajar Falahati, Philip J. Wolfe, Kambiz Samadi, Nam Sung Kim, Hadi Esmaeilzadeh
Even though there is a convolution stage in this operator, the inserted zeros lead to underutilization of the compute resources when a conventional convolution accelerator is employed.
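The underutilization comes from the zero-insertion step of the transposed (de)convolution: the input is upsampled by interleaving zeros before a regular convolution runs, so many multiply-accumulates are spent on zero operands. A minimal 1-D sketch:

```python
import numpy as np

# Sketch of the zero-insertion step in a stride-2 transposed convolution:
# (stride - 1) zeros are interleaved between input samples, after which a
# conventional convolution wastes a large fraction of its MACs on zeros.

def insert_zeros_1d(x, stride=2):
    """Upsample a 1-D signal by inserting (stride - 1) zeros between samples."""
    out = np.zeros(len(x) * stride - (stride - 1), dtype=x.dtype)
    out[::stride] = x
    return out

x = np.array([1, 2, 3])
up = insert_zeros_1d(x)          # zeros now sit between the original samples
zero_frac = np.mean(up == 0)     # fraction of operands that are guaranteed zero
```

For stride 2, nearly half of all operands fed to a conventional accelerator are structurally zero, which is the inefficiency the paper targets.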
no code implementations • 5 Nov 2018 • Ahmed T. Elthakeb, Prannoy Pilligundla, FatemehSadat Mireshghallah, Amir Yazdanbakhsh, Hadi Esmaeilzadeh
We show how ReLeQ can balance speed and quality, and provide an asymmetric general solution for quantization of a large variety of deep networks (AlexNet, CIFAR-10, LeNet, MobileNet-V1, ResNet-20, SVHN, and VGG-11) that virtually preserves the accuracy (≤ 0.3% loss) while minimizing the computation and storage cost.
no code implementations • 27 Jun 2019 • Soroush Ghodrati, Hardik Sharma, Sean Kinzer, Amir Yazdanbakhsh, Kambiz Samadi, Nam Sung Kim, Doug Burger, Hadi Esmaeilzadeh
The low-power potential of mixed-signal design makes it an alluring option for accelerating Deep Neural Networks (DNNs).
Hardware Architecture
no code implementations • ICLR 2020 • Amir Yazdanbakhsh, Ebrahim Songhori, Robert Ormandi, Anna Goldie, Azalia Mirhoseini
In our experiments, we use PPO as our baseline policy optimization algorithm.
1 code implementation • ICLR 2020 • Byung Hoon Ahn, Prannoy Pilligundla, Amir Yazdanbakhsh, Hadi Esmaeilzadeh
This solution, dubbed Chameleon, leverages reinforcement learning, which converges in fewer steps, and develops an adaptive sampling algorithm that not only focuses the costly samples (real hardware measurements) on representative points but also uses domain-knowledge-inspired logic to improve the samples themselves.
no code implementations • 1 Jan 2021 • Yanqi Zhou, Xuanyi Dong, Daiyi Peng, Ethan Zhu, Amir Yazdanbakhsh, Berkin Akin, Mingxing Tan, James Laudon
In this paper, we study the importance of co-designing neural architectures and hardware accelerators.
no code implementations • 2 Feb 2021 • Amir Yazdanbakhsh, Christof Angermueller, Berkin Akin, Yanqi Zhou, Albin Jones, Milad Hashemi, Kevin Swersky, Satrajit Chatterjee, Ravi Narayanaswami, James Laudon
We further show that by transferring knowledge between target architectures with different design constraints, Apollo is able to find optimal configurations faster and often with better objective value (up to 25% improvements).
no code implementations • 17 Feb 2021 • Yanqi Zhou, Xuanyi Dong, Berkin Akin, Mingxing Tan, Daiyi Peng, Tianjian Meng, Amir Yazdanbakhsh, Da Huang, Ravi Narayanaswami, James Laudon
In our work, we target the optimization of hardware and software configurations on an industry-standard edge accelerator.
no code implementations • 13 Jul 2021 • Sheng-Chun Kao, Suvinay Subramanian, Gaurav Agrawal, Amir Yazdanbakhsh, Tushar Krishna
In contrast, FLAT unblocks transformer models for inputs with up to 64K elements.
no code implementations • 25 Sep 2019 • Jialin Song, Joe Wenjie Jiang, Amir Yazdanbakhsh, Ebrahim Songhori, Anna Goldie, Navdeep Jaitly, Azalia Mirhoseini
On the other end of the spectrum, approaches rooted in Policy Iteration, such as Dual Policy Iteration, do not choose next-step actions based on an expert, but instead use planning or search over the policy to choose an action distribution to train towards.
no code implementations • 7 Apr 2022 • Zheng Li, Soroush Ghodrati, Amir Yazdanbakhsh, Hadi Esmaeilzadeh, Mingu Kang
To best utilize this mathematical innovation, we devise a bit-serial architecture, dubbed LeOPArd, for transformer language models with a bit-level early-termination microarchitectural mechanism.
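The idea of bit-level early termination can be illustrated in software. This is a toy sketch, not LeOPArd's microarchitecture: keys are unsigned integers processed one bit-plane at a time (MSB first), and the accumulation stops once even the best-case remaining contribution cannot reach a pruning threshold. The bound used here assumes nonnegative query weights.

```python
import numpy as np

# Toy sketch of a bit-serial dot product with early termination: accumulate
# the contribution of each bit-plane of the key vector from MSB to LSB, and
# stop once the running upper bound falls below the pruning threshold.

def bit_serial_score(q, k, bits=4, threshold=10):
    """q: nonnegative int weights; k: unsigned `bits`-bit int key vector.
    Returns (partial_score, terminated_early)."""
    acc = 0
    for p in range(bits - 1, -1, -1):          # MSB plane first
        plane = (k >> p) & 1                   # current bit of every element
        acc += (1 << p) * int(q @ plane)
        # upper bound if all remaining lower-order bits were 1
        bound = acc + ((1 << p) - 1) * int(q.sum())
        if bound < threshold:
            return acc, True                   # score pruned early
    return acc, False
```

Because high-order bits dominate the score, low-relevance keys can be rejected after only a few bit-serial cycles instead of a full-precision multiply.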
no code implementations • 1 Sep 2022 • Amir Yazdanbakhsh, Ashkan Moradifirouzabadi, Zheng Li, Mingu Kang
The combined in-memory pruning and on-chip recompute of the relevant attention scores enables SPRINT to transform quadratic complexity to a merely linear one.
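The quadratic-to-linear reduction can be illustrated with a software sketch (not the SPRINT hardware design itself): approximate scores select the top-k keys per query, and exact scores are then recomputed only for those survivors, so per-query work scales with k rather than with the full sequence length.

```python
import numpy as np

# Illustrative sketch of prune-then-recompute attention: keep only the top-k
# keys per query by (approximate) score, then recompute exact scores and
# softmax weights over just those k keys -- O(n*k) instead of O(n^2) work.

def pruned_attention_scores(q, K, k=2):
    scores = K @ q                       # stand-in for approximate in-memory scoring
    keep = np.argsort(scores)[-k:]       # indices of the k most relevant keys
    exact = K[keep] @ q                  # "recompute" exact scores for survivors only
    weights = np.exp(exact - exact.max())  # numerically stable softmax
    return keep, weights / weights.sum()
```

In the real design the first pass runs approximately in memory and only the surviving scores are recomputed on-chip; this sketch uses exact scores for both passes for simplicity.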
no code implementations • 16 Sep 2022 • Aman Madaan, Amir Yazdanbakhsh
Our empirical and qualitative analysis reveals that a symbiotic relationship between text and patterns explains the success of few-shot prompting: text helps extract commonsense from the question to help patterns, and patterns enforce task understanding and direct text generation.
no code implementations • 15 Sep 2022 • Sheng-Chun Kao, Amir Yazdanbakhsh, Suvinay Subramanian, Shivani Agrawal, Utku Evci, Tushar Krishna
In this work, we focus on N:M sparsity and extensively study and evaluate various training recipes for N:M sparsity in terms of the trade-off between model accuracy and compute cost (FLOPs).
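N:M structured sparsity means that in every contiguous group of M weights at most N are nonzero. A minimal magnitude-based sketch of the 2:4 case (an illustration of the sparsity pattern, not any particular training recipe from the paper):

```python
import numpy as np

# Minimal sketch of N:M structured sparsity (here 2:4): in every contiguous
# group of M=4 weights, keep the N=2 largest-magnitude entries, zero the rest.

def nm_prune(w, n=2, m=4):
    w = w.reshape(-1, m).copy()
    # indices of the (m - n) smallest-magnitude weights in each group
    drop = np.argsort(np.abs(w), axis=1)[:, : m - n]
    np.put_along_axis(w, drop, 0.0, axis=1)
    return w.reshape(-1)

w = np.array([0.1, -0.5, 0.3, -0.05, 1.0, 0.2, -0.7, 0.4])
```

The fixed per-group pattern is what hardware such as sparse tensor cores exploits: the position metadata per group is small and the compute skips the zeros deterministically.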
no code implementations • 2 Feb 2023 • Yucheng Lu, Shivani Agrawal, Suvinay Subramanian, Oleg Rybakov, Christopher De Sa, Amir Yazdanbakhsh
Recent innovations in hardware (e.g., Nvidia A100) have motivated learning N:M structured sparsity masks from scratch for fast model inference.
no code implementations • 13 Dec 2023 • Shaojin Ding, David Qiu, David Rim, Yanzhang He, Oleg Rybakov, Bo Li, Rohit Prabhavalkar, Weiran Wang, Tara N. Sainath, Zhonglin Han, Jian Li, Amir Yazdanbakhsh, Shivani Agrawal
We conducted extensive experiments with a 2-billion parameter USM on a large-scale voice search dataset to evaluate our proposed method.
Automatic Speech Recognition (ASR) +3
no code implementations • 21 Mar 2024 • Yoonsung Kim, Changhun Oh, Jinwoo Hwang, Wonung Kim, Seongryong Oh, Yubin Lee, Hardik Sharma, Amir Yazdanbakhsh, Jongse Park
Deep neural network (DNN) video analytics is crucial for autonomous systems such as self-driving vehicles, unmanned aerial vehicles (UAVs), and security robots.
no code implementations • 16 Apr 2024 • Santosh Pandey, Amir Yazdanbakhsh, Hang Liu
Microarchitecture simulators are indispensable tools for microarchitecture designers to validate, estimate, and optimize new hardware that meets specific design requirements.