Search Results for author: Massoud Pedram

Found 44 papers, 7 papers with code

Scalable Superconductor Neuron with Ternary Synaptic Connections for Ultra-Fast SNN Hardware

no code implementations • 26 Feb 2024 • Mustafa Altay Karamuftuoglu, Beyza Zeynep Ucpinar, Arash Fayyazi, Sasan Razmkhah, Mehdi Kamal, Massoud Pedram

A novel high-fan-in differential superconductor neuron structure designed for ultra-high-performance Spiking Neural Network (SNN) accelerators is presented.

4k • Efficient Neural Network

Memory-Efficient Vision Transformers: An Activation-Aware Mixed-Rank Compression Strategy

no code implementations • 8 Feb 2024 • Seyedarmin Azizi, Mahdi Nazemi, Massoud Pedram

This paper addresses this memory limitation by introducing an activation-aware model compression methodology that uses selective low-rank weight tensor approximations of different layers to reduce the parameter count of ViTs.

Model Compression
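
As a point of reference, the building block behind this kind of compression, replacing a dense weight matrix with a low-rank factorization, can be sketched in a few lines of NumPy. This is a generic truncated-SVD factorization with illustrative layer shapes and rank, not the paper's activation-aware, per-layer rank selection.

```python
import numpy as np

def low_rank_factorize(W: np.ndarray, rank: int) -> tuple[np.ndarray, np.ndarray]:
    """Split W (out_dim x in_dim) into two thin factors via truncated SVD."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]   # (out_dim x rank)
    B = Vt[:rank, :]             # (rank x in_dim)
    return A, B

# Toy example: an MLP weight with ViT-Base-like dimensions (assumed shapes).
rng = np.random.default_rng(0)
W = rng.standard_normal((768, 3072)).astype(np.float32)
A, B = low_rank_factorize(W, rank=64)
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"params: {W.size} -> {A.size + B.size}, relative error {rel_err:.3f}")
```

Replacing the layer's forward pass `x @ W.T` with `(x @ B.T) @ A.T` then trades approximation error for a smaller parameter count; the paper's methodology concerns choosing where and how aggressively to apply such approximations.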

Low-Precision Mixed-Computation Models for Inference on Edge

no code implementations • 3 Dec 2023 • Seyedarmin Azizi, Mahdi Nazemi, Mehdi Kamal, Massoud Pedram

This paper presents a mixed-computation neural network processing approach for edge applications that incorporates low-precision (low-width) Posit and low-precision fixed point (FixP) number systems.

Quantization
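
For intuition, the fixed-point (FixP) half of such a scheme can be illustrated with a plain symmetric quantizer; Posit arithmetic and the paper's assignment of number systems to different parts of the network are not shown, and the bit-widths below are arbitrary.

```python
import numpy as np

def fixp_quantize(x: np.ndarray, total_bits: int, frac_bits: int) -> np.ndarray:
    """Round x onto a signed fixed-point grid with `frac_bits` fractional bits."""
    scale = 2.0 ** frac_bits
    qmin, qmax = -(2 ** (total_bits - 1)), 2 ** (total_bits - 1) - 1
    q = np.clip(np.round(x * scale), qmin, qmax)   # stored integer code
    return q / scale                               # value actually used in computation

x = np.linspace(-2.0, 2.0, 9)
print(fixp_quantize(x, total_bits=3, frac_bits=1))  # note the top value saturates at +1.5
```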

Sensitivity-Aware Mixed-Precision Quantization and Width Optimization of Deep Neural Networks Through Cluster-Based Tree-Structured Parzen Estimation

no code implementations • 12 Aug 2023 • Seyedarmin Azizi, Mahdi Nazemi, Arash Fayyazi, Massoud Pedram

As a result, our proposed method represents a leap forward in neural network design optimization, paving the way for quick model design and implementation in settings with limited resources, thereby propelling the potential of scalable deep learning solutions.

Quantization
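
The search component can be pictured with a generic Tree-structured Parzen Estimator loop. The sketch below uses the Optuna library's TPE sampler and a made-up sensitivity/size proxy purely for illustration; it is not the paper's cluster-based estimator or its actual objective.

```python
import optuna

# Made-up per-layer sensitivity scores (higher = less tolerant of low precision).
SENSITIVITY = [0.9, 0.4, 0.7, 0.2]

def objective(trial: optuna.Trial) -> float:
    bits = [trial.suggest_int(f"bits_layer{i}", 2, 8) for i in range(len(SENSITIVITY))]
    acc_penalty = sum(s / b for s, b in zip(SENSITIVITY, bits))  # proxy accuracy loss
    footprint = sum(bits)                                        # proxy model size
    return acc_penalty + 0.05 * footprint

study = optuna.create_study(direction="minimize",
                            sampler=optuna.samplers.TPESampler(seed=0))
study.optimize(objective, n_trials=50)
print(study.best_params)  # tends to give the more sensitive layers higher bit-widths
```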

Brain Tumor Detection using Convolutional Neural Networks with Skip Connections

no code implementations • 14 Jul 2023 • Aupam Hamran, Marzieh Vaeztourshizi, Amirhossein Esmaili, Massoud Pedram

Different CNN architecture optimization techniques, such as widening and deepening the network and adding skip connections, are applied to improve the accuracy of the network.

A Fast Training-Free Compression Framework for Vision Transformers

1 code implementation • 4 Mar 2023 • Jung Hwan Heo, Arash Fayyazi, Mahdi Nazemi, Massoud Pedram

Token pruning has emerged as an effective solution to speed up the inference of large Transformer models.
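
As a rough illustration of token pruning in general (not this framework's specific scoring or its training-free calibration), one common recipe keeps the patch tokens that receive the most attention from the [CLS] token; the shapes below are assumptions.

```python
import torch

def prune_tokens(tokens: torch.Tensor, cls_attn: torch.Tensor, keep_ratio: float) -> torch.Tensor:
    """Keep the patch tokens that receive the most attention from the [CLS] token.

    tokens:   (batch, num_tokens, dim) patch tokens, [CLS] excluded
    cls_attn: (batch, num_tokens) attention weights from [CLS] to each patch token
    """
    num_keep = max(1, int(tokens.shape[1] * keep_ratio))
    idx = cls_attn.topk(num_keep, dim=1).indices                 # (batch, num_keep)
    idx = idx.unsqueeze(-1).expand(-1, -1, tokens.shape[-1])     # broadcast over feature dim
    return tokens.gather(1, idx)

tokens, cls_attn = torch.randn(2, 196, 768), torch.rand(2, 196)
print(prune_tokens(tokens, cls_attn, keep_ratio=0.5).shape)  # torch.Size([2, 98, 768])
```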

Efficient Compilation and Mapping of Fixed Function Combinational Logic onto Digital Signal Processors Targeting Neural Network Inference and Utilizing High-level Synthesis

no code implementations • 30 Jul 2022 • Soheil Nazar Shahsavani, Arash Fayyazi, Mahdi Nazemi, Massoud Pedram

Recent efforts for improving the performance of neural network (NN) accelerators that meet today's application requirements have given rise to a new trend of logic-based NN inference relying on fixed function combinational logic.

Sparse Periodic Systolic Dataflow for Lowering Latency and Power Dissipation of Convolutional Neural Network Accelerators

no code implementations • 30 Jun 2022 • Jung Hwan Heo, Arash Fayyazi, Amirhossein Esmaili, Massoud Pedram

This paper introduces the sparse periodic systolic (SPS) dataflow, which advances the state-of-the-art hardware accelerator for supporting lightweight neural networks.

A Fast and Efficient Conditional Learning for Tunable Trade-Off between Accuracy and Robustness

no code implementations • 28 Mar 2022 • Souvik Kundu, Sairam Sundaresan, Massoud Pedram, Peter A. Beerel

In this paper, we present a fast learnable once-for-all adversarial training (FLOAT) algorithm which, instead of the existing FiLM-based conditioning, presents a unique weight-conditioned learning that requires no additional layers, thereby incurring no significant increase in parameter count, training time, or network latency compared to standard adversarial training.

Image Classification

BMPQ: Bit-Gradient Sensitivity Driven Mixed-Precision Quantization of DNNs from Scratch

no code implementations • 24 Dec 2021 • Souvik Kundu, Shikai Wang, Qirui Sun, Peter A. Beerel, Massoud Pedram

Compared to the baseline FP-32 models, BMPQ can yield models that have 15.4x fewer parameter bits with a negligible drop in accuracy.

Quantization

Analyzing the Confidentiality of Undistillable Teachers in Knowledge Distillation

no code implementations • NeurIPS 2021 • Souvik Kundu, Qirui Sun, Yao Fu, Massoud Pedram, Peter Beerel

Knowledge distillation (KD) has recently been identified as a method that can unintentionally leak private information regarding the details of a teacher model to an unauthorized student.

Knowledge Distillation

HIRE-SNN: Harnessing the Inherent Robustness of Energy-Efficient Deep Spiking Neural Networks by Training with Crafted Input Noise

1 code implementation • ICCV 2021 • Souvik Kundu, Massoud Pedram, Peter A. Beerel

Low-latency deep spiking neural networks (SNNs) have become a promising alternative to conventional artificial neural networks (ANNs) because of their potential for increased energy efficiency on event-driven neuromorphic hardware.

Towards Low-Latency Energy-Efficient Deep SNNs via Attention-Guided Compression

no code implementations • 16 Jul 2021 • Souvik Kundu, Gourav Datta, Massoud Pedram, Peter A. Beerel

To evaluate the merits of our approach, we performed experiments with variants of VGG and ResNet, on both CIFAR-10 and CIFAR-100, and VGG16 on Tiny-ImageNet. The SNN models generated through the proposed technique yield SOTA compression ratios of up to 33.4x with no significant drops in accuracy compared to baseline unpruned counterparts.

Sparse Learning

NullaNet Tiny: Ultra-low-latency DNN Inference Through Fixed-function Combinational Logic

no code implementations • 7 Apr 2021 • Mahdi Nazemi, Arash Fayyazi, Amirhossein Esmaili, Atharva Khare, Soheil Nazar Shahsavani, Massoud Pedram

While there is a large body of research on efficient processing of deep neural networks (DNNs), ultra-low-latency realization of these models for applications with stringent, sub-microsecond latency requirements continues to be an unresolved, challenging problem.

A2P-MANN: Adaptive Attention Inference Hops Pruned Memory-Augmented Neural Networks

no code implementations • 24 Jan 2021 • Mohsen Ahmadzadeh, Mehdi Kamal, Ali Afzali-Kusha, Massoud Pedram

In this work, to limit the number of required attention inference hops in memory-augmented neural networks, we propose an online adaptive approach called A2P-MANN.

Question Answering

BRDS: An FPGA-based LSTM Accelerator with Row-Balanced Dual-Ratio Sparsification

no code implementations • 7 Jan 2021 • Seyed Abolfazl Ghasemzadeh, Erfan Bank Tavakoli, Mehdi Kamal, Ali Afzali-Kusha, Massoud Pedram

In this paper, first, a hardware-friendly pruning algorithm for reducing energy consumption and improving the speed of Long Short-Term Memory (LSTM) neural network accelerators is presented.

Sentiment Analysis • Sentiment Classification +2

A Tunable Robust Pruning Framework Through Dynamic Network Rewiring of DNNs

1 code implementation • 3 Nov 2020 • Souvik Kundu, Mahdi Nazemi, Peter A. Beerel, Massoud Pedram

This paper presents a dynamic network rewiring (DNR) method to generate pruned deep neural network (DNN) models that are robust against adversarial attacks yet maintain high accuracy on clean images.

Image Classification • Model Compression

SynergicLearning: Neural Network-Based Feature Extraction for Highly-Accurate Hyperdimensional Learning

no code implementations • 30 Jul 2020 • Mahdi Nazemi, Amirhossein Esmaili, Arash Fayyazi, Massoud Pedram

The proposed hybrid machine learning model has the same level of accuracy (i.e., ±1%) as NNs while achieving at least 10% improvement in accuracy compared to HD learning models.

BIG-bench Machine Learning • Computational Efficiency

Deep-PowerX: A Deep Learning-Based Framework for Low-Power Approximate Logic Synthesis

1 code implementation • 3 Jul 2020 • Ghasem Pasandi, Mackenzie Peterson, Moises Herrera, Shahin Nazarian, Massoud Pedram

This paper aims at integrating three powerful techniques, namely Deep Learning, Approximate Computing, and Low Power Design, into a strategy to optimize logic at the synthesis level.

NN-PARS: A Parallelized Neural Network Based Circuit Simulation Framework

no code implementations • 13 Feb 2020 • Mohammad Saeed Abrishami, Hao Ge, Justin F. Calderon, Massoud Pedram, Shahin Nazarian

The shrinking of transistor geometries, as well as the increasing complexity of integrated circuits, significantly aggravates nonlinear design behavior.

Scheduling

CSM-NN: Current Source Model Based Logic Circuit Simulation -- A Neural Network Approach

no code implementations • 13 Feb 2020 • Mohammad Saeed Abrishami, Massoud Pedram, Shahin Nazarian

The miniaturization of transistors down to 5 nm and beyond, together with the increasing complexity of integrated circuits, significantly aggravates short-channel effects and demands the analysis and optimization of more design corners and modes.

Efficient Training of Deep Convolutional Neural Networks by Augmentation in Embedding Space

no code implementations • 12 Feb 2020 • Mohammad Saeed Abrishami, Amir Erfan Eshratifar, David Eigen, Yanzhi Wang, Shahin Nazarian, Massoud Pedram

However, fine-tuning a transfer model with data augmentation in the raw input space has a high computational cost to run the full network for every augmented input.

Data Augmentation • Transfer Learning

Pre-defined Sparsity for Low-Complexity Convolutional Neural Networks

1 code implementation • 29 Jan 2020 • Souvik Kundu, Mahdi Nazemi, Massoud Pedram, Keith M. Chugg, Peter A. Beerel

We also compared the performance of our proposed architectures with that of ShuffleNet and MobileNetV2.

Runtime Deep Model Multiplexing for Reduced Latency and Energy Consumption Inference

no code implementations • 14 Jan 2020 • Amir Erfan Eshratifar, Massoud Pedram

The proposed algorithm allows the mobile device to detect the inputs that can be processed locally and the ones that require a larger model and should be sent to a cloud server.

Energy-aware Scheduling of Jobs in Heterogeneous Cluster Systems Using Deep Reinforcement Learning

no code implementations • 11 Dec 2019 • Amirhossein Esmaili, Massoud Pedram

Energy consumption is one of the most critical concerns in designing computing devices, ranging from portable embedded systems to computer cluster systems.

Management • Reinforcement Learning (RL) +1

BottleNet: A Deep Learning Architecture for Intelligent Mobile Cloud Computing Services

no code implementations • 4 Feb 2019 • Amir Erfan Eshratifar, Amirhossein Esmaili, Massoud Pedram

Recent studies have shown that the latency and energy consumption of deep neural networks can be significantly improved by splitting the network between the mobile device and the cloud.

Cloud Computing

Towards Collaborative Intelligence Friendly Architectures for Deep Learning

no code implementations • 1 Feb 2019 • Amir Erfan Eshratifar, Amirhossein Esmaili, Massoud Pedram

In this approach, referred to as collaborative intelligence, intermediate features computed on the mobile device are offloaded to the cloud instead of the raw input data of the network, reducing the size of the data needed to be sent to the cloud.

Distributed, Parallel, and Cluster Computing

Approximate Logic Synthesis: A Reinforcement Learning-Based Technology Mapping Approach

no code implementations • 1 Feb 2019 • Ghasem Pasandi, Shahin Nazarian, Massoud Pedram

Approximate Logic Synthesis (ALS) is the process of synthesizing and mapping a given Boolean network to a library of logic cells so that the magnitude/rate of error between outputs of the approximate and initial (exact) Boolean netlists is bounded from above by a predetermined total error threshold.

Hardware Architecture

Space Expansion of Feature Selection for Designing more Accurate Error Predictors

no code implementations • 30 Dec 2018 • Shayan Tabatabaei Nikkhah, Mehdi Kamal, Ali Afzali-Kusha, Massoud Pedram

The results on various benchmarks demonstrate significant improvements in the prediction accuracy compared to the prior works which used only the accelerator inputs for the prediction.

feature selection • Scheduling

Modeling Processor Idle Times in MPSoC Platforms to Enable Integrated DPM, DVFS, and Task Scheduling Subject to a Hard Deadline

1 code implementation • 19 Dec 2018 • Amirhossein Esmaili, Mahdi Nazemi, Massoud Pedram

Energy efficiency is one of the most critical design criteria for modern embedded systems such as multiprocessor system-on-chips (MPSoCs).

Operating Systems • Distributed, Parallel, and Cluster Computing

Gradient Agreement as an Optimization Objective for Meta-Learning

no code implementations • 18 Oct 2018 • Amir Erfan Eshratifar, David Eigen, Massoud Pedram

Therefore, the degree of the contribution of a task to the parameter updates is controlled by introducing a set of weights on the loss function of the tasks.

Meta-Learning

A Meta-Learning Approach for Custom Model Training

no code implementations • 21 Sep 2018 • Amir Erfan Eshratifar, Mohammad Saeed Abrishami, David Eigen, Massoud Pedram

Transfer-learning and meta-learning are two effective methods to apply knowledge learned from large data sources to new tasks.

Meta-Learning • Transfer Learning

NullaNet: Training Deep Neural Networks for Reduced-Memory-Access Inference

no code implementations • 23 Jul 2018 • Mahdi Nazemi, Ghasem Pasandi, Massoud Pedram

Deep neural networks have been successfully deployed in a wide variety of applications including computer vision and speech recognition.

Speech Recognition

Deploying Customized Data Representation and Approximate Computing in Machine Learning Applications

no code implementations • 3 Jun 2018 • Mahdi Nazemi, Massoud Pedram

Lop allows researchers and designers to quickly compare the quality of their models under various data representations and arithmetic operations in Python, and to contrast the hardware cost of viable representations by synthesizing them on their target platforms (e.g., FPGA or ASIC).

BIG-bench Machine Learning

A Hardware-Friendly Algorithm for Scalable Training and Deployment of Dimensionality Reduction Models on FPGA

no code implementations • 11 Jan 2018 • Mahdi Nazemi, Amir Erfan Eshratifar, Massoud Pedram

With ever-increasing application of machine learning models in various domains such as image classification, speech recognition and synthesis, and health care, designing efficient hardware for these models has gained a lot of popularity.

BIG-bench Machine Learning • Dimensionality Reduction +4

FFT-Based Deep Learning Deployment in Embedded Systems

no code implementations • 13 Dec 2017 • Sheng Lin, Ning Liu, Mahdi Nazemi, Hongjia Li, Caiwen Ding, Yanzhi Wang, Massoud Pedram

The large model size of DNNs, while providing excellent accuracy, also burdens the embedded platforms with intensive computation and storage.

Speech Recognition
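
The mathematical identity such FFT-based deployments rely on is the convolution theorem: convolution in the spatial or temporal domain becomes a pointwise product in the frequency domain. A minimal 1-D sanity check (not the paper's DNN deployment pipeline) looks like this:

```python
import numpy as np

def fft_circular_conv(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Circular 1-D convolution computed as a pointwise product in the FFT domain."""
    n = len(x)
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(w, n)))

rng = np.random.default_rng(0)
x = rng.standard_normal(16)
w = np.array([0.25, 0.5, 0.25])
# direct circular convolution for comparison
direct = np.array([sum(w[k] * x[(i - k) % len(x)] for k in range(len(w)))
                   for i in range(len(x))])
print(np.allclose(fft_circular_conv(x, w), direct))  # True
```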

High-Performance FPGA Implementation of Equivariant Adaptive Separation via Independence Algorithm for Independent Component Analysis

no code implementations • 6 Jul 2017 • Mahdi Nazemi, Shahin Nazarian, Massoud Pedram

Independent Component Analysis (ICA) is a dimensionality reduction technique that can boost the efficiency of machine learning models that deal with probability density functions, e.g., Bayesian neural networks.

BIG-bench Machine Learning • Dimensionality Reduction

HEBS: Histogram Equalization for Backlight Scaling

1 code implementation • 25 Oct 2007 • Ali Iranli, Hanif Fatemi, Massoud Pedram

In this paper, a method is proposed for finding a pixel transformation function that maximizes backlight dimming while maintaining a pre-specified image distortion level for a liquid crystal display.

Other Computer Science
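
The underlying trade-off can be sketched without the paper's histogram-equalization machinery: dimming the backlight by a factor b and multiplying pixel values by 1/b preserves perceived brightness until pixels saturate, and that saturation is the image distortion the method bounds. The image size and dimming levels below are arbitrary.

```python
import numpy as np

def compensate(img: np.ndarray, backlight: float) -> np.ndarray:
    """Boost pixel values to offset a dimmed backlight (0 < backlight <= 1).

    Perceived brightness is roughly backlight * displayed value, so pixels are
    scaled by 1/backlight; values that overflow the 8-bit range saturate, which
    is where image distortion comes from.
    """
    return np.clip(np.round(img.astype(np.float64) / backlight), 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
for b in (1.0, 0.8, 0.6):
    out = compensate(img, b)
    saturated = np.mean(out == 255)   # pixels stuck at the top of the 8-bit range
    print(f"backlight {b:.1f}: {saturated:.1%} of pixels saturate")
```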
