Search Results for author: Vahid Partovi Nia

Found 39 papers, 4 papers with code

Mitigating Outlier Activations in Low-Precision Fine-Tuning of Language Models

no code implementations14 Dec 2023 Alireza Ghaffari, Justin Yu, Mahsa Ghazvini Nejad, Masoud Asgharian, Boxing Chen, Vahid Partovi Nia

The benefit of using integers for outlier values is that it enables operator tiling, which avoids performing a full 16-bit integer matrix multiplication and thus addresses this problem effectively.
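
A minimal NumPy sketch of the general tiling idea described above, splitting a matrix multiplication into an int8 inlier tile and a small higher-precision tile for outlier channels; the threshold, the per-tensor scaling, and the int16 choice are illustrative assumptions, not the paper's exact scheme.

    import numpy as np

    def tiled_outlier_matmul(x, w, outlier_threshold=6.0):
        """x: (tokens, features) activations, w: (features, out) weights.
        Columns of x whose max magnitude exceeds the (assumed) threshold are outliers."""
        outlier_cols = np.max(np.abs(x), axis=0) > outlier_threshold
        inlier_cols = ~outlier_cols

        # Inlier tile: the bulk of the work stays in int8 with shared symmetric scales.
        xs = np.max(np.abs(x[:, inlier_cols])) / 127.0 + 1e-12
        ws = np.max(np.abs(w[inlier_cols, :])) / 127.0 + 1e-12
        xq = np.round(x[:, inlier_cols] / xs).astype(np.int8)
        wq = np.round(w[inlier_cols, :] / ws).astype(np.int8)
        inlier_out = (xq.astype(np.int32) @ wq.astype(np.int32)) * (xs * ws)

        # Outlier tile: only a few columns, kept at higher integer precision (int16 here).
        xo = np.round(x[:, outlier_cols]).astype(np.int16)
        outlier_out = xo.astype(np.int32) @ w[outlier_cols, :]

        return inlier_out + outlier_out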

Mathematical Challenges in Deep Learning

no code implementations24 Mar 2023 Vahid Partovi Nia, Guojun Zhang, Ivan Kobyzev, Michael R. Metel, Xinlin Li, Ke Sun, Sobhan Hemati, Masoud Asgharian, Linglong Kong, Wulong Liu, Boxing Chen

Deep models have dominated the artificial intelligence (AI) industry since the ImageNet challenge in 2012.

On the Convergence of Stochastic Gradient Descent in Low-precision Number Formats

no code implementations4 Jan 2023 Matteo Cacciola, Antonio Frangioni, Masoud Asgharian, Alireza Ghaffari, Vahid Partovi Nia

Deep learning models are dominating almost all artificial intelligence tasks such as vision, text, and speech processing.

Training Integer-Only Deep Recurrent Neural Networks

no code implementations22 Dec 2022 Vahid Partovi Nia, Eyyüb Sari, Vanessa Courville, Masoud Asgharian

Recurrent neural networks (RNNs) are the backbone of many text and speech applications.

Quantization

EuclidNets: An Alternative Operation for Efficient Inference of Deep Learning Models

no code implementations22 Dec 2022 Xinlin Li, Mariana Parazeres, Adam Oberman, Alireza Ghaffari, Masoud Asgharian, Vahid Partovi Nia

With the advent of deep learning applications on edge devices, researchers actively try to optimize their deployment on low-power, memory-restricted devices.

Quantization

KronA: Parameter Efficient Tuning with Kronecker Adapter

no code implementations20 Dec 2022 Ali Edalati, Marzieh Tahaei, Ivan Kobyzev, Vahid Partovi Nia, James J. Clark, Mehdi Rezagholizadeh

We apply the proposed methods to fine-tune T5 on the GLUE benchmark and show that incorporating the Kronecker-based modules can outperform state-of-the-art PET methods.

Language Modelling
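
A minimal PyTorch sketch of a Kronecker-product adapter in the spirit of the entry above: the frozen linear weight is updated by a low-parameter term kron(A, B). The factor shapes, initialization, and scaling below are assumptions for illustration, not KronA's exact design.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class KroneckerAdapterLinear(nn.Module):
        def __init__(self, base: nn.Linear, a1, a2, b1, b2, scale=1.0):
            super().__init__()
            out_f, in_f = base.weight.shape
            assert a1 * b1 == out_f and a2 * b2 == in_f, "factors must tile the weight shape"
            self.base = base
            for p in self.base.parameters():              # keep the pre-trained weights frozen
                p.requires_grad = False
            self.A = nn.Parameter(torch.zeros(a1, a2))    # zero init: adapter starts as a no-op
            self.B = nn.Parameter(torch.randn(b1, b2) * 0.01)
            self.scale = scale

        def forward(self, x):
            delta = torch.kron(self.A, self.B)            # (a1*b1, a2*b2) == base.weight.shape
            return self.base(x) + self.scale * F.linear(x, delta)

    # Example: adapt a 768x768 projection with 32x32 and 24x24 factors (~1.6k trainable parameters).
    layer = KroneckerAdapterLinear(nn.Linear(768, 768), a1=32, a2=32, b1=24, b2=24)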

SeKron: A Decomposition Method Supporting Many Factorization Structures

no code implementations12 Oct 2022 Marawan Gamal Abdel Hameed, Ali Mosleh, Marzieh S. Tahaei, Vahid Partovi Nia

We validate SeKron for model compression on both high-level and low-level computer vision tasks and find that it outperforms state-of-the-art decomposition methods.

Model Compression Tensor Decomposition

Towards Fine-tuning Pre-trained Language Models with Integer Forward and Backward Propagation

no code implementations20 Sep 2022 Mohammadreza Tayaranian, Alireza Ghaffari, Marzieh S. Tahaei, Mehdi Rezagholizadeh, Masoud Asgharian, Vahid Partovi Nia

Previously, researchers focused on lower-bit-width integer data types for the forward propagation of language models to save memory and computation.

DenseShift: Towards Accurate and Efficient Low-Bit Power-of-Two Quantization

1 code implementation ICCV 2023 Xinlin Li, Bang Liu, Rui Heng Yang, Vanessa Courville, Chao Xing, Vahid Partovi Nia

We further propose a sign-scale decomposition design to enhance training efficiency and a low-variance random initialization strategy to improve the model's transfer learning performance.

Quantization Transfer Learning
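
A minimal NumPy sketch of low-bit power-of-two ("shift") weight quantization with the sign kept separate from a shared scale, loosely in the spirit of the sign-scale decomposition mentioned above; the rounding rule and bit-width handling are assumptions for illustration only.

    import numpy as np

    def power_of_two_quantize(w, bits=3):
        """Map each weight to sign(w) * scale * 2^(-p) with p in {0, ..., 2^(bits-1) - 1}."""
        scale = np.max(np.abs(w)) + 1e-12                 # shared layer-wise scale
        sign = np.sign(w)
        p = np.clip(np.round(-np.log2(np.abs(w) / scale + 1e-12)), 0, 2 ** (bits - 1) - 1)
        return sign * scale * 2.0 ** (-p)

    w = np.array([0.8, -0.21, 0.05, -0.6])
    print(power_of_two_quantize(w))                       # -> [ 0.8 -0.2  0.1 -0.8]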

Rethinking Pareto Frontier for Performance Evaluation of Deep Neural Networks

no code implementations18 Feb 2022 Vahid Partovi Nia, Alireza Ghaffari, Mahdi Zolnouri, Yvon Savaria

We propose to use a multi-dimensional Pareto frontier to re-define the efficiency measure of candidate deep learning models, where several variables such as training cost, inference latency, and accuracy play a relative role in defining a dominant model.

Benchmarking Image Classification
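
A short Python sketch of the multi-dimensional Pareto-dominance idea described above, minimizing training cost and inference latency while maximizing accuracy; the candidate models and their numbers are made up.

    def dominates(a, b):
        """a dominates b if it is no worse on every axis and strictly better on at least one."""
        no_worse = a["cost"] <= b["cost"] and a["latency"] <= b["latency"] and a["acc"] >= b["acc"]
        better = a["cost"] < b["cost"] or a["latency"] < b["latency"] or a["acc"] > b["acc"]
        return no_worse and better

    def pareto_frontier(models):
        return [m for m in models if not any(dominates(o, m) for o in models)]

    candidates = [
        {"name": "model_a", "cost": 90, "latency": 22, "acc": 76.1},
        {"name": "model_b", "cost": 35, "latency": 6, "acc": 71.9},
        {"name": "model_c", "cost": 40, "latency": 9, "acc": 69.8},
    ]
    print([m["name"] for m in pareto_frontier(candidates)])   # model_c is dominated by model_b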

Demystifying and Generalizing BinaryConnect

no code implementations NeurIPS 2021 Tim Dockhorn, YaoLiang Yu, Eyyüb Sari, Mahdi Zolnouri, Vahid Partovi Nia

BinaryConnect (BC) and its many variations have become the de facto standard for neural network quantization.

Quantization
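
For readers unfamiliar with BinaryConnect, here is a minimal PyTorch sketch of its basic deterministic recipe: binarize latent real-valued weights in the forward pass, pass gradients straight through to the latent weights, and clip them to [-1, 1]; details such as stochastic binarization and scaling are omitted.

    import torch
    import torch.nn.functional as F

    class BinarizeSTE(torch.autograd.Function):
        @staticmethod
        def forward(ctx, w):
            return torch.sign(w)                           # forward: +/-1 weights
        @staticmethod
        def backward(ctx, grad_out):
            return grad_out                                # backward: straight-through estimator

    class BinaryLinear(torch.nn.Linear):
        def forward(self, x):
            wb = BinarizeSTE.apply(self.weight)            # binarized copy used for compute
            return F.linear(x, wb, self.bias)

    layer = BinaryLinear(16, 4)
    loss = layer(torch.randn(8, 16)).sum()
    loss.backward()                                        # gradients land on the latent float weights
    with torch.no_grad():
        layer.weight.clamp_(-1, 1)                         # BC-style clipping of the latent weights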

Kronecker Decomposition for GPT Compression

no code implementations ACL 2022 Ali Edalati, Marzieh Tahaei, Ahmad Rashid, Vahid Partovi Nia, James J. Clark, Mehdi Rezagholizadeh

GPT is an auto-regressive Transformer-based pre-trained language model which has attracted a lot of attention in the natural language processing (NLP) domain due to its state-of-the-art performance in several downstream tasks.

Knowledge Distillation Language Modelling +1

Convolutional Neural Network Compression through Generalized Kronecker Product Decomposition

no code implementations29 Sep 2021 Marawan Gamal Abdel Hameed, Marzieh S. Tahaei, Ali Mosleh, Vahid Partovi Nia

Modern Convolutional Neural Network (CNN) architectures, despite their superiority in solving various problems, are generally too large to be deployed on resource-constrained edge devices.

Image Classification Knowledge Distillation +1

iRNN: Integer-only Recurrent Neural Network

no code implementations20 Sep 2021 Eyyüb Sari, Vanessa Courville, Vahid Partovi Nia

Deploying RNNs that include layer normalization and attention on integer-only arithmetic is still an open problem.

Automatic Speech Recognition (ASR) +2
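
A minimal sketch of what "integer-only" means in practice, using layer normalization as an example: all statistics are computed with integer operations and an integer square root. The fixed-point output scale and bit-widths are assumptions, and this is not the iRNN paper's exact formulation.

    import math

    def int_layernorm(x_q, out_scale=256):
        """x_q: list of integer activations; returns integers approximating out_scale * (x - mean) / std."""
        n = len(x_q)
        mean = sum(x_q) // n
        centered = [v - mean for v in x_q]
        var = sum(c * c for c in centered) // n
        std = max(math.isqrt(var), 1)                      # integer square root, guard against zero variance
        return [(c * out_scale) // std for c in centered]

    print(int_layernorm([12, -4, 30, 2]))                  # -> [42, -299, 426, -171]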

$S^3$: Sign-Sparse-Shift Reparametrization for Effective Training of Low-bit Shift Networks

1 code implementation NeurIPS 2021 Xinlin Li, Bang Liu, YaoLiang Yu, Wulong Liu, Chunjing Xu, Vahid Partovi Nia

Shift neural networks reduce computation complexity by removing expensive multiplication operations and quantizing continuous weights into low-bit discrete values, which are fast and energy efficient compared to conventional neural networks.
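
A tiny Python sketch of the inference-side idea behind shift networks: a weight stored as a (sign, shift) pair multiplies an integer activation with a bit shift instead of a multiplier. The encoding below is only an illustration, not the sign-sparse-shift training reparametrization itself.

    def shift_mac(acts, signs, shifts):
        """Accumulate sum_i sign_i * (acts_i >> shift_i), i.e. weights restricted to {+-2^-p}."""
        acc = 0
        for a, s, p in zip(acts, signs, shifts):
            acc += s * (a >> p) if a >= 0 else -s * ((-a) >> p)   # shift the magnitude, keep the sign
        return acc

    print(shift_mac(acts=[120, -64, 36], signs=[+1, -1, +1], shifts=[0, 1, 2]))   # 120 + 32 + 9 = 161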

Tensor train decompositions on recurrent networks

no code implementations9 Jun 2020 Alejandro Murua, Ramchalam Ramakrishnan, Xinlin Li, Rui Heng Yang, Vahid Partovi Nia

Recurrent neural networks (RNNs) such as long short-term memory (LSTM) networks are essential in a multitude of daily life tasks such as speech, language, video, and multimodal learning.

A Causal Direction Test for Heterogeneous Populations

no code implementations8 Jun 2020 Vahid Partovi Nia, Xinlin Li, Masoud Asgharian, Shoubo Hu, Zhitang Chen, Yanhui Geng

Our simulation results show that the proposed adjustment significantly improves the performance of the causal direction test statistic for heterogeneous data.

Clustering Decision Making

Batch Normalization in Quantized Networks

no code implementations29 Apr 2020 Eyyüb Sari, Vahid Partovi Nia

Implementation of quantized neural networks on computing hardware leads to considerable speed-up and memory savings.

Importance of Data Loading Pipeline in Training Deep Neural Networks

no code implementations21 Apr 2020 Mahdi Zolnouri, Xinlin Li, Vahid Partovi Nia

Training large-scale deep neural networks is a long, time-consuming operation, often requiring many GPUs to accelerate it.

Data Augmentation

Qini-based Uplift Regression

no code implementations28 Nov 2019 Mouloud Belbahri, Alejandro Murua, Olivier Gandouet, Vahid Partovi Nia

We introduce a Qini-based uplift regression model to analyze a large insurance company's retention marketing campaign.

Marketing regression
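
For context on the Qini measure used in this entry, here is a short Python sketch of a Qini curve: cumulative incremental responses when targeting the top-k customers ranked by predicted uplift. The toy records (predicted_uplift, treated, responded) are made up.

    def qini_curve(records):
        records = sorted(records, key=lambda r: r[0], reverse=True)   # highest predicted uplift first
        resp_t = resp_c = n_t = n_c = 0
        curve = [0.0]
        for _, treated, responded in records:
            if treated:
                n_t += 1
                resp_t += responded
            else:
                n_c += 1
                resp_c += responded
            # treatment responses minus control responses rescaled to the treated group size
            curve.append(resp_t - resp_c * (n_t / n_c) if n_c else float(resp_t))
        return curve

    toy = [(0.9, 1, 1), (0.7, 0, 0), (0.5, 1, 1), (0.3, 0, 1), (0.1, 1, 0)]
    print(qini_curve(toy))   # -> [0.0, 1.0, 1.0, 2.0, 1.0, 0.5]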

Random Bias Initialization Improves Quantized Training

no code implementations30 Sep 2019 Xinlin Li, Vahid Partovi Nia

Binary neural networks improve the computational efficiency of deep models by a large margin.

Adaptive Binary-Ternary Quantization

no code implementations26 Sep 2019 Ryan Razani, Grégoire Morin, Vahid Partovi Nia, Eyyüb Sari

Ternary quantization provides a more flexible model and outperforms binary quantization in terms of accuracy; however, it doubles the memory footprint and increases the computational cost.

Autonomous Vehicles Image Classification +1
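
A minimal NumPy sketch contrasting a binary and a ternary weight quantizer, to make the trade-off above concrete; the threshold rule (a fixed fraction of the mean magnitude) is a common heuristic chosen here for illustration, not necessarily the one used in the paper.

    import numpy as np

    def binary_quantize(w):
        alpha = np.mean(np.abs(w))                         # layer-wise scale
        return alpha * np.sign(w)                          # 1 bit per weight: {-alpha, +alpha}

    def ternary_quantize(w, delta_ratio=0.7):
        delta = delta_ratio * np.mean(np.abs(w))           # dead zone around zero
        mask = np.abs(w) > delta
        alpha = np.mean(np.abs(w[mask])) if mask.any() else 0.0
        return alpha * np.sign(w) * mask                   # 2 bits per weight: {-alpha, 0, +alpha}

    w = np.array([0.9, -0.05, 0.4, -0.7])
    print(binary_quantize(w))    # all weights snapped to +-0.5125
    print(ternary_quantize(w))   # the small weight -0.05 is zeroed, the rest snapped to roughly +-0.667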

Random Bias Initialization Improving Binary Neural Network Training

no code implementations25 Sep 2019 Xinlin Li, Vahid Partovi Nia

Edge intelligence, especially binary neural networks (BNNs), has recently attracted considerable attention from the artificial intelligence community.

Smart Ternary Quantization

no code implementations25 Sep 2019 Gregoire Morin, Ryan Razani, Vahid Partovi Nia, Eyyub Sari

Low-bit quantization, such as binary and ternary quantization, is a common approach to alleviate these resource requirements.

Image Classification Quantization

How Does Batch Normalization Help Binary Training?

no code implementations18 Sep 2019 Eyyüb Sari, Mouloud Belbahri, Vahid Partovi Nia

Binary Neural Networks (BNNs) are difficult to train and suffer from a drop in accuracy.

Quantization

BNN+: Improved Binary Network Training

no code implementations ICLR 2019 Sajad Darabi, Mouloud Belbahri, Matthieu Courbariaux, Vahid Partovi Nia

Binary neural networks (BNNs) help to alleviate the prohibitive resource requirements of DNNs by limiting both activations and weights to 1 bit.

Active Learning for High-Dimensional Binary Features

no code implementations5 Feb 2019 Ali Vahdat, Mouloud Belbahri, Vahid Partovi Nia

An erbium-doped fiber amplifier (EDFA) is an optical amplifier/repeater device used to boost the intensity of optical signals carried through a fiber-optic communication system.

Active Learning Management +1

Activation Adaptation in Neural Networks

no code implementations28 Jan 2019 Farnoush Farhadi, Vahid Partovi Nia, Andrea Lodi

Given the activation function, the neural network is trained over the bias and the weight parameters.

Foothill: A Quasiconvex Regularization for Edge Computing of Deep Neural Networks

no code implementations18 Jan 2019 Mouloud Belbahri, Eyyüb Sari, Sajad Darabi, Vahid Partovi Nia

Using a quasiconvex base function to construct a binary quantizer helps in training binary neural networks (BNNs), and adding noise to the input data or using a concrete regularization function helps to improve the generalization error.

Edge-computing General Classification +4

Regularized Binary Network Training

1 code implementation ICLR 2019 Sajad Darabi, Mouloud Belbahri, Matthieu Courbariaux, Vahid Partovi Nia

We propose to improve the binary training method by introducing a new regularization function that encourages training weights around binary values.
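
A minimal PyTorch sketch of such a regularizer, here simply penalizing the L1 distance of each weight magnitude from a scale alpha so that the penalty vanishes only at +-alpha; the exact functional form and the coefficient lambda_reg are assumptions for illustration.

    import torch

    def binary_regularizer(weights, alpha=1.0):
        """Sum of || |W| - alpha ||_1 over layers; zero exactly when every weight is +-alpha."""
        return sum((w.abs() - alpha).abs().sum() for w in weights)

    # Usage inside a training step (model, task_loss, and lambda_reg are hypothetical):
    # loss = task_loss + lambda_reg * binary_regularizer(p for p in model.parameters() if p.dim() > 1)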

Causal Inference and Mechanism Clustering of A Mixture of Additive Noise Models

1 code implementation NeurIPS 2018 Shoubo Hu, Zhitang Chen, Vahid Partovi Nia, Laiwan Chan, Yanhui Geng

The inference of the causal relationship between a pair of observed variables is a fundamental problem in science, and most existing approaches are based on one single causal model.

Causal Inference Clustering
