Search Results for author: Ali Ghodsi

Found 72 papers, 17 papers with code

RW-KD: Sample-wise Loss Terms Re-Weighting for Knowledge Distillation

no code implementations Findings (EMNLP) 2021 Peng Lu, Abbas Ghaddar, Ahmad Rashid, Mehdi Rezagholizadeh, Ali Ghodsi, Philippe Langlais

Knowledge Distillation (KD) is extensively used in Natural Language Processing to compress the pre-training and task-specific fine-tuning phases of large neural language models.

Knowledge Distillation

Learning Chemotherapy Drug Action via Universal Physics-Informed Neural Networks

no code implementations11 Apr 2024 Lena Podina, Ali Ghodsi, Mohammad Kohandel

Quantitative systems pharmacology (QSP) is widely used to assess drug effects and toxicity before a drug goes to clinical trials.

Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling

no code implementations28 Feb 2024 Mahdi Karami, Ali Ghodsi

In the rapidly evolving landscape of deep learning, the quest for models that balance expressivity with computational efficiency has never been more critical.

Computational Efficiency Image Classification +2

WERank: Towards Rank Degradation Prevention for Self-Supervised Learning Using Weight Regularization

no code implementations14 Feb 2024 Ali Saheb Pasand, Reza Moravej, Mahdi Biparva, Ali Ghodsi

A common phenomenon limiting representation quality in Self-Supervised Learning (SSL) is dimensional collapse (also known as rank degeneration), where the learned representations are mapped to a low-dimensional subspace of the representation space.

Data Augmentation Self-Supervised Learning
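
As context for the dimensional collapse this abstract describes, here is a minimal diagnostic sketch in Python (an illustration, not the paper's WERank regularizer): it estimates the effective rank of a batch of representations from the entropy of their normalized singular values.

```python
import numpy as np

def effective_rank(Z: np.ndarray) -> float:
    """Effective rank of a (batch, dim) representation matrix.

    Computed as exp(entropy) of the normalized singular-value
    distribution; values far below `dim` indicate dimensional
    collapse (rank degeneration).
    """
    Z = Z - Z.mean(axis=0, keepdims=True)   # center the batch
    s = np.linalg.svd(Z, compute_uv=False)  # singular values
    p = s / s.sum()                         # normalize to a distribution
    entropy = -np.sum(p * np.log(p + 1e-12))
    return float(np.exp(entropy))

# Example: collapsed representations occupy only a few directions.
rng = np.random.default_rng(0)
full = rng.normal(size=(512, 128))
collapsed = rng.normal(size=(512, 4)) @ rng.normal(size=(4, 128))
print(effective_rank(full), effective_rank(collapsed))  # high vs. roughly 4
```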

Scalable Graph Self-Supervised Learning

no code implementations14 Feb 2024 Ali Saheb Pasand, Reza Moravej, Mahdi Biparva, Raika Karimi, Ali Ghodsi

Our experiments demonstrate that the cost associated with the loss computation can be reduced via node or dimension sampling without lowering the downstream performance.

Self-Supervised Learning
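
The cost reduction this abstract mentions can be illustrated with a hedged sketch: a decorrelation-style SSL loss evaluated on only a random subset of embedding dimensions. The sampling scheme and loss form here are illustrative assumptions, not the paper's exact estimator.

```python
import torch

def sampled_correlation_loss(z1, z2, num_dims=64):
    """Decorrelation-style SSL loss on a random subset of dimensions.

    z1, z2: (batch, dim) embeddings of two views. Sampling dimensions
    shrinks the correlation matrix from (dim x dim) to
    (num_dims x num_dims), cutting the cost of the loss computation.
    """
    idx = torch.randperm(z1.shape[1])[:num_dims]  # dimension sampling
    z1, z2 = z1[:, idx], z2[:, idx]
    z1 = (z1 - z1.mean(0)) / (z1.std(0) + 1e-6)
    z2 = (z2 - z2.mean(0)) / (z2.std(0) + 1e-6)
    c = (z1.T @ z2) / z1.shape[0]                 # cross-correlation matrix
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()
    return on_diag + 5e-3 * off_diag
```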

Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference

no code implementations16 Sep 2023 Parsa Kavehzadeh, Mojtaba Valipour, Marzieh Tahaei, Ali Ghodsi, Boxing Chen, Mehdi Rezagholizadeh

We extend SortedNet to generative NLP tasks, making large language models dynamic without any Pre-Training and by only replacing Standard Fine-Tuning (SFT) with Sorted Fine-Tuning (SoFT).

Instruction Following Question Answering +1
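
A minimal sketch of the sorted-training idea as this abstract reads: one shared model whose task loss is applied at several depths, so truncated sub-models remain usable at inference. The layer counts, exit points, and shared output head are assumptions for illustration, not the paper's exact recipe.

```python
import torch
import torch.nn as nn

class SortedDecoder(nn.Module):
    """Toy many-in-one model: sub-models share a prefix of layers."""
    def __init__(self, dim=64, vocab=100, depths=(2, 4, 6)):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(dim, dim) for _ in range(max(depths)))
        self.head = nn.Linear(dim, vocab)  # one head shared by all sub-models
        self.depths = depths

    def forward(self, x, targets):
        loss, h = 0.0, x
        for i, layer in enumerate(self.layers, start=1):
            h = torch.relu(layer(h))
            if i in self.depths:           # exit point for a sub-model
                loss = loss + nn.functional.cross_entropy(self.head(h), targets)
        return loss / len(self.depths)     # sorted fine-tuning style objective
```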

SortedNet, a Place for Every Network and Every Network in its Place: Towards a Generalized Solution for Training Many-in-One Neural Networks

no code implementations1 Sep 2023 Mojtaba Valipour, Mehdi Rezagholizadeh, Hossein Rajabzadeh, Parsa Kavehzadeh, Marzieh Tahaei, Boxing Chen, Ali Ghodsi

Deep neural networks (DNNs) must cater to a variety of users with different performance needs and budgets, leading to the costly practice of training, storing, and maintaining numerous specific models.

Image Classification Model Selection

Recurrent Neural Networks and Long Short-Term Memory Networks: Tutorial and Survey

no code implementations22 Apr 2023 Benyamin Ghojogh, Ali Ghodsi

Then, we introduce LSTM gates and cells, history and variants of LSTM, and Gated Recurrent Units (GRU).

Language Modelling Speech Recognition
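
For reference alongside the tutorial's gate discussion, a minimal numpy LSTM cell with the standard gate equations (shapes and stacking of the four gates are the usual convention):

```python
import numpy as np

def lstm_cell(x, h, c, W, b):
    """One LSTM step: gates i, f, o and candidate g computed from [x; h]."""
    z = W @ np.concatenate([x, h]) + b                    # all four gates at once
    i, f, o, g = np.split(z, 4)
    i, f, o = (1 / (1 + np.exp(-v)) for v in (i, f, o))   # sigmoid gates
    c_new = f * c + i * np.tanh(g)                        # cell-state update
    h_new = o * np.tanh(c_new)                            # hidden state
    return h_new, c_new

# W has shape (4*hidden, input+hidden); b has shape (4*hidden,).
```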

Improved knowledge distillation by utilizing backward pass knowledge in neural networks

no code implementations27 Jan 2023 Aref Jafari, Mehdi Rezagholizadeh, Ali Ghodsi

Augmenting the training set by adding this auxiliary data improves the performance of KD significantly and leads to a closer match between the student and the teacher.

Knowledge Distillation Model Compression
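
One way to realize the auxiliary-data idea above, as a hedged sketch rather than the paper's exact procedure: take a few gradient steps on the inputs toward regions where student and teacher disagree most, and add the resulting samples to the KD training set.

```python
import torch

def backward_pass_augment(x, teacher, student, steps=3, step_size=0.01):
    """Generate auxiliary samples where student and teacher diverge.

    Performs gradient ascent on the input to maximize the squared
    student-teacher output gap, yielding augmentation data for KD.
    """
    x_aux = x.clone().detach().requires_grad_(True)
    for _ in range(steps):
        gap = (student(x_aux) - teacher(x_aux)).pow(2).mean()
        (grad,) = torch.autograd.grad(gap, x_aux)
        x_aux = (x_aux + step_size * grad.sign()).detach().requires_grad_(True)
    return x_aux.detach()
```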

Continuation KD: Improved Knowledge Distillation through the Lens of Continuation Optimization

no code implementations12 Dec 2022 Aref Jafari, Ivan Kobyzev, Mehdi Rezagholizadeh, Pascal Poupart, Ali Ghodsi

Knowledge Distillation (KD) has been extensively used for natural language understanding (NLU) tasks to improve a small model's (a student) generalization by transferring the knowledge from a larger model (a teacher).

Knowledge Distillation Natural Language Understanding

DyLoRA: Parameter Efficient Tuning of Pre-trained Models using Dynamic Search-Free Low-Rank Adaptation

2 code implementations14 Oct 2022 Mojtaba Valipour, Mehdi Rezagholizadeh, Ivan Kobyzev, Ali Ghodsi

Our DyLoRA method trains LoRA blocks for a range of ranks instead of a single rank by sorting the representation learned by the adapter module at different ranks during training.

Natural Language Understanding Text Generation
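
The range-of-ranks idea can be sketched as follows, assuming a plain linear base layer; sampling a rank per training step and truncating the adapters is my reading of the abstract, so treat the details as illustrative.

```python
import torch
import torch.nn as nn

class DynamicRankLoRA(nn.Module):
    """LoRA block trained over a range of ranks (sketch of the DyLoRA idea).

    At each training step a rank b <= r_max is sampled and only the first
    b columns/rows of the adapters are used, so every truncated rank
    yields a working adapter at inference time.
    """
    def __init__(self, base: nn.Linear, r_max=8, alpha=16):
        super().__init__()
        self.base = base
        self.A = nn.Parameter(torch.randn(base.in_features, r_max) * 0.01)
        self.B = nn.Parameter(torch.zeros(r_max, base.out_features))
        self.r_max, self.scale = r_max, alpha / r_max

    def forward(self, x):
        b = torch.randint(1, self.r_max + 1, ()).item() if self.training else self.r_max
        delta = x @ self.A[:, :b] @ self.B[:b, :]  # truncated low-rank update
        return self.base(x) + self.scale * delta
```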

Do we need Label Regularization to Fine-tune Pre-trained Language Models?

no code implementations25 May 2022 Ivan Kobyzev, Aref Jafari, Mehdi Rezagholizadeh, Tianda Li, Alan Do-Omri, Peng Lu, Pascal Poupart, Ali Ghodsi

Knowledge Distillation (KD) is a prominent neural model compression technique that heavily relies on teacher network predictions to guide the training of a student model.

Knowledge Distillation Model Compression

When Chosen Wisely, More Data Is What You Need: A Universal Sample-Efficient Strategy For Data Augmentation

1 code implementation Findings (ACL) 2022 Ehsan Kamalloo, Mehdi Rezagholizadeh, Ali Ghodsi

From a pre-generated pool of augmented samples, Glitter adaptively selects a subset of worst-case samples with maximal loss, analogous to adversarial DA.

Data Augmentation Knowledge Distillation
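
The worst-case selection this abstract describes reduces to a top-k over per-sample losses; a minimal sketch, where the pool construction and the choice of k are assumptions:

```python
import torch

def select_worst_case(model, loss_fn, pool_x, pool_y, k):
    """From a pre-generated pool of augmented samples, keep the k samples
    with maximal loss (adversarial-DA-style selection)."""
    with torch.no_grad():
        losses = loss_fn(model(pool_x), pool_y)  # per-sample losses
    top = torch.topk(losses, k).indices
    return pool_x[top], pool_y[top]

# loss_fn must return per-sample values, e.g.:
# loss_fn = torch.nn.CrossEntropyLoss(reduction="none")
```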

Spectral, Probabilistic, and Deep Metric Learning: Tutorial and Survey

no code implementations23 Jan 2022 Benyamin Ghojogh, Ali Ghodsi, Fakhri Karray, Mark Crowley

In deep learning methods, we first introduce reconstruction autoencoders and supervised loss functions for metric learning.

Dimensionality Reduction Metric Learning

Generative Adversarial Networks and Adversarial Autoencoders: Tutorial and Survey

no code implementations26 Nov 2021 Benyamin Ghojogh, Ali Ghodsi, Fakhri Karray, Mark Crowley

Finally, we explain the autoencoders based on adversarial learning including adversarial autoencoder, PixelGAN, and implicit autoencoder.

Dimensionality Reduction Face Generation +4

Uniform Manifold Approximation and Projection (UMAP) and its Variants: Tutorial and Survey

no code implementations25 Aug 2021 Benyamin Ghojogh, Ali Ghodsi, Fakhri Karray, Mark Crowley

We start with UMAP algorithm where we explain probabilities of neighborhood in the input and embedding spaces, optimization of cost function, training algorithm, derivation of gradients, and supervised and semi-supervised embedding by UMAP.

Data Visualization Dimensionality Reduction
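
For the input-space neighborhood probabilities the abstract mentions, a minimal numpy sketch of UMAP-style fuzzy membership strengths; the bandwidth is fixed here rather than calibrated per point by UMAP's usual binary search.

```python
import numpy as np

def umap_input_memberships(X, sigma=1.0):
    """UMAP-style memberships p_{j|i} = exp(-(d_ij - rho_i) / sigma)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)               # exclude self-neighbors
    rho = d.min(axis=1, keepdims=True)        # distance to nearest neighbor
    p = np.exp(-np.maximum(d - rho, 0.0) / sigma)
    return p + p.T - p * p.T                  # fuzzy-union symmetrization
```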

Unified Framework for Spectral Dimensionality Reduction, Maximum Variance Unfolding, and Kernel Learning By Semidefinite Programming: Tutorial and Survey

no code implementations29 Jun 2021 Benyamin Ghojogh, Ali Ghodsi, Fakhri Karray, Mark Crowley

This is a tutorial and survey paper on the unification of spectral dimensionality reduction methods, kernel learning by Semidefinite Programming (SDP), Maximum Variance Unfolding (MVU) or Semidefinite Embedding (SDE), and its variants.

Dimensionality Reduction

SymbolicGPT: A Generative Transformer Model for Symbolic Regression

2 code implementations27 Jun 2021 Mojtaba Valipour, Bowen You, Maysum Panju, Ali Ghodsi

Symbolic regression is the task of identifying a mathematical expression that best fits a provided dataset of input and output values.

Language Modelling regression +1
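
The task definition in this abstract amounts to searching expression space and scoring candidates by fit; a tiny illustration of the scoring step (the candidate set here is hypothetical, standing in for the model's generated expressions):

```python
import numpy as np

# Score candidate symbolic expressions against data (x, y); a model like
# SymbolicGPT would generate the candidates, here they are hard-coded.
x = np.linspace(-2, 2, 100)
y = x**2 + np.sin(x)

candidates = {
    "x**2": x**2,
    "x**2 + sin(x)": x**2 + np.sin(x),
    "exp(x)": np.exp(x),
}
best = min(candidates, key=lambda e: np.mean((candidates[e] - y) ** 2))
print(best)  # "x**2 + sin(x)"
```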

Legendre Deep Neural Network (LDNN) and its application for approximation of nonlinear Volterra Fredholm Hammerstein integral equations

no code implementations27 Jun 2021 Zeinab Hajimohammadi, Kourosh Parand, Ali Ghodsi

In this paper, we propose Legendre Deep Neural Network (LDNN) for solving nonlinear Volterra Fredholm Hammerstein integral equations (VFHIEs).

Annealing Knowledge Distillation

1 code implementation EACL 2021 Aref Jafari, Mehdi Rezagholizadeh, Pranav Sharma, Ali Ghodsi

Knowledge distillation (KD) is a prominent model compression technique for deep neural networks in which the knowledge of a trained large teacher model is transferred to a smaller student model.

Image Classification Knowledge Distillation +1
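
The transfer this abstract describes is usually implemented as a temperature-softened teacher-matching term added to the task loss; a minimal sketch of that standard objective (the paper's specific annealing of the teacher signal is not reproduced here):

```python
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.9):
    """Standard KD objective: soft teacher matching + hard-label CE.

    Annealing KD varies how strongly the softened teacher signal is
    applied over training; this sketch keeps T and alpha fixed.
    """
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * T * T
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1 - alpha) * hard
```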

Generative Locally Linear Embedding

1 code implementation4 Apr 2021 Benyamin Ghojogh, Ali Ghodsi, Fakhri Karray, Mark Crowley

In this work, we propose two novel generative versions of LLE, named Generative LLE (GLLE), whose linear reconstruction steps are stochastic rather than deterministic.

Dimensionality Reduction Variational Inference

Legendre Deep Neural Network (LDNN) and its application for approximation of nonlinear Volterra–Fredholm–Hammerstein integral equations

no code implementations1 Jan 2021 Kourosh Parand, Zeinab Hajimohammadi, Ali Ghodsi

In particular, Volterra–Fredholm–Hammerstein integral equations are the main class of these integral equations, and researchers are interested in investigating and solving them.

Locally Linear Embedding and its Variants: Tutorial and Survey

1 code implementation22 Nov 2020 Benyamin Ghojogh, Ali Ghodsi, Fakhri Karray, Mark Crowley

In this paper, we first cover LLE, kernel LLE, inverse LLE, and feature fusion with LLE.

Dimensionality Reduction

Attention Mechanism, Transformers, BERT, and GPT: Tutorial and Survey

no code implementations17 Nov 2020 Benyamin Ghojogh, Ali Ghodsi

Thereafter, we introduce the Bidirectional Encoder Representations from Transformers (BERT) and the Generative Pre-trained Transformer (GPT) as stacks of transformer encoders and decoders, respectively.

Deep Attention Natural Language Inference +1
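
As a pointer to the mechanism this tutorial builds on, a minimal scaled dot-product attention sketch in numpy:

```python
import numpy as np

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V, the core operation of the transformer."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V
```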

Symbolically Solving Partial Differential Equations using Deep Learning

no code implementations12 Nov 2020 Maysum Panju, Kourosh Parand, Ali Ghodsi

We describe a neural-based method for generating exact or approximate solutions to differential equations in the form of mathematical expressions.

A Neuro-Symbolic Method for Solving Differential and Functional Equations

no code implementations4 Nov 2020 Maysum Panju, Ali Ghodsi

When neural networks are used to solve differential equations, they usually produce solutions in the form of black-box functions that are not directly mathematically interpretable.

Language Modelling valid

Stochastic Neighbor Embedding with Gaussian and Student-t Distributions: Tutorial and Survey

1 code implementation22 Sep 2020 Benyamin Ghojogh, Ali Ghodsi, Fakhri Karray, Mark Crowley

Stochastic Neighbor Embedding (SNE) is a manifold learning and dimensionality reduction method with a probabilistic approach.

Dimensionality Reduction
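
The probabilistic approach this abstract refers to: SNE converts pairwise distances to Gaussian neighbor probabilities (t-SNE replaces the Gaussian with a Student-t in the embedding space). A minimal sketch of the input-space side, with a fixed bandwidth instead of the usual per-point perplexity calibration:

```python
import numpy as np

def sne_probabilities(X, sigma=1.0):
    """Gaussian conditional probabilities p_{j|i} used by SNE."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    logits = -d2 / (2 * sigma**2)
    np.fill_diagonal(logits, -np.inf)  # enforce p_{i|i} = 0
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    return p / p.sum(axis=1, keepdims=True)
```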

Segmentation Approach for Coreference Resolution Task

no code implementations30 Jun 2020 Aref Jafari, Ali Ghodsi

This has been accomplished by defining an embedding method for the position of all members of a coreference cluster in a document and resolving all of them for a given mention.

coreference-resolution Position +1

DeepNovoV2: Better de novo peptide sequencing with deep learning

1 code implementation17 Apr 2019 Rui Qiao, Ngoc Hieu Tran, Lei Xin, Baozhen Shan, Ming Li, Ali Ghodsi

Personalized cancer vaccines are envisioned as the next generation rational cancer immunotherapy.

de novo peptide sequencing

Deep Variational Sufficient Dimensionality Reduction

no code implementations18 Dec 2018 Ershad Banijamali, Amir-Hossein Karimi, Ali Ghodsi

We consider the problem of sufficient dimensionality reduction (SDR), where the high-dimensional observation is transformed to a low-dimensional sub-space in which the information of the observations regarding the label variable is preserved.

Dimensionality Reduction General Classification

SRP: Efficient class-aware embedding learning for large-scale data via supervised random projections

1 code implementation7 Nov 2018 Amir-Hossein Karimi, Alexander Wong, Ali Ghodsi

While stochastic approximation strategies have been explored for unsupervised dimensionality reduction to tackle this challenge, such approaches are not well suited to accelerating supervised dimensionality reduction.

Supervised dimensionality reduction

Text Classification based on Multiple Block Convolutional Highways

no code implementations23 Jul 2018 Seyed Mahdi Rezaeinia, Ali Ghodsi, Rouhollah Rahmani

In the Text Classification areas of Sentiment Analysis, Subjectivity/Objectivity Analysis, and Opinion Polarity, Convolutional Neural Networks have gained special attention because of their performance and accuracy.

General Classification Sentiment Analysis +2

A Berkeley View of Systems Challenges for AI

no code implementations15 Dec 2017 Ion Stoica, Dawn Song, Raluca Ada Popa, David Patterson, Michael W. Mahoney, Randy Katz, Anthony D. Joseph, Michael Jordan, Joseph M. Hellerstein, Joseph E. Gonzalez, Ken Goldberg, Ali Ghodsi, David Culler, Pieter Abbeel

With the increasing commoditization of computer vision, speech recognition and machine translation systems and the widespread deployment of learning-based back-end technologies such as digital advertising and intelligent infrastructures, AI (Artificial Intelligence) has moved from research labs to production.

Machine Translation speech-recognition +1

Disentangling Dynamics and Content for Control and Planning

no code implementations24 Nov 2017 Ershad Banijamali, Ahmad Khajenezhad, Ali Ghodsi, Mohammad Ghavamzadeh

In this paper, we study the problem of learning a controllable representation for high-dimensional observations of dynamical systems.

JADE: Joint Autoencoders for Dis-Entanglement

no code implementations24 Nov 2017 Ershad Banijamali, Amir-Hossein Karimi, Alexander Wong, Ali Ghodsi

The problem of feature disentanglement has been explored in the literature, for the purpose of image and video processing and text analysis.

Disentanglement General Classification

Improving the Accuracy of Pre-trained Word Embeddings for Sentiment Analysis

1 code implementation23 Nov 2017 Seyed Mahdi Rezaeinia, Ali Ghodsi, Rouhollah Rahmani

In this paper we propose a novel method, Improved Word Vectors (IWV), which increases the accuracy of pre-trained word embeddings in sentiment analysis.

Marketing Part-Of-Speech Tagging +5

Robust Locally-Linear Controllable Embedding

no code implementations15 Oct 2017 Ershad Banijamali, Rui Shu, Mohammad Ghavamzadeh, Hung Bui, Ali Ghodsi

We also propose a principled variational approximation of the embedding posterior that takes the future observation into account, and thus, makes the variational approximation more robust against the noise.

Deep Structure for end-to-end inverse rendering

no code implementations25 Aug 2017 Shima Kamyab, Ali Ghodsi, S. Zohreh Azimifar

Inverse rendering in 3D refers to recovering the 3D properties of a scene given 2D input image(s), and is typically done using 3D Morphable Model (3DMM) based methods on single-view images.

Inverse Rendering

Fast Spectral Clustering Using Autoencoders and Landmarks

no code implementations7 Apr 2017 Ershad Banijamali, Ali Ghodsi

Spectral clustering is a powerful clustering algorithm that suffers from high computational complexity, due to eigendecomposition.

Clustering

Generative Mixture of Networks

no code implementations10 Feb 2017 Ershad Banijamali, Ali Ghodsi, Pascal Poupart

The model consists of K networks that are trained together to learn the underlying distribution of a given data set.

Clustering

Semi-Supervised Representation Learning based on Probabilistic Labeling

no code implementations10 May 2016 Ershad Banijamali, Ali Ghodsi

Then, we map the data to a lower-dimensional space using a linear transformation such that the dependency between the transformed data and the assigned labels is maximized.

Representation Learning
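
The dependency-maximizing linear map this abstract describes is commonly obtained from an HSIC-style eigenproblem; a hedged sketch, assuming hard one-hot labels where the paper uses probabilistic ones:

```python
import numpy as np

def hsic_projection(X, labels, dim=2):
    """Linear projection maximizing HSIC-style dependence with labels.

    Solves the eigenproblem of X^T H K H X, where K is a (one-hot)
    label kernel and H the centering matrix; the top eigenvectors
    form the projection W.
    """
    n = X.shape[0]
    Y = np.eye(labels.max() + 1)[labels]  # one-hot labels
    K = Y @ Y.T                           # label kernel
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    M = X.T @ H @ K @ H @ X
    w, V = np.linalg.eigh(M)
    W = V[:, np.argsort(w)[::-1][:dim]]   # top eigenvectors
    return X @ W                          # transformed data
```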

Semi-supervised Dictionary Learning Based on Hilbert-Schmidt Independence Criterion

no code implementations25 Apr 2016 Mehrdad J. Gangeh, Safaa M. A. Bedawi, Ali Ghodsi, Fakhri Karray

The proposed method benefits from the supervisory information by learning the dictionary in a space where the dependency between the data and class labels is maximized.

Dictionary Learning

On the Invariance of Dictionary Learning and Sparse Representation to Projecting Data to a Discriminative Space

no code implementations6 Mar 2015 Mehrdad J. Gangeh, Ali Ghodsi

In this paper, it is proved that dictionary learning and sparse representation are invariant to a linear transformation.

Dictionary Learning

Supervised Dictionary Learning and Sparse Representation-A Review

no code implementations20 Feb 2015 Mehrdad J. Gangeh, Ahmed K. Farahat, Ali Ghodsi, Mohamed S. Kamel

This review provides a broad, yet deep, view of the state-of-the-art methods for S-DLSR and allows for the advancement of research and development in this emerging area of research.

Denoising Dictionary Learning +1

Greedy Column Subset Selection for Large-scale Data Sets

no code implementations24 Dec 2013 Ahmed K. Farahat, Ahmed Elgohary, Ali Ghodsi, Mohamed S. Kamel

The algorithm first learns a concise representation of all columns using random projection, and it then solves a generalized column subset selection problem at each machine, in which a subset of columns is selected from the sub-matrix on that machine such that the reconstruction error of the concise representation is minimized.
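
A hedged sketch of the two ingredients the abstract names, in the single-machine case (the paper distributes the selection across machines): a random-projection sketch of the matrix, then greedy selection of the columns that best reconstruct that sketch.

```python
import numpy as np

def greedy_css(A, k, sketch_dim=32, seed=0):
    """Greedy column subset selection on a random-projection sketch."""
    rng = np.random.default_rng(seed)
    B = A @ rng.normal(size=(A.shape[1], sketch_dim))  # concise representation
    selected = []
    for _ in range(k):
        best, best_err = None, np.inf
        for j in range(A.shape[1]):
            if j in selected:
                continue
            S = A[:, selected + [j]]
            # residual of B after projecting onto span of selected columns
            err = np.linalg.norm(B - S @ np.linalg.lstsq(S, B, rcond=None)[0])
            if err < best_err:
                best, best_err = j, err
        selected.append(best)
    return selected
```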

A Fast Greedy Algorithm for Generalized Column Subset Selection

no code implementations24 Dec 2013 Ahmed K. Farahat, Ali Ghodsi, Mohamed S. Kamel

This paper defines a generalized column subset selection problem which is concerned with the selection of a few columns from a source matrix A that best approximate the span of a target matrix B.
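
The selection criterion in the generalized problem can be written directly; a minimal sketch of the objective (a greedy search over it would proceed as in the sketch for the previous entry):

```python
import numpy as np

def generalized_css_error(A, B, cols):
    """||B - P_S B||_F, where P_S projects onto span(A[:, cols]).

    The generalized column subset selection objective: how well the
    chosen columns of A approximate the span of the target matrix B.
    """
    S = A[:, cols]
    coeffs, *_ = np.linalg.lstsq(S, B, rcond=None)
    return np.linalg.norm(B - S @ coeffs)
```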

Highly Available Transactions: Virtues and Limitations (Extended Version)

no code implementations1 Feb 2013 Peter Bailis, Aaron Davidson, Alan Fekete, Ali Ghodsi, Joseph M. Hellerstein, Ion Stoica

To minimize network latency and remain online during server failures and network partitions, many modern distributed data storage systems eschew transactional functionality, which provides strong semantic guarantees for groups of multiple operations over multiple data items.

Databases

Kernelized Supervised Dictionary Learning

no code implementations10 Jul 2012 Mehrdad J. Gangeh, Ali Ghodsi, Mohamed S. Kamel

In this paper, we propose supervised dictionary learning (SDL) by incorporating information on class labels into the learning of the dictionary.

Dictionary Learning
