Search Results for author: Hossein Taheri

Found 10 papers, 1 paper with code

On the Optimization and Generalization of Multi-head Attention

no code implementations 19 Oct 2023 Puneesh Deora, Rouzbeh Ghaderi, Hossein Taheri, Christos Thrampoulidis

Finally, we demonstrate that these conditions are satisfied for a simple tokenized-mixture model.

Fast Convergence in Learning Two-Layer Neural Networks with Separable Data

no code implementations 22 May 2023 Hossein Taheri, Christos Thrampoulidis

Normalized gradient descent has shown substantial success in speeding up the convergence of exponentially-tailed loss functions (which includes exponential and logistic losses) on linear classifiers with separable data.
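
A minimal sketch of normalized gradient descent on the logistic loss for a linear classifier, i.e., the setting the excerpt refers to (the paper itself concerns two-layer networks; the toy data and step size below are placeholders, not the paper's setup):

```python
import numpy as np

def normalized_gd_logistic(X, y, steps=500, eta=1.0):
    """Gradient descent on the logistic loss with the update normalized by
    the gradient norm, so the effective step stays large even as the loss
    (and hence the raw gradient) shrinks exponentially on separable data."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        margins = y * (X @ w)
        # gradient of (1/n) * sum_i log(1 + exp(-y_i x_i^T w))
        grad = -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
        g_norm = np.linalg.norm(grad)
        if g_norm < 1e-12:
            break
        w -= eta * grad / g_norm   # normalized step
    return w

# toy separable data (placeholder)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
w_star = rng.normal(size=5)
y = np.sign(X @ w_star)
w_hat = normalized_gd_logistic(X, y)
```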

Generalization Bounds

Generalization and Stability of Interpolating Neural Networks with Minimal Width

no code implementations 18 Feb 2023 Hossein Taheri, Christos Thrampoulidis

Specifically, in a realizable scenario where model weights can achieve arbitrarily small training error $\epsilon$ and their distance from initialization is $g(\epsilon)$, we demonstrate that gradient descent with $n$ training data achieves training error $O(g(1/T)^2 /T)$ and generalization error $O(g(1/T)^2 /n)$ at iteration $T$, provided there are at least $m=\Omega(g(1/T)^4)$ hidden neurons.
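
For readability, the rates quoted above written in display form (same quantities as in the excerpt, nothing new):

$$\text{training error at iteration } T:\ O\!\left(\tfrac{g(1/T)^2}{T}\right), \qquad \text{generalization error}:\ O\!\left(\tfrac{g(1/T)^2}{n}\right), \qquad \text{width requirement}:\ m=\Omega\!\left(g(1/T)^4\right).$$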

On Generalization of Decentralized Learning with Separable Data

no code implementations 15 Sep 2022 Hossein Taheri, Christos Thrampoulidis

Motivated by overparameterized learning settings, in which models are trained to zero training loss, we study algorithmic and generalization properties of decentralized learning with gradient descent on separable data.
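
As a rough illustration of the algorithmic setting only (not the paper's specific algorithm, assumptions, or analysis), decentralized gradient descent with a mixing matrix typically looks like:

```python
import numpy as np

def decentralized_gd(local_grads, W, w0, steps=200, eta=0.1):
    """Generic decentralized gradient descent: every node keeps its own
    iterate, mixes it with its neighbours' iterates via the (doubly
    stochastic) matrix W, then takes a gradient step on its local data."""
    K = len(local_grads)                          # number of nodes
    w = np.tile(np.asarray(w0, dtype=float), (K, 1))  # one iterate per node (K x d)
    for _ in range(steps):
        w = W @ w                                 # consensus / gossip averaging
        for i in range(K):
            w[i] -= eta * local_grads[i](w[i])    # local gradient step on node i's data
    return w                                      # per-node iterates
```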

Generalization Bounds

Asymptotic Behavior of Adversarial Training in Binary Classification

no code implementations 26 Oct 2020 Hossein Taheri, Ramtin Pedarsani, Christos Thrampoulidis

It has been consistently reported that many machine learning models are susceptible to adversarial attacks, i.e., small additive adversarial perturbations applied to data points can cause misclassification.
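
To make the quoted statement concrete, here is the textbook additive perturbation for a linear classifier (a standard sign-gradient construction, used purely as an illustration and not the attack model analyzed in the paper):

```python
import numpy as np

def fgsm_linear(w, x, y, eps):
    """For a linear classifier sign(w^T x), the worst-case ell_inf-bounded
    additive perturbation pushes x against its label: x' = x - eps*y*sign(w)."""
    return x - eps * y * np.sign(w)

w = np.array([1.0, -2.0, 0.5])
x = np.array([0.3, -0.2, 0.1])
y = np.sign(w @ x)                          # correctly classified point
x_adv = fgsm_linear(w, x, y, eps=0.4)
print(np.sign(w @ x), np.sign(w @ x_adv))   # the predicted label flips here
```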

Binary Classification, Classification +1

Fundamental Limits of Ridge-Regularized Empirical Risk Minimization in High Dimensions

no code implementations 16 Jun 2020 Hossein Taheri, Ramtin Pedarsani, Christos Thrampoulidis

For a stylized setting with Gaussian features and problem dimensions that grow large at a proportional rate, we start with sharp performance characterizations and then derive tight lower bounds on the estimation and prediction error that hold over a wide class of loss functions and for any value of the regularization parameter.
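
For context, ridge-regularized ERM in this kind of setting is typically of the form below (a generic template under Gaussian features and proportional growth; the notation is illustrative, not the paper's):

$$\hat{w}_\lambda \in \arg\min_{w\in\mathbb{R}^d}\ \frac{1}{n}\sum_{i=1}^{n}\ell\big(y_i,\,x_i^\top w\big) + \frac{\lambda}{2}\|w\|_2^2, \qquad n,d\to\infty,\ \ d/n\to\delta.$$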

Quantized Decentralized Stochastic Learning over Directed Graphs

no code implementations ICML 2020 Hossein Taheri, Aryan Mokhtari, Hamed Hassani, Ramtin Pedarsani

We consider a decentralized stochastic learning problem where data points are distributed among computing nodes communicating over a directed graph.
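
For intuition, communication-efficient decentralized methods usually compress what each node transmits with an unbiased stochastic quantizer along these lines (a generic QSGD-style sketch; the paper's specific quantizer and its handling of directed graphs may differ):

```python
import numpy as np

def stochastic_quantize(v, num_levels=16):
    """Unbiased stochastic uniform quantization: each coordinate is rounded
    up or down to one of `num_levels` levels of |v|, with probabilities chosen
    so that E[Q(v)] = v. Nodes transmit the cheaper quantized vector."""
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    scaled = np.abs(v) / norm * num_levels
    lower = np.floor(scaled)
    prob_up = scaled - lower                              # chance of rounding up
    rounded = lower + (np.random.rand(*v.shape) < prob_up)
    return np.sign(v) * rounded * norm / num_levels
```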

Quantization

Sharp Asymptotics and Optimal Performance for Inference in Binary Models

no code implementations 17 Feb 2020 Hossein Taheri, Ramtin Pedarsani, Christos Thrampoulidis

We study convex empirical risk minimization for high-dimensional inference in binary models.
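
One common formalization of convex ERM for binary models, given here only as a generic template (the paper's exact loss class and link function may differ):

$$y_i\in\{\pm 1\},\quad \mathbb{P}(y_i=1\mid a_i)=f\big(a_i^\top x_0\big), \qquad \hat{x}\in\arg\min_{x\in\mathbb{R}^d}\ \frac{1}{n}\sum_{i=1}^{n}\ell\big(y_i\,a_i^\top x\big).$$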

Sharp Guarantees for Solving Random Equations with One-Bit Information

no code implementations 12 Aug 2019 Hossein Taheri, Ramtin Pedarsani, Christos Thrampoulidis

We study the performance of a wide class of convex optimization-based estimators for recovering a signal from corrupted one-bit measurements in high-dimensions.
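
A standard way to write the corrupted one-bit measurement model referenced above (illustrative only; the paper's corruption model and estimator class may be more general):

$$y_i=\varepsilon_i\,\mathrm{sign}\big(a_i^\top x_0\big),\quad \varepsilon_i\in\{\pm1\}\ \text{(possible sign corruption)}, \qquad \hat{x}\in\arg\min_{x}\ \frac{1}{m}\sum_{i=1}^{m}\mathcal{L}\big(y_i\,a_i^\top x\big).$$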

Robust and Communication-Efficient Collaborative Learning

1 code implementation NeurIPS 2019 Amirhossein Reisizadeh, Hossein Taheri, Aryan Mokhtari, Hamed Hassani, Ramtin Pedarsani

We consider a decentralized learning problem, where a set of computing nodes aim at solving a non-convex optimization problem collaboratively.

Quantization
