Search Results for author: Mahdi Soltanolkotabi

Found 39 papers, 9 papers with code

Compressive sensing with un-trained neural networks: Gradient descent finds a smooth approximation

1 code implementation • ICML 2020 • Reinhard Heckel, Mahdi Soltanolkotabi

For signal recovery from a few measurements, however, un-trained convolutional networks have an intriguing self-regularizing property: Even though the network can perfectly fit any image, the network recovers a natural image from few measurements when trained with gradient descent until convergence.

Compressive Sensing · Denoising

Small random initialization is akin to spectral learning: Optimization and generalization guarantees for overparameterized low-rank matrix reconstruction

no code implementations • 28 Jun 2021 • Dominik Stöger, Mahdi Soltanolkotabi

Recently there has been significant theoretical progress on understanding the convergence and generalization of gradient-based methods on nonconvex losses with overparameterized models.

Generalization Guarantees for Neural Architecture Search with Train-Validation Split

no code implementations • 29 Apr 2021 • Samet Oymak, Mingchen Li, Mahdi Soltanolkotabi

In this approach, it is common to use bilevel optimization where one optimizes the model weights over the training data (lower-level problem) and various hyperparameters such as the configuration of the architecture over the validation data (upper-level problem).

bilevel optimization · Generalization Bounds · +1

Understanding Overparameterization in Generative Adversarial Networks

no code implementations • 12 Apr 2021 • Yogesh Balaji, Mohammadmahdi Sajedi, Neha Mukund Kalibhat, Mucong Ding, Dominik Stöger, Mahdi Soltanolkotabi, Soheil Feizi

We also empirically study the role of model overparameterization in GANs using several large-scale experiments on CIFAR-10 and Celeb-A datasets.

PipeTransformer: Automated Elastic Pipelining for Distributed Training of Transformers

no code implementations • 5 Feb 2021 • Chaoyang He, Shen Li, Mahdi Soltanolkotabi, Salman Avestimehr

PipeTransformer automatically adjusts the pipelining and data parallelism by identifying and freezing some layers during training, reallocating resources to train the remaining active layers.

Understanding Over-parameterization in Generative Adversarial Networks

no code implementations • ICLR 2021 • Yogesh Balaji, Mohammadmahdi Sajedi, Neha Mukund Kalibhat, Mucong Ding, Dominik Stöger, Mahdi Soltanolkotabi, Soheil Feizi

In this work, we present a comprehensive analysis of the importance of model over-parameterization in GANs both theoretically and empirically.

Data augmentation for deep learning based accelerated MRI reconstruction

no code implementations • 1 Jan 2021 • Zalan Fabian, Reinhard Heckel, Mahdi Soltanolkotabi

Inspired by the success of Data Augmentation (DA) for classification problems, in this paper, we propose a pipeline for data augmentation for image reconstruction tasks arising in medical imaging and explore its effectiveness at reducing the required training data in a variety of settings.

Data Augmentation · Image Restoration · +1

Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View

no code implementations • NeurIPS 2020 • Christos Thrampoulidis, Samet Oymak, Mahdi Soltanolkotabi

Our theoretical analysis allows us to precisely characterize how the test error varies over different training algorithms, data distributions, problem dimensions as well as number of classes, inter/intra class correlations and class priors.

Classification · General Classification

Precise Statistical Analysis of Classification Accuracies for Adversarial Training

no code implementations • 21 Oct 2020 • Adel Javanmard, Mahdi Soltanolkotabi

Despite the wide empirical success of modern machine learning algorithms and models in a multitude of applications, they are known to be highly susceptible to seemingly small indiscernible perturbations to the input data known as adversarial attacks.

Classification · General Classification

Minimax Lower Bounds for Transfer Learning with Linear and One-hidden Layer Neural Networks

2 code implementations • NeurIPS 2020 • Seyed Mohammadreza Mousavi Kalan, Zalan Fabian, A. Salman Avestimehr, Mahdi Soltanolkotabi

In this approach a model trained for a source task, where plenty of labeled training data is available, is used as a starting point for training a model on a related target task with only a small amount of labeled training data.

Transfer Learning

Approximation Schemes for ReLU Regression

no code implementations • 26 May 2020 • Ilias Diakonikolas, Surbhi Goel, Sushrut Karmalkar, Adam R. Klivans, Mahdi Soltanolkotabi

We consider the fundamental problem of ReLU regression, where the goal is to output the best fitting ReLU with respect to square loss given access to draws from some unknown distribution.

Compressive sensing with un-trained neural networks: Gradient descent finds the smoothest approximation

1 code implementation • 7 May 2020 • Reinhard Heckel, Mahdi Soltanolkotabi

For signal recovery from a few measurements, however, un-trained convolutional networks have an intriguing self-regularizing property: Even though the network can perfectly fit any image, the network recovers a natural image from few measurements when trained with gradient descent until convergence.

Compressive Sensing · Denoising

High-Dimensional Robust Mean Estimation via Gradient Descent

no code implementations • ICML 2020 • Yu Cheng, Ilias Diakonikolas, Rong Ge, Mahdi Soltanolkotabi

We study the problem of high-dimensional robust mean estimation in the presence of a constant fraction of adversarial outliers.

Precise Tradeoffs in Adversarial Training for Linear Regression

no code implementations • 24 Feb 2020 • Adel Javanmard, Mahdi Soltanolkotabi, Hamed Hassani

Furthermore, we precisely characterize the standard/robust accuracy and the corresponding tradeoff achieved by a contemporary mini-max adversarial training approach in a high-dimensional regime where the number of data points and the parameters of the model grow in proportion to each other.

Convergence and sample complexity of gradient methods for the model-free linear quadratic regulator problem

no code implementations • 26 Dec 2019 • Hesameddin Mohammadi, Armin Zare, Mahdi Soltanolkotabi, Mihailo R. Jovanović

Model-free reinforcement learning attempts to find an optimal control action for an unknown dynamical system by directly searching over the parameter space of controllers.

Denoising and Regularization via Exploiting the Structural Bias of Convolutional Generators

1 code implementation • ICLR 2020 • Reinhard Heckel, Mahdi Soltanolkotabi

A surprising experiment that highlights this architectural bias towards natural images is that one can remove noise and corruptions from a natural image without using any training data, by simply fitting (via gradient descent) a randomly initialized, over-parameterized convolutional generator to the corrupted image.

Denoising · Image Generation
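A toy 1-D version of this experiment can be sketched in a few lines: a fixed dictionary of smooth bumps stands in for the untrained convolutional generator (the dictionary, problem sizes, and step size here are illustrative, not from the paper), and fitting its coefficients to a noisy signal by gradient descent removes most of the noise because the generator can only produce smooth outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D analogue of an untrained convolutional generator: a fixed
# dictionary U of smooth Gaussian bumps, so the generator's range
# contains only smooth signals (all constants are illustrative).
n, k = 256, 16
t = np.linspace(0, 1, n)
centers = np.linspace(0, 1, k)
U = np.exp(-((t[:, None] - centers[None, :]) ** 2) / (2 * 0.05 ** 2))

clean = np.sin(2 * np.pi * 2 * t)              # smooth ground-truth signal
noisy = clean + 0.5 * rng.standard_normal(n)   # corrupted observation

# Fit the generator output U @ w to the *noisy* signal by gradient
# descent on the least-squares loss ||U w - y||^2.
w = np.zeros(k)
lr = 1e-2
for _ in range(2000):
    w -= lr * (U.T @ (U @ w - noisy))

denoised = U @ w
err_noisy = np.mean((noisy - clean) ** 2)
err_denoised = np.mean((denoised - clean) ** 2)
print(err_denoised < err_noisy)  # the smooth generator filters out the noise
```

The least-squares fit projects the noisy signal onto the generator's low-dimensional smooth range, which captures the clean signal but little of the noise.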

Generalization Guarantees for Neural Networks via Harnessing the Low-rank Structure of the Jacobian

no code implementations • 12 Jun 2019 • Samet Oymak, Zalan Fabian, Mingchen Li, Mahdi Soltanolkotabi

We show that over the information space learning is fast and one can quickly train a model with zero training loss that can also generalize well.

Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks

1 code implementation • 27 Mar 2019 • Mingchen Li, Mahdi Soltanolkotabi, Samet Oymak

In particular, we prove that: (i) in the first few iterations, where the updates are still in the vicinity of the initialization, gradient descent fits only the correct labels, essentially ignoring the noisy labels.

Towards moderate overparameterization: global convergence guarantees for training shallow neural networks

no code implementations • 12 Feb 2019 • Samet Oymak, Mahdi Soltanolkotabi

However, in practice much more moderate levels of overparameterization seem to be sufficient, and in many cases overparameterized models seem to perfectly interpolate the training data as soon as the number of parameters exceeds the size of the training data by a constant factor.

Fitting ReLUs via SGD and Quantized SGD

no code implementations • 19 Jan 2019 • Seyed Mohammadreza Mousavi Kalan, Mahdi Soltanolkotabi, A. Salman Avestimehr

Perhaps unexpectedly, we show that QSGD maintains the fast convergence of SGD to a globally optimal model while significantly reducing the communication cost.
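The flavor of this result can be illustrated with a toy experiment: fit a single ReLU with mini-batch SGD while rounding each stochastic gradient to low precision before applying it. The quantizer, batch size, and step size below are illustrative choices, not the paper's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(g, bits=6):
    # Round each coordinate to a low-precision grid set by the largest
    # magnitude -- a simple stand-in for the quantized gradients that
    # QSGD communicates (the exact quantizer here is illustrative).
    scale = np.abs(g).max()
    if scale == 0.0:
        return g
    levels = 2 ** (bits - 1) - 1
    return np.round(g / scale * levels) / levels * scale

d, n = 20, 500
w_true = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = np.maximum(0.0, X @ w_true)           # noiseless ReLU observations

w = 0.01 * rng.standard_normal(d)
lr, batch = 0.5, 32
for _ in range(2000):
    idx = rng.integers(0, n, size=batch)  # mini-batch stochastic gradient
    z = X[idx] @ w
    residual = (np.maximum(0.0, z) - y[idx]) * (z > 0)
    g = X[idx].T @ residual / batch
    w -= lr * quantize(g)                 # apply the low-precision gradient

rel_err = np.linalg.norm(w - w_true) / np.linalg.norm(w_true)
print(rel_err)                            # small relative error
```

Because the data are noiseless, the true weight vector is an exact fixed point (the gradient, and hence its quantization, vanishes there), so the quantized iterates can still converge to it.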

Overparameterized Nonlinear Learning: Gradient Descent Takes the Shortest Path?

no code implementations • 25 Dec 2018 • Samet Oymak, Mahdi Soltanolkotabi

In this paper we demonstrate that when the loss has certain properties over a minimally small neighborhood of the initial point, first order methods such as (stochastic) gradient descent have a few intriguing properties: (1) the iterates converge at a geometric rate to a global optimum even when the loss is nonconvex, (2) among all global optima of the loss, the iterates converge to one with a near minimal distance to the initial point, (3) the iterates take a near direct route from the initial point to this global optimum.
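Property (2) can be checked directly in the linear special case: for overparameterized least squares, every gradient lies in the row space of the data matrix, so gradient descent converges to the interpolating solution closest to its initialization. A minimal numpy sketch (problem sizes and step size are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                  # overparameterized: more parameters than data
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Gradient descent on the least-squares loss ||Xw - y||^2, whose global
# optima form an entire affine subspace of interpolating solutions.
w0 = rng.standard_normal(d)
w = w0.copy()
lr = 1e-3
for _ in range(5000):
    w -= lr * X.T @ (X @ w - y)

# Among all interpolating solutions, the one closest to w0 is
# w0 plus the minimum-norm correction X^T (X X^T)^{-1} (y - X w0).
w_closest = w0 + X.T @ np.linalg.solve(X @ X.T, y - X @ w0)
print(np.allclose(w, w_closest, atol=1e-6))  # GD finds the closest optimum
```

Every update is a combination of the rows of X, so the iterates never leave the affine space w0 + row-space(X); the unique interpolator in that space is exactly the one nearest w0.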

Compressed Sensing with Deep Image Prior and Learned Regularization

1 code implementation • 17 Jun 2018 • Dave Van Veen, Ajil Jalal, Mahdi Soltanolkotabi, Eric Price, Sriram Vishwanath, Alexandros G. Dimakis

We propose a novel method for compressed sensing recovery using untrained deep generative models.

Lagrange Coded Computing: Optimal Design for Resiliency, Security and Privacy

no code implementations • 4 Jun 2018 • Qian Yu, Songze Li, Netanel Raviv, Seyed Mohammadreza Mousavi Kalan, Mahdi Soltanolkotabi, Salman Avestimehr

We consider a scenario involving computations over a massive dataset stored in a distributed fashion across multiple workers, which is at the core of distributed learning algorithms.

Polynomially Coded Regression: Optimal Straggler Mitigation via Data Encoding

no code implementations • 24 May 2018 • Songze Li, Seyed Mohammadreza Mousavi Kalan, Qian Yu, Mahdi Soltanolkotabi, A. Salman Avestimehr

In particular, PCR requires a recovery threshold that scales inversely with the amount of computation/storage available at each worker.

End-to-end Learning of a Convolutional Neural Network via Deep Tensor Decomposition

no code implementations • 16 May 2018 • Samet Oymak, Mahdi Soltanolkotabi

In this paper we study the problem of learning the weights of a deep convolutional neural network.

Tensor Decomposition

Fundamental Resource Trade-offs for Encoded Distributed Optimization

no code implementations • 31 Mar 2018 • A. Salman Avestimehr, Seyed Mohammadreza Mousavi Kalan, Mahdi Soltanolkotabi

We also analyze the convergence behavior of iterative encoded optimization algorithms, allowing us to characterize fundamental trade-offs between convergence rate, size of data set, accuracy, computational load (or data redundancy), and straggler toleration in this framework.

Distributed Computing · Distributed Optimization

Gradient Methods for Submodular Maximization

no code implementations • NeurIPS 2017 • Hamed Hassani, Mahdi Soltanolkotabi, Amin Karbasi

Despite the apparent lack of convexity in such functions, we prove that stochastic projected gradient methods can provide strong approximation guarantees for maximizing continuous submodular functions with convex constraints.

Active Learning

Theoretical insights into the optimization landscape of over-parameterized shallow neural networks

no code implementations • 16 Jul 2017 • Mahdi Soltanolkotabi, Adel Javanmard, Jason D. Lee

In this paper we study the problem of learning a shallow artificial neural network that best fits a training data set.

Learning ReLUs via Gradient Descent

no code implementations • NeurIPS 2017 • Mahdi Soltanolkotabi

In this paper we study the problem of learning Rectified Linear Units (ReLUs), which are functions of the form $\max(0, \langle w, x \rangle)$ with $w$ denoting the weight vector.
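A minimal numerical sketch of this setup (sizes, initialization, and step size are illustrative): generate noiseless ReLU observations and run gradient descent on the empirical square loss, using the ReLU subgradient.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 20, 500
w_true = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = np.maximum(0.0, X @ w_true)   # ReLU observations y_i = max(0, <w*, x_i>)

# Gradient descent on the empirical square loss
#   L(w) = 1/(2n) sum_i (max(0, <w, x_i>) - y_i)^2,
# with 1{<w, x_i> > 0} as the subgradient of the ReLU.
w = 0.01 * rng.standard_normal(d)
lr = 0.5
for _ in range(500):
    z = X @ w
    residual = (np.maximum(0.0, z) - y) * (z > 0)
    w -= lr * (X.T @ residual) / n

rel_err = np.linalg.norm(w - w_true) / np.linalg.norm(w_true)
print(rel_err)                    # small relative error
```

Despite the nonconvexity introduced by the ReLU, with Gaussian inputs and enough samples relative to the dimension the iterates recover the planted weight vector.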

Structured signal recovery from quadratic measurements: Breaking sample complexity barriers via nonconvex optimization

no code implementations • 20 Feb 2017 • Mahdi Soltanolkotabi

We focus on the under-determined setting where the number of measurements is significantly smaller than the dimension of the signal ($m \ll n$).

Image Reconstruction

Fast and Reliable Parameter Estimation from Nonlinear Observations

no code implementations • 23 Oct 2016 • Samet Oymak, Mahdi Soltanolkotabi

In this paper we study the problem of recovering a structured but unknown parameter $\theta^*$ from $n$ nonlinear observations of the form $y_i = f(\langle x_i, \theta^* \rangle)$ for $i = 1, 2, \ldots, n$.

Sharp Time–Data Tradeoffs for Linear Inverse Problems

no code implementations • 16 Jul 2015 • Samet Oymak, Benjamin Recht, Mahdi Soltanolkotabi

We sharply characterize the convergence rate associated with a wide variety of random measurement ensembles in terms of the number of measurements and structural complexity of the signal with respect to the chosen penalty function.

Isometric sketching of any set via the Restricted Isometry Property

no code implementations • 11 Jun 2015 • Samet Oymak, Benjamin Recht, Mahdi Soltanolkotabi

In this paper we show that, for the purposes of dimensionality reduction, a certain class of structured random matrices behaves similarly to random Gaussian matrices.

Dimensionality Reduction

Approximate Subspace-Sparse Recovery with Corrupted Data via Constrained $\ell_1$-Minimization

no code implementations • 23 Dec 2014 • Ehsan Elhamifar, Mahdi Soltanolkotabi, Shankar Sastry

High-dimensional data often lie in low-dimensional subspaces corresponding to different classes they belong to.

Robust subspace clustering

no code implementations • 11 Jan 2013 • Mahdi Soltanolkotabi, Ehsan Elhamifar, Emmanuel J. Candès

Subspace clustering refers to the task of finding a multi-subspace representation that best fits a collection of points taken from a high-dimensional space.
