1 code implementation • ICML 2020 • Reinhard Heckel, Mahdi Soltanolkotabi
For signal recovery from a few measurements, however, un-trained convolutional networks have an intriguing self-regularizing property: Even though the network can perfectly fit any image, the network recovers a natural image from few measurements when trained with gradient descent until convergence.
1 code implementation • 2 Jan 2025 • Mohammad Shahab Sepehri, Asal Mehradfar, Mahdi Soltanolkotabi, Salman Avestimehr
Predicting Bitcoin price remains a challenging problem due to the high volatility and complex non-linear dynamics of cryptocurrency markets.
no code implementations • 24 Nov 2024 • Hesameddin Mohammadi, Mohammad Tinati, Stephen Tu, Mahdi Soltanolkotabi, Mihailo R. Jovanović
We demonstrate that the Schur complement to a principal eigenspace of the target matrix is governed by an autonomous system that is decoupled from the rest of the dynamics.
2 code implementations • 23 Sep 2024 • Mohammad Shahab Sepehri, Zalan Fabian, Maryam Soltanolkotabi, Mahdi Soltanolkotabi
Multimodal Large Language Models (MLLMs) have tremendous potential to improve the accuracy, availability, and cost-effectiveness of healthcare by providing automated solutions or serving as aids to medical professionals.
Ranked #1 on Medical Visual Question Answering on MediConfusion
1 code implementation • 29 Aug 2024 • Mohammadamin Banayeeanzade, Mahdi Soltanolkotabi, Mohammad Rostami
Despite the wide practical adoption of CL and MTL and extensive literature on both areas, there remains a gap in the theoretical understanding of these methods when used with overparameterized models such as deep neural networks.
no code implementations • 26 Mar 2024 • Mohammad Shahab Sepehri, Zalan Fabian, Mahdi Soltanolkotabi
We propose a novel hierarchical architecture, inspired by traditional signal processing principles, that converts the input image into a collection of sequences and processes them in a multi-scale fashion.
no code implementations • 6 Oct 2023 • Sara Fridovich-Keil, Fabrizio Valdivia, Gordon Wetzstein, Benjamin Recht, Mahdi Soltanolkotabi
We show that this approach reduces metal artifacts compared to a commercial reconstruction of a human skull with metal dental crowns.
no code implementations • 5 Oct 2023 • Omar Zamzam, Haleh Akrami, Mahdi Soltanolkotabi, Richard Leahy
In this paper we propose to learn a neural network-based data representation using a loss function that projects the unlabeled data into two (positive and negative) clusters, which can then be easily identified using simple clustering techniques, effectively emulating the phenomenon observed in low-dimensional settings.
1 code implementation • 12 Sep 2023 • Zalan Fabian, Berk Tınaz, Mahdi Soltanolkotabi
Our framework, Flash-Diffusion, acts as a wrapper that can be combined with any latent diffusion-based baseline solver, imbuing it with sample-adaptivity and acceleration.
no code implementations • 25 Jul 2023 • Yue Niu, Zalan Fabian, Sunwoo Lee, Mahdi Soltanolkotabi, Salman Avestimehr
Quasi-Newton methods still face significant challenges in training large-scale neural networks due to the additional compute cost of Hessian-related computations and instability issues in stochastic training.
no code implementations • 13 Jul 2023 • Liam Collins, Hamed Hassani, Mahdi Soltanolkotabi, Aryan Mokhtari, Sanjay Shakkottai
An increasingly popular machine learning paradigm is to pretrain a neural network (NN) on many tasks offline, then adapt it to downstream tasks, often by re-training only the last linear layer of the network.
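A minimal PyTorch sketch of this last-layer adaptation ("linear probing"): freeze a pretrained feature extractor and re-train only the final linear layer on downstream data. The backbone, dimensions, and data below are illustrative placeholders, not the paper's setup.

```python
# Linear probing: freeze the pretrained representation, retrain only the head.
import torch
import torch.nn as nn

# Stand-in for a network pretrained on many tasks.
backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU())
for p in backbone.parameters():
    p.requires_grad = False  # freeze the pretrained representation

head = nn.Linear(64, 5)      # only this last linear layer is re-trained

opt = torch.optim.SGD(head.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(128, 32)           # downstream inputs (toy data)
y = torch.randint(0, 5, (128,))    # downstream labels

for _ in range(100):
    opt.zero_grad()
    with torch.no_grad():
        feats = backbone(x)        # frozen features
    loss = loss_fn(head(feats), y)
    loss.backward()                # gradients flow only into the head
    opt.step()
```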
no code implementations • 2 Jul 2023 • Sara Babakniya, Zalan Fabian, Chaoyang He, Mahdi Soltanolkotabi, Salman Avestimehr
Deep learning models are prone to forgetting information learned in the past when trained on new data.
no code implementations • 6 Jun 2023 • Samet Oymak, Ankit Singh Rawat, Mahdi Soltanolkotabi, Christos Thrampoulidis
Despite its success in LLMs, there is limited theoretical understanding of the power of prompt-tuning and the role of the attention mechanism in prompting.
1 code implementation • 25 Mar 2023 • Zalan Fabian, Berk Tınaz, Mahdi Soltanolkotabi
In this work, we propose a novel framework for inverse problem solving: namely, we assume that the observation comes from a stochastic degradation process that gradually degrades and adds noise to the original clean image.
no code implementations • 24 Mar 2023 • Mahdi Soltanolkotabi, Dominik Stöger, Changzhi Xie
We show that in this setting, factorized gradient descent enjoys two implicit properties: (1) coupling of the trajectory of gradient descent where the factors are coupled in various ways throughout the gradient update trajectory and (2) an algorithmic regularization property where the iterates show a propensity towards low-rank models despite the overparameterized nature of the factorized model.
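The algorithmic-regularization effect is easy to observe numerically. Below is a toy NumPy sketch: parameterize $X = UV^T$ with an overparameterized rank, start from a small random initialization, and run plain gradient descent; the iterates fit the target while staying close to low rank. Sizes, step size, and initialization scale are illustrative assumptions.

```python
# Factorized gradient descent on min 0.5 * ||U V^T - M||_F^2 from small init.
import numpy as np

rng = np.random.default_rng(0)
n, r_true, r_fit = 50, 2, 50          # overparameterized: r_fit >> r_true
M = rng.standard_normal((n, r_true)) @ rng.standard_normal((r_true, n))

U = 1e-3 * rng.standard_normal((n, r_fit))   # small random initialization
V = 1e-3 * rng.standard_normal((n, r_fit))
lr = 0.2 / np.linalg.norm(M, 2)              # step size scaled to the target

for _ in range(5000):
    R = U @ V.T - M                          # residual of the factorized model
    U, V = U - lr * R @ V, V - lr * R.T @ U  # coupled factor updates

# Despite the overparameterization, the product is (nearly) rank r_true.
print("fit error:", np.linalg.norm(U @ V.T - M) / np.linalg.norm(M))
print("top singular values:", np.linalg.svd(U @ V.T, compute_uv=False)[:4])
```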
1 code implementation • CVPR 2023 • Vinu Sankar Sadasivan, Mahdi Soltanolkotabi, Soheil Feizi
Here, ERM on the clean training data achieves a clean test accuracy of 80.66%.
no code implementations • 10 Dec 2022 • Chaoyang He, Shuai Zheng, Aston Zhang, George Karypis, Trishul Chilimbi, Mahdi Soltanolkotabi, Salman Avestimehr
Mixture-of-Experts (MoE) parallelism is a recent advancement that scales up the model size with constant computational cost.
no code implementations • 18 Sep 2022 • Romain Cosentino, Sarath Shekkizhar, Mahdi Soltanolkotabi, Salman Avestimehr, Antonio Ortega
Self-supervised learning (SSL) has emerged as a desirable paradigm in computer vision due to the inability of supervised models to learn representations that can generalize in domains with limited labels.
no code implementations • 30 Jun 2022 • Alex Damian, Jason D. Lee, Mahdi Soltanolkotabi
Furthermore, in a transfer learning setup where the data distributions in the source and target domains share the same representation $U$ but have different polynomial heads, we show that a popular heuristic for transfer learning has a target sample complexity independent of $d$.
no code implementations • 13 May 2022 • Romain Cosentino, Anirvan Sengupta, Salman Avestimehr, Mahdi Soltanolkotabi, Antonio Ortega, Ted Willke, Mariano Tepper
When used for transfer learning, the projector is discarded since empirical results show that its representation generalizes more poorly than the encoder's.
no code implementations • 31 Mar 2022 • Stephen Tu, Roy Frostig, Mahdi Soltanolkotabi
Specifically, we establish that the worst-case error rate of this problem is $\Theta(n/(mT))$ whenever $m \gtrsim n$.
2 code implementations • 15 Mar 2022 • Zalan Fabian, Berk Tınaz, Mahdi Soltanolkotabi
These models split input images into non-overlapping patches, embed the patches into lower-dimensional tokens and utilize a self-attention mechanism that does not suffer from the aforementioned weaknesses of convolutional architectures.
Ranked #1 on MRI Reconstruction on fastMRI Knee 8x (using extra training data)
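A minimal PyTorch sketch of the patch-to-token pipeline described in this entry: split the image into non-overlapping patches, embed each patch as a token, and apply self-attention over the token sequence. The patch size, dimensions, and single attention layer are illustrative placeholders, not the paper's architecture.

```python
# Patchify -> token embedding -> self-attention (illustrative dimensions).
import torch
import torch.nn as nn

B, C, H, W, P, D = 2, 1, 64, 64, 8, 128   # batch, channels, image, patch, dim
img = torch.randn(B, C, H, W)

# Non-overlapping patchify: a strided convolution with kernel = stride = P.
patch_embed = nn.Conv2d(C, D, kernel_size=P, stride=P)
tokens = patch_embed(img)                    # (B, D, H/P, W/P)
tokens = tokens.flatten(2).transpose(1, 2)   # (B, num_patches, D)

# Self-attention over the token sequence (no convolution involved).
attn = nn.MultiheadAttention(embed_dim=D, num_heads=4, batch_first=True)
out, _ = attn(tokens, tokens, tokens)
print(out.shape)  # torch.Size([2, 64, 128]): one updated token per patch
```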
1 code implementation • 22 Nov 2021 • Chaoyang He, Alay Dilipbhai Shah, Zhenheng Tang, Di Fan, Adarshan Naiynar Sivashunmugam, Keerti Bhogaraju, Mita Shimpi, Li Shen, Xiaowen Chu, Mahdi Soltanolkotabi, Salman Avestimehr
To bridge the gap and facilitate the development of FL for computer vision tasks, in this work, we propose a federated learning library and benchmarking framework, named FedCV, to evaluate FL on the three most representative computer vision tasks: image classification, image segmentation, and object detection.
no code implementations • 6 Oct 2021 • Chaoyang He, Zhengyu Yang, Erum Mushtaq, Sunwoo Lee, Mahdi Soltanolkotabi, Salman Avestimehr
In this paper we propose self-supervised federated learning (SSFL), a unified self-supervised and personalized federated learning framework, and a series of algorithms under this framework which work towards addressing these challenges.
no code implementations • 29 Sep 2021 • Mohammadreza Mousavi Kalan, Salman Avestimehr, Mahdi Soltanolkotabi
Transfer learning is gaining traction as a promising technique to alleviate this barrier by utilizing the data of a related but different \emph{source} task to compensate for the lack of data in a \emph{target} task where labeled training data is scarce.
no code implementations • 29 Sep 2021 • Yue Niu, Zalan Fabian, Sunwoo Lee, Mahdi Soltanolkotabi, Salman Avestimehr
SLIM-QN addresses two key barriers in existing second-order methods for large-scale DNNs: 1) the high computational cost of obtaining the Hessian matrix and its inverse in every iteration (e.g., KFAC); 2) convergence instability due to stochastic training (e.g., L-BFGS).
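SLIM-QN's own update is not reproduced here; as a generic reference point, the sketch below shows the classical L-BFGS two-loop recursion that quasi-Newton methods of this kind build on, applied to a toy quadratic. All names and the NumPy setting are illustrative assumptions.

```python
# Classical L-BFGS two-loop recursion: approximate H^{-1} @ grad from
# curvature pairs s_k = x_{k+1}-x_k, y_k = g_{k+1}-g_k, with no Hessian.
import numpy as np

def lbfgs_direction(grad, s_hist, y_hist):
    q = grad.copy()
    alphas = []
    for s, y in zip(reversed(s_hist), reversed(y_hist)):
        a = (s @ q) / (y @ s)
        q = q - a * y
        alphas.append(a)
    # Initial Hessian scaling from the most recent curvature pair.
    gamma = (s_hist[-1] @ y_hist[-1]) / (y_hist[-1] @ y_hist[-1])
    r = gamma * q
    for (s, y), a in zip(zip(s_hist, y_hist), reversed(alphas)):
        b = (y @ r) / (y @ s)
        r = r + s * (a - b)
    return r  # descent direction: x_next = x - step * r

# Demo on a quadratic f(x) = 0.5 * x' D x with D = diag(1..5).
D = np.diag(np.arange(1.0, 6.0))
x, s_hist, y_hist = np.ones(5), [], []
for _ in range(10):
    g = D @ x
    d = lbfgs_direction(g, s_hist, y_hist) if s_hist else 0.1 * g
    x_new = x - d
    s_hist.append(x_new - x); y_hist.append(D @ x_new - g)
    x = x_new
print("||x|| after 10 steps:", np.linalg.norm(x))
```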
1 code implementation • 23 Sep 2021 • Yu Cheng, Ilias Diakonikolas, Rong Ge, Shivam Gupta, Daniel M. Kane, Mahdi Soltanolkotabi
We explore the connection between outlier-robust high-dimensional statistics and non-convex optimization in the presence of sparsity constraints, with a focus on the fundamental tasks of robust sparse mean estimation and robust sparse PCA.
2 code implementations • 14 Jul 2021 • Jianyu Wang, Zachary Charles, Zheng Xu, Gauri Joshi, H. Brendan McMahan, Blaise Aguera y Arcas, Maruan Al-Shedivat, Galen Andrew, Salman Avestimehr, Katharine Daly, Deepesh Data, Suhas Diggavi, Hubert Eichner, Advait Gadhikar, Zachary Garrett, Antonious M. Girgis, Filip Hanzely, Andrew Hard, Chaoyang He, Samuel Horvath, Zhouyuan Huo, Alex Ingerman, Martin Jaggi, Tara Javidi, Peter Kairouz, Satyen Kale, Sai Praneeth Karimireddy, Jakub Konecny, Sanmi Koyejo, Tian Li, Luyang Liu, Mehryar Mohri, Hang Qi, Sashank J. Reddi, Peter Richtarik, Karan Singhal, Virginia Smith, Mahdi Soltanolkotabi, Weikang Song, Ananda Theertha Suresh, Sebastian U. Stich, Ameet Talwalkar, Hongyi Wang, Blake Woodworth, Shanshan Wu, Felix X. Yu, Honglin Yuan, Manzil Zaheer, Mi Zhang, Tong Zhang, Chunxiang Zheng, Chen Zhu, Wennan Zhu
Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data, motivated by and designed for privacy protection.
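The canonical algorithm behind this setting is federated averaging (FedAvg): each client updates a copy of the global model on its own data, and the server averages the results without ever seeing the raw data. A minimal NumPy sketch on a toy least-squares problem (all sizes and step sizes are illustrative):

```python
# FedAvg sketch: local SGD on each client, server-side model averaging.
import numpy as np

rng = np.random.default_rng(0)
d, n_clients = 10, 5
w_true = rng.standard_normal(d)

# Decentralized data: each client holds its own (X, y), kept local.
clients = []
for _ in range(n_clients):
    X = rng.standard_normal((40, d))
    clients.append((X, X @ w_true + 0.1 * rng.standard_normal(40)))

w_global = np.zeros(d)
for rnd in range(50):                          # communication rounds
    local_models = []
    for X, y in clients:
        w = w_global.copy()
        for _ in range(5):                     # local update steps
            w -= 0.05 * X.T @ (X @ w - y) / len(y)
        local_models.append(w)
    w_global = np.mean(local_models, axis=0)   # server-side averaging

print("distance to w_true:", np.linalg.norm(w_global - w_true))
```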
2 code implementations • 28 Jun 2021 • Zalan Fabian, Reinhard Heckel, Mahdi Soltanolkotabi
Deep neural networks have emerged as very successful tools for image restoration and reconstruction tasks.
no code implementations • NeurIPS 2021 • Dominik Stöger, Mahdi Soltanolkotabi
Recently there has been significant theoretical progress on understanding the convergence and generalization of gradient-based methods on nonconvex losses with overparameterized models.
no code implementations • 29 Apr 2021 • Samet Oymak, Mingchen Li, Mahdi Soltanolkotabi
In this approach, it is common to use bilevel optimization where one optimizes the model weights over the training data (inner problem) and various hyperparameters such as the configuration of the architecture over the validation data (outer problem).
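A toy NumPy sketch of this bilevel structure: the inner problem fits the model weights on training data (here ridge regression, which has a closed form), and the outer problem tunes a hyperparameter, the regularization strength, on validation data. The grid search stands in for whatever outer optimizer is used in practice.

```python
# Bilevel hyperparameter selection: inner fit on train, outer search on val.
import numpy as np

rng = np.random.default_rng(0)
d = 20
w_true = rng.standard_normal(d)
X_tr = rng.standard_normal((30, d));  y_tr = X_tr @ w_true + rng.standard_normal(30)
X_va = rng.standard_normal((100, d)); y_va = X_va @ w_true + rng.standard_normal(100)

def inner(lam):
    """Inner problem: optimal weights on the training split for a given lam."""
    return np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(d), X_tr.T @ y_tr)

def outer(lam):
    """Outer problem: validation loss of the inner solution."""
    w = inner(lam)
    return np.mean((X_va @ w - y_va) ** 2)

lams = np.logspace(-3, 2, 30)
best = min(lams, key=outer)   # outer search over the hyperparameter
print("selected lambda:", best, "val loss:", outer(best))
```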
1 code implementation • Findings (NAACL) 2022 • Bill Yuchen Lin, Chaoyang He, Zihang Zeng, Hulin Wang, Yufen Huang, Christophe Dupuy, Rahul Gupta, Mahdi Soltanolkotabi, Xiang Ren, Salman Avestimehr
Increasing concerns and regulations about data privacy and sparsity necessitate the study of privacy-preserving, decentralized learning methods for natural language processing (NLP) tasks.
no code implementations • 12 Apr 2021 • Yogesh Balaji, Mohammadmahdi Sajedi, Neha Mukund Kalibhat, Mucong Ding, Dominik Stöger, Mahdi Soltanolkotabi, Soheil Feizi
We also empirically study the role of model overparameterization in GANs using several large-scale experiments on CIFAR-10 and Celeb-A datasets.
1 code implementation • 5 Feb 2021 • Chaoyang He, Shen Li, Mahdi Soltanolkotabi, Salman Avestimehr
PipeTransformer automatically adjusts the pipelining and data parallelism by identifying and freezing some layers during the training, and instead allocates resources for training of the remaining active layers.
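A minimal PyTorch sketch of the freezing idea only: mark the first k layers as frozen and rebuild the optimizer over the remaining active layers, so resources can be reallocated to them. The pipeline and data-parallel re-partitioning that PipeTransformer performs on top of this is not shown; the model and schedule are toy placeholders.

```python
# Progressive layer freezing: shrink the set of trainable parameters over time.
import torch
import torch.nn as nn

layers = nn.ModuleList([nn.Linear(64, 64) for _ in range(8)])  # toy "transformer"

def freeze_prefix(k):
    """Freeze layers [0, k) and return an optimizer over the active rest."""
    for i, layer in enumerate(layers):
        for p in layer.parameters():
            p.requires_grad = i >= k
    active = [p for p in layers.parameters() if p.requires_grad]
    return torch.optim.SGD(active, lr=0.01)

opt = freeze_prefix(0)
for epoch in range(6):
    if epoch in (2, 4):              # freezing schedule (illustrative)
        opt = freeze_prefix(epoch)   # freeze more layers as training proceeds
    x = torch.randn(16, 64)
    for layer in layers:
        x = torch.relu(layer(x))
    loss = x.pow(2).mean()           # placeholder objective
    opt.zero_grad(); loss.backward(); opt.step()
```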
no code implementations • ICLR 2021 • Yogesh Balaji, Mohammadmahdi Sajedi, Neha Mukund Kalibhat, Mucong Ding, Dominik Stöger, Mahdi Soltanolkotabi, Soheil Feizi
In this work, we present a comprehensive analysis of the importance of model over-parameterization in GANs both theoretically and empirically.
no code implementations • 1 Jan 2021 • Zalan Fabian, Reinhard Heckel, Mahdi Soltanolkotabi
Inspired by the success of Data Augmentation (DA) for classification problems, in this paper, we propose a pipeline for data augmentation for image reconstruction tasks arising in medical imaging and explore its effectiveness at reducing the required training data in a variety of settings.
no code implementations • NeurIPS 2020 • Christos Thrampoulidis, Samet Oymak, Mahdi Soltanolkotabi
Our theoretical analysis allows us to precisely characterize how the test error varies over different training algorithms, data distributions, problem dimensions as well as number of classes, inter/intra class correlations and class priors.
no code implementations • 21 Oct 2020 • Adel Javanmard, Mahdi Soltanolkotabi
Despite the wide empirical success of modern machine learning algorithms and models in a multitude of applications, they are known to be highly susceptible to seemingly small indiscernible perturbations to the input data known as \emph{adversarial attacks}.
2 code implementations • NeurIPS 2020 • Seyed Mohammadreza Mousavi Kalan, Zalan Fabian, A. Salman Avestimehr, Mahdi Soltanolkotabi
In this approach a model trained for a source task, where plenty of labeled training data is available, is used as a starting point for training a model on a related target task with only a few labeled training examples.
no code implementations • L4DC 2020 • Hesameddin Mohammadi, Mihailo R. Jovanović, Mahdi Soltanolkotabi
Model-free reinforcement learning attempts to find an optimal control action for an unknown dynamical system by directly searching over the parameter space of controllers.
no code implementations • 26 May 2020 • Ilias Diakonikolas, Surbhi Goel, Sushrut Karmalkar, Adam R. Klivans, Mahdi Soltanolkotabi
We consider the fundamental problem of ReLU regression, where the goal is to output the best fitting ReLU with respect to square loss given access to draws from some unknown distribution.
1 code implementation • 7 May 2020 • Reinhard Heckel, Mahdi Soltanolkotabi
For signal recovery from a few measurements, however, un-trained convolutional networks have an intriguing self-regularizing property: Even though the network can perfectly fit any image, the network recovers a natural image from few measurements when trained with gradient descent until convergence.
no code implementations • ICML 2020 • Yu Cheng, Ilias Diakonikolas, Rong Ge, Mahdi Soltanolkotabi
We study the problem of high-dimensional robust mean estimation in the presence of a constant fraction of adversarial outliers.
no code implementations • 24 Feb 2020 • Adel Javanmard, Mahdi Soltanolkotabi, Hamed Hassani
Furthermore, we precisely characterize the standard/robust accuracy and the corresponding tradeoff achieved by a contemporary mini-max adversarial training approach in a high-dimensional regime where the number of data points and the parameters of the model grow in proportion to each other.
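For a linear model with $\ell_\infty$-bounded perturbations, the inner maximization in adversarial training has the closed form $\max_{\|\delta\|_\infty \le \epsilon} (\langle w, x+\delta\rangle - y)^2 = (|\langle w, x\rangle - y| + \epsilon\|w\|_1)^2$, which makes the standard/robust trade-off easy to simulate. The NumPy sketch below is a toy illustration of this minimax training, not the paper's asymptotic analysis.

```python
# Minimax adversarial training for linear regression with l_inf perturbations,
# using the closed-form worst-case loss (|w.x - y| + eps * ||w||_1)^2.
import numpy as np

rng = np.random.default_rng(0)
n, d, eps = 200, 20, 0.1
w_true = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = X @ w_true + 0.1 * rng.standard_normal(n)

w = np.zeros(d)
lr = 0.01
for _ in range(500):
    r = X @ w - y                               # residuals
    m = np.abs(r) + eps * np.sum(np.abs(w))     # worst-case residual magnitudes
    grad = (2.0 / n) * (X.T @ (m * np.sign(r)) + eps * np.sign(w) * m.sum())
    w -= lr * grad                              # (sub)gradient step on robust loss

print("standard loss:", np.mean((X @ w - y) ** 2))
print("robust loss:  ", np.mean((np.abs(X @ w - y) + eps * np.sum(np.abs(w))) ** 2))
```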
no code implementations • 26 Dec 2019 • Hesameddin Mohammadi, Armin Zare, Mahdi Soltanolkotabi, Mihailo R. Jovanović
Model-free reinforcement learning attempts to find an optimal control action for an unknown dynamical system by directly searching over the parameter space of controllers.
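A toy NumPy sketch of this direct search over controller parameters: estimate the cost of perturbed linear state-feedback gains from rollouts of an LQR instance (whose dynamics the algorithm never sees explicitly) and take a finite-difference step. This is a basic random-search scheme, illustrative rather than the paper's exact method; all constants are assumptions.

```python
# Model-free random search over the gain K for a toy LQR problem.
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.1], [0.0, 1.0]])   # hidden dynamics (used only to simulate)
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.eye(1)

def rollout_cost(K, T=50):
    """Simulated cost of the feedback law u = -K x from a fixed initial state."""
    x, c = np.array([1.0, 0.0]), 0.0
    for _ in range(T):
        u = -K @ x
        c += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    return c

K = np.zeros((1, 2))
for _ in range(200):
    U = rng.standard_normal(K.shape)      # random search direction
    g = (rollout_cost(K + 0.05 * U) - rollout_cost(K - 0.05 * U)) / 0.1 * U
    K -= 1e-3 * g                         # descend the estimated gradient
print("learned gain:", K, "cost:", rollout_cost(K))
```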
1 code implementation • ICLR 2020 • Reinhard Heckel, Mahdi Soltanolkotabi
A surprising experiment that highlights this architectural bias towards natural images is that one can remove noise and corruptions from a natural image without using any training data, by simply fitting (via gradient descent) a randomly initialized, over-parameterized convolutional generator to the corrupted image.
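A minimal PyTorch sketch of that experiment: fit a randomly initialized, over-parameterized convolutional generator to a single noisy image by gradient descent, stopping early so the output captures the image but not the noise. The decoder below is a generic upsampling architecture standing in for the paper's network; the image, noise level, and step count are toy assumptions.

```python
# Denoising without training data: fit a random conv generator to one image.
import torch
import torch.nn as nn

def decoder(channels=64, out_ch=1):
    ups = []
    for _ in range(4):  # 4x4 seed -> 64x64 output
        ups += [nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
                nn.Conv2d(channels, channels, 1), nn.ReLU(),
                nn.BatchNorm2d(channels)]
    return nn.Sequential(*ups, nn.Conv2d(channels, out_ch, 1), nn.Sigmoid())

torch.manual_seed(0)
clean = torch.zeros(1, 1, 64, 64); clean[..., 16:48, 16:48] = 1.0   # toy image
noisy = clean + 0.2 * torch.randn_like(clean)

G = decoder()
z = torch.randn(1, 64, 4, 4)           # fixed random seed tensor (the "input")
opt = torch.optim.Adam(G.parameters(), lr=1e-3)

for step in range(500):                # early stopping: don't run to overfit
    opt.zero_grad()
    loss = ((G(z) - noisy) ** 2).mean()
    loss.backward(); opt.step()

print("error vs clean image:", ((G(z) - clean) ** 2).mean().item())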
no code implementations • 25 Sep 2019 • Samet Oymak, Zalan Fabian, Mingchen Li, Mahdi Soltanolkotabi
We show that over the information space learning is fast and one can quickly train a model with zero training loss that can also generalize well.
no code implementations • 12 Jun 2019 • Samet Oymak, Zalan Fabian, Mingchen Li, Mahdi Soltanolkotabi
We show that over the information space learning is fast and one can quickly train a model with zero training loss that can also generalize well.
1 code implementation • 27 Mar 2019 • Mingchen Li, Mahdi Soltanolkotabi, Samet Oymak
In particular, we prove that: (i) In the first few iterations where the updates are still in the vicinity of the initialization gradient descent only fits to the correct labels essentially ignoring the noisy labels.
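This early-fitting behavior directly motivates early stopping. Below is a NumPy sketch under assumed toy settings (overparameterized linear model, 30% flipped labels): gradient descent on the noisy labels, with the held-out accuracy tracked so the best early iterate can be kept before the noise is memorized.

```python
# Early stopping under label noise: GD fits the clean signal first.
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 200                         # overparameterized: d > n
w_true = rng.standard_normal(d) / np.sqrt(d)
X = rng.standard_normal((n, d))
y = np.sign(X @ w_true)
flip = rng.random(n) < 0.3              # 30% noisy labels
y_noisy = np.where(flip, -y, y)

X_val = rng.standard_normal((500, d))
y_val = np.sign(X_val @ w_true)

w = np.zeros(d)
best_acc, best_step = 0.0, 0
for step in range(1, 2001):
    w -= 0.01 / n * X.T @ (X @ w - y_noisy)    # GD on the noisy square loss
    acc = np.mean(np.sign(X_val @ w) == y_val)
    if acc > best_acc:
        best_acc, best_step = acc, step        # early-stopping checkpoint
print(f"best val accuracy {best_acc:.2f} at step {best_step} (of 2000)")
```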
no code implementations • 12 Feb 2019 • Samet Oymak, Mahdi Soltanolkotabi
However, in practice much more moderate levels of overparameterization seem to be sufficient, and in many cases overparameterized models seem to perfectly interpolate the training data as soon as the number of parameters exceeds the size of the training data by a constant factor.
no code implementations • 19 Jan 2019 • Seyed Mohammadreza Mousavi Kalan, Mahdi Soltanolkotabi, A. Salman Avestimehr
Perhaps unexpectedly, we show that QSGD maintains the fast convergence of SGD to a globally optimal model while significantly reducing the communication cost.
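The communication saving comes from transmitting quantized gradients. A NumPy sketch of the standard unbiased stochastic quantizer used by QSGD-style schemes (levels, seed, and the verification loop are illustrative):

```python
# Unbiased stochastic quantization: E[quantize(g)] = g, with s levels per sign.
import numpy as np

rng = np.random.default_rng(0)

def quantize(g, s=4):
    norm = np.linalg.norm(g)
    if norm == 0:
        return g
    level = np.abs(g) / norm * s             # position in [0, s]
    lower = np.floor(level)
    prob = level - lower                      # round up with this probability
    q = lower + (rng.random(g.shape) < prob)  # stochastic rounding
    return np.sign(g) * norm * q / s          # unbiased reconstruction

g = rng.standard_normal(10)
# Averaging many quantized copies recovers g (unbiasedness check).
print(np.mean([quantize(g) for _ in range(5000)], axis=0) - g)
```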
no code implementations • 25 Dec 2018 • Samet Oymak, Mahdi Soltanolkotabi
In this paper we demonstrate that when the loss has certain properties over a minimally small neighborhood of the initial point, first order methods such as (stochastic) gradient descent have a few intriguing properties: (1) the iterates converge at a geometric rate to a global optimum even when the loss is nonconvex, (2) among all global optima of the loss the iterates converge to one with a near minimal distance to the initial point, (3) the iterates take a near direct route from the initial point to this global optimum.
1 code implementation • 17 Jun 2018 • Dave Van Veen, Ajil Jalal, Mahdi Soltanolkotabi, Eric Price, Sriram Vishwanath, Alexandros G. Dimakis
We propose a novel method for compressed sensing recovery using untrained deep generative models.
no code implementations • 4 Jun 2018 • Qian Yu, Songze Li, Netanel Raviv, Seyed Mohammadreza Mousavi Kalan, Mahdi Soltanolkotabi, Salman Avestimehr
We consider a scenario involving computations over a massive dataset stored in a distributed fashion across multiple workers, which is at the core of distributed learning algorithms.
no code implementations • 24 May 2018 • Songze Li, Seyed Mohammadreza Mousavi Kalan, Qian Yu, Mahdi Soltanolkotabi, A. Salman Avestimehr
In particular, PCR requires a recovery threshold that scales inversely with the amount of computation/storage available at each worker.
no code implementations • 16 May 2018 • Samet Oymak, Mahdi Soltanolkotabi
In this paper we study the problem of learning the weights of a deep convolutional neural network.
no code implementations • 31 Mar 2018 • A. Salman Avestimehr, Seyed Mohammadreza Mousavi Kalan, Mahdi Soltanolkotabi
We also analyze the convergence behavior of iterative encoded optimization algorithms, allowing us to characterize fundamental trade-offs between convergence rate, size of data set, accuracy, computational load (or data redundancy), and straggler toleration in this framework.
no code implementations • NeurIPS 2017 • Hamed Hassani, Mahdi Soltanolkotabi, Amin Karbasi
Despite the apparent lack of convexity in such functions, we prove that stochastic projected gradient methods can provide strong approximation guarantees for maximizing continuous submodular functions with convex constraints.
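A NumPy sketch of stochastic projected gradient ascent in this setting: the toy objective $f(x) = \sum_i w_i \log(1+x_i)$ is a simple monotone DR-submodular function (it happens to also be concave and separable), standing in for the general continuous submodular objectives the paper covers; the constraint set is the box $[0,1]^d$.

```python
# Stochastic projected gradient ascent over a convex constraint set (a box).
import numpy as np

rng = np.random.default_rng(0)
d = 10
w = rng.uniform(0.5, 2.0, size=d)     # positive weights => monotone objective

def stoch_grad(x):
    """Unbiased stochastic gradient of f(x) = sum_i w_i * log(1 + x_i)."""
    return w / (1.0 + x) + 0.1 * rng.standard_normal(d)

x = np.zeros(d)
for t in range(1, 501):
    g = stoch_grad(x)
    x = np.clip(x + g / np.sqrt(t), 0.0, 1.0)   # ascent step + projection

print("final point:", np.round(x, 2))           # pushed toward the upper bound
print("objective:", np.sum(w * np.log(1 + x)))
```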
no code implementations • 16 Jul 2017 • Mahdi Soltanolkotabi, Adel Javanmard, Jason D. Lee
In this paper we study the problem of learning a shallow artificial neural network that best fits a training data set.
no code implementations • NeurIPS 2017 • Mahdi Soltanolkotabi
In this paper we study the problem of learning Rectified Linear Units (ReLUs), which are functions of the form $\max(0, \langle w, x \rangle)$ with $w$ denoting the weight vector.
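A NumPy sketch of learning a single ReLU by gradient descent on the square loss over realizable data. The Gaussian data model, sizes, and step size are illustrative assumptions, not the paper's exact setting.

```python
# Fit f(x) = max(0, <w, x>) by (sub)gradient descent on the square loss.
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 20
w_star = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = np.maximum(0.0, X @ w_star)         # realizable ReLU labels

w = 0.1 * rng.standard_normal(d)        # random initialization
lr = 0.5
for _ in range(300):
    z = X @ w
    resid = np.maximum(0.0, z) - y
    grad = X.T @ (resid * (z > 0)) / n  # (sub)gradient of the square loss
    w -= lr * grad

print("relative error:", np.linalg.norm(w - w_star) / np.linalg.norm(w_star))
```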
no code implementations • 20 Feb 2017 • Mahdi Soltanolkotabi
We focus on the under-determined setting where the number of measurements is significantly smaller than the dimension of the signal ($m \ll n$).
no code implementations • 23 Oct 2016 • Samet Oymak, Mahdi Soltanolkotabi
In this paper we study the problem of recovering a structured but unknown parameter $\boldsymbol{\theta}^*$ from $n$ nonlinear observations of the form $y_i = f(\langle \mathbf{x}_i, \boldsymbol{\theta}^* \rangle)$ for $i = 1, 2, \ldots, n$.
no code implementations • 10 Nov 2015 • Li-Hao Yeh, Jonathan Dong, Jingshan Zhong, Lei Tian, Michael Chen, Gongguo Tang, Mahdi Soltanolkotabi, Laura Waller
Both noise (e.g., Poisson noise) and model mis-match errors are shown to scale with intensity.
no code implementations • 16 Jul 2015 • Samet Oymak, Benjamin Recht, Mahdi Soltanolkotabi
We sharply characterize the convergence rate associated with a wide variety of random measurement ensembles in terms of the number of measurements and structural complexity of the signal with respect to the chosen penalty function.
no code implementations • 11 Jun 2015 • Samet Oymak, Benjamin Recht, Mahdi Soltanolkotabi
In this paper we show that for the purposes of dimensionality reduction certain class of structured random matrices behave similarly to random Gaussian matrices.
no code implementations • 23 Dec 2014 • Ehsan Elhamifar, Mahdi Soltanolkotabi, Shankar Sastry
High-dimensional data often lie in low-dimensional subspaces corresponding to different classes they belong to.
no code implementations • 11 Jan 2013 • Mahdi Soltanolkotabi, Ehsan Elhamifar, Emmanuel J. Candès
Subspace clustering refers to the task of finding a multi-subspace representation that best fits a collection of points taken from a high-dimensional space.