Search Results for author: Mert Pilanci

Found 74 papers, 16 papers with code

Faster Convergence of Stochastic Accelerated Gradient Descent under Interpolation

no code implementations3 Apr 2024 Aaron Mishkin, Mert Pilanci, Mark Schmidt

This improvement is comparable to a square root of the condition number in the worst case and addresses the criticism that guarantees for stochastic acceleration could be worse than those for SGD.

A Library of Mirrors: Deep Neural Nets in Low Dimensions are Convex Lasso Models with Reflection Features

no code implementations2 Mar 2024 Emi Zeger, Yifei Wang, Aaron Mishkin, Tolga Ergen, Emmanuel Candès, Mert Pilanci

We prove that training neural networks on 1-D data is equivalent to solving a convex Lasso problem with a fixed, explicitly defined dictionary matrix of features.

Adaptive Inference: Theoretical Limits and Unexplored Opportunities

no code implementations6 Feb 2024 Soheil Hor, Ying Qian, Mert Pilanci, Amin Arbabian

This paper introduces the first theoretical framework for quantifying the efficiency and performance gain opportunity size of adaptive inference algorithms.

Convex Relaxations of ReLU Neural Networks Approximate Global Optima in Polynomial Time

no code implementations6 Feb 2024 Sungyoon Kim, Mert Pilanci

In this paper, we study the optimality gap between two-layer ReLU networks regularized with weight decay and their convex relaxations.

Riemannian Preconditioned LoRA for Fine-Tuning Foundation Models

1 code implementation4 Feb 2024 Fangzhao Zhang, Mert Pilanci

In this work we study the enhancement of the Low Rank Adaptation (LoRA) fine-tuning procedure by introducing a Riemannian preconditioner in its optimization step.
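
For illustration, here is a minimal numpy sketch of a LoRA-style update with a scaled-gradient (Riemannian) preconditioner, assuming small r x r preconditioners of the (A Aᵀ + εI)⁻¹ and (BᵀB + εI)⁻¹ form; all names, shapes, and hyperparameters are hypothetical, and the paper's actual preconditioner and implementation may differ.

```python
import numpy as np

# Hypothetical shapes: frozen weight W0 (d_out x d_in), LoRA factors B (d_out x r), A (r x d_in).
d_out, d_in, r, lr, eps = 64, 32, 4, 1e-2, 1e-6
rng = np.random.default_rng(0)
W0 = rng.normal(size=(d_out, d_in))
B = np.zeros((d_out, r))                    # standard LoRA init: B = 0
A = rng.normal(scale=0.01, size=(r, d_in))

def loss_and_grads(W0, B, A, X, Y):
    """Squared loss on a toy regression batch; returns gradients w.r.t. B and A."""
    resid = X @ (W0 + B @ A).T - Y          # (n x d_out)
    gW = resid.T @ X / len(X)               # gradient w.r.t. the low-rank update B @ A
    return 0.5 * np.sum(resid ** 2) / len(X), gW @ A.T, B.T @ gW

X = rng.normal(size=(128, d_in))
Y = rng.normal(size=(128, d_out))
for _ in range(100):
    _, gB, gA = loss_and_grads(W0, B, A, X, Y)
    # Scaled-gradient (Riemannian) preconditioning with small r x r matrices.
    B -= lr * gB @ np.linalg.inv(A @ A.T + eps * np.eye(r))
    A -= lr * np.linalg.inv(B.T @ B + eps * np.eye(r)) @ gA
print("final loss:", loss_and_grads(W0, B, A, X, Y)[0])
```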

The Convex Landscape of Neural Networks: Characterizing Global Optima and Stationary Points via Lasso Models

1 code implementation19 Dec 2023 Tolga Ergen, Mert Pilanci

We also show that all stationary points of the nonconvex training objective can be characterized as the global optimum of a subsampled convex program.

Volumetric Reconstruction Resolves Off-Resonance Artifacts in Static and Dynamic PROPELLER MRI

1 code implementation22 Nov 2023 Annesha Ghosh, Gordon Wetzstein, Mert Pilanci, Sara Fridovich-Keil

Off-resonance artifacts in magnetic resonance imaging (MRI) are visual distortions that occur when the actual resonant frequencies of spins within the imaging volume differ from the expected frequencies used to encode spatial information.

MRI Reconstruction

Polynomial-Time Solutions for ReLU Network Training: A Complexity Classification via Max-Cut and Zonotopes

no code implementations18 Nov 2023 Yifei Wang, Mert Pilanci

Using this convex formulation, we prove that the hardness of approximation of ReLU networks not only mirrors the complexity of the Max-Cut problem but also, in certain special cases, exactly corresponds to it.

Matrix Compression via Randomized Low Rank and Low Precision Factorization

1 code implementation NeurIPS 2023 Rajarshi Saha, Varun Srivastava, Mert Pilanci

We propose an algorithm that exploits this structure to obtain a low rank decomposition of any matrix $\mathbf{A}$ as $\mathbf{A} \approx \mathbf{L}\mathbf{R}$, where $\mathbf{L}$ and $\mathbf{R}$ are the low rank factors.

Image Compression Quantization
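
As a rough illustration of the recipe the snippet describes, the following numpy sketch forms low-rank factors with a randomized rangefinder and then stores them in low precision via a simple uniform quantizer; the paper's actual algorithm and quantization scheme may differ.

```python
import numpy as np

def randomized_low_rank(A, rank, oversample=10, rng=None):
    """Randomized rangefinder: A ~= L @ R with L an orthonormal basis and R = L^T A."""
    rng = rng or np.random.default_rng(0)
    Omega = rng.normal(size=(A.shape[1], rank + oversample))
    Q, _ = np.linalg.qr(A @ Omega)          # orthonormal basis for an approximate range of A
    L = Q[:, :rank]
    return L, L.T @ A

def quantize(M, bits=8):
    """Uniform scalar quantization of a factor to `bits` bits (illustrative scheme)."""
    scale = (np.abs(M).max() + 1e-12) / (2 ** (bits - 1) - 1)
    return np.round(M / scale).astype(np.int8), scale

rng = np.random.default_rng(1)
A = rng.normal(size=(500, 30)) @ rng.normal(size=(30, 200))   # approximately low-rank matrix
L, R = randomized_low_rank(A, rank=30)
Lq, sL = quantize(L)
Rq, sR = quantize(R)
A_hat = (Lq * sL) @ (Rq * sR)               # dequantize and multiply the factors back
print("relative error:", np.linalg.norm(A - A_hat) / np.linalg.norm(A))
```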

From Complexity to Clarity: Analytical Expressions of Deep Neural Network Weights via Clifford's Geometric Algebra and Convexity

no code implementations28 Sep 2023 Mert Pilanci

In this paper, we introduce a novel analysis of neural networks based on geometric (Clifford) algebra and convex optimization.

Randomized Polar Codes for Anytime Distributed Machine Learning

no code implementations1 Sep 2023 Burak Bartan, Mert Pilanci

We present a novel distributed computing framework that is robust to slow compute nodes, and is capable of both approximate and exact computation of linear operations.

Cloud Computing Distributed Computing

Iterative Sketching for Secure Coded Regression

no code implementations8 Aug 2023 Neophytos Charalambides, Hessam Mahdavifar, Mert Pilanci, Alfred O. Hero III

Linear regression is a fundamental and primitive problem in supervised machine learning, with applications ranging from epidemiology to finance.

Distributed Computing Epidemiology +1

Gradient Coding through Iterative Block Leverage Score Sampling

no code implementations6 Aug 2023 Neophytos Charalambides, Mert Pilanci, Alfred Hero

This is then used to derive an approximate coded computing approach for first-order methods, known as gradient coding, to accelerate linear regression in the presence of failures in distributed computational networks, i.e., stragglers.

regression

Optimal Sets and Solution Paths of ReLU Networks

1 code implementation31 May 2023 Aaron Mishkin, Mert Pilanci

We show that the global optima of the convex parameterization are given by a polyhedral set and then extend this characterization to the optimal set of the non-convex training objective.

Globally Optimal Training of Neural Networks with Threshold Activation Functions

no code implementations6 Mar 2023 Tolga Ergen, Halil Ibrahim Gulluk, Jonathan Lacotte, Mert Pilanci

We first show that regularized deep threshold network training problems can be equivalently formulated as a standard convex optimization problem, which parallels the LASSO method, provided that the last hidden layer width exceeds a certain threshold.

Complex Clipping for Improved Generalization in Machine Learning

no code implementations27 Feb 2023 Les Atlas, Nicholas Rasmussen, Felix Schwock, Mert Pilanci

For many machine learning applications, a common input representation is a spectrogram.

Overparameterized ReLU Neural Networks Learn the Simplest Models: Neural Isometry and Exact Recovery

1 code implementation30 Sep 2022 Yifei Wang, Yixuan Hua, Emmanuel Candès, Mert Pilanci

For randomly generated data, we show the existence of a phase transition in recovering planted neural network models, which is easy to describe: whenever the ratio between the number of samples and the dimension exceeds a numerical threshold, the recovery succeeds with high probability; otherwise, it fails with high probability.

Optimal Neural Network Approximation of Wasserstein Gradient Direction via Convex Optimization

1 code implementation26 May 2022 Yifei Wang, Peng Chen, Mert Pilanci, Wuchen Li

We study the variational problem in the family of two-layer networks with squared-ReLU activations, towards which we derive a semi-definite programming (SDP) relaxation.

Bayesian Inference

Unraveling Attention via Convex Duality: Analysis and Interpretations of Vision Transformers

no code implementations17 May 2022 Arda Sahiner, Tolga Ergen, Batu Ozturkler, John Pauly, Morteza Mardani, Mert Pilanci

Vision transformers using self-attention or its proposed alternatives have demonstrated promising results in many image-related tasks.

Inductive Bias

Scale-Equivariant Unrolled Neural Networks for Data-Efficient Accelerated MRI Reconstruction

1 code implementation21 Apr 2022 Beliz Gunel, Arda Sahiner, Arjun D. Desai, Akshay S. Chaudhari, Shreyas Vasanawala, Mert Pilanci, John Pauly

Unrolled neural networks have enabled state-of-the-art reconstruction performance and fast inference times for the accelerated magnetic resonance imaging (MRI) reconstruction task.

MRI Reconstruction

Approximate Function Evaluation via Multi-Armed Bandits

no code implementations18 Mar 2022 Tavor Z. Baharav, Gary Cheng, Mert Pilanci, David Tse

We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate, and with probability at least $1-\delta$ returns an $\epsilon$ accurate estimate of $f(\boldsymbol{\mu})$.

Multi-Armed Bandits

Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration and Lower Bounds

no code implementations18 Mar 2022 Burak Bartan, Mert Pilanci

Furthermore, we develop unbiased parameter averaging methods for randomized second order optimization for regularized problems that employ sketching of the Hessian.

Cloud Computing Distributed Optimization

Minimax Optimal Quantization of Linear Models: Information-Theoretic Limits and Efficient Algorithms

no code implementations23 Feb 2022 Rajarshi Saha, Mert Pilanci, Andrea J. Goldsmith

We derive an information-theoretic lower bound for the minimax risk under this setting and propose a matching upper bound, achieved via randomized embedding-based algorithms, that is tight up to constant factors.

Quantization

Fast Convex Optimization for Two-Layer ReLU Networks: Equivalent Model Classes and Cone Decompositions

1 code implementation2 Feb 2022 Aaron Mishkin, Arda Sahiner, Mert Pilanci

We develop fast algorithms and robust software for convex optimization of two-layer neural networks with ReLU activation functions.

Image Classification

Using a Novel COVID-19 Calculator to Measure Positive U.S. Socio-Economic Impact of a COVID-19 Pre-Screening Solution (AI/ML)

1 code implementation21 Jan 2022 Richard Swartzbaugh, Amil Khanzada, Praveen Govindan, Mert Pilanci, Ayomide Owoyemi, Les Atlas, Hugo Estrada, Richard Nall, Michael Lotito, Rich Falcone, Jennifer Ranjani J

The COVID-19 pandemic has been a scourge upon humanity, claiming the lives of more than 5.1 million people worldwide; the global economy contracted by 3.5% in 2020.

Parallel Deep Neural Networks Have Zero Duality Gap

no code implementations13 Oct 2021 Yifei Wang, Tolga Ergen, Mert Pilanci

Recent work has proven that strong duality holds (i.e., zero duality gap) for regularized finite-width two-layer ReLU networks and consequently provided an equivalent convex training problem.

The Convex Geometry of Backpropagation: Neural Network Gradient Flows Converge to Extreme Points of the Dual Convex Program

no code implementations ICLR 2022 Yifei Wang, Mert Pilanci

We then show that the limit points of non-convex subgradient flows can be identified via primal-dual correspondence in this convex optimization problem.

Global Optimality Beyond Two Layers: Training Deep ReLU Networks via Convex Programs

no code implementations11 Oct 2021 Tolga Ergen, Mert Pilanci

We first show that the training of multiple three-layer ReLU sub-networks with weight decay regularization can be equivalently cast as a convex optimization problem in a higher dimensional space, where sparsity is enforced via a group $\ell_1$-norm regularization.

feature selection

The Hidden Convex Optimization Landscape of Regularized Two-Layer ReLU Networks: an Exact Characterization of Optimal Solutions

no code implementations ICLR 2022 Yifei Wang, Jonathan Lacotte, Mert Pilanci

As additional consequences of our convex perspective, (i) we establish that Clarke stationary points found by stochastic gradient descent correspond to the global optimum of a subsampled convex problem; (ii) we provide a polynomial-time algorithm for checking if a neural network is a global minimum of the training loss; (iii) we provide an explicit construction of a continuous path between any neural network and the global minimum of its sublevel set; and (iv) we characterize the minimal size of the hidden layer so that the neural network optimization landscape has no spurious valleys.

Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update

1 code implementation NeurIPS 2021 Michał Dereziński, Jonathan Lacotte, Mert Pilanci, Michael W. Mahoney

In second-order optimization, a potential bottleneck can be computing the Hessian matrix of the optimized function at every iteration.

Hidden Convexity of Wasserstein GANs: Interpretable Generative Models with Closed-Form Solutions

1 code implementation ICLR 2022 Arda Sahiner, Tolga Ergen, Batu Ozturkler, Burak Bartan, John Pauly, Morteza Mardani, Mert Pilanci

In this work, we analyze the training of Wasserstein GANs with two-layer neural network discriminators through the lens of convex duality, and for a variety of generators expose the conditions under which Wasserstein GANs can be solved exactly with convex optimization approaches, or can be represented as convex-concave games.

Image Generation

Adaptive Newton Sketch: Linear-time Optimization with Quadratic Convergence and Effective Hessian Dimensionality

no code implementations15 May 2021 Jonathan Lacotte, Yifei Wang, Mert Pilanci

Our first contribution is to show that, at each iteration, the embedding dimension (or sketch size) can be as small as the effective dimension of the Hessian matrix.

Training Quantized Neural Networks to Global Optimality via Semidefinite Programming

no code implementations4 May 2021 Burak Bartan, Mert Pilanci

Neural networks (NNs) have been extremely successful across many tasks in machine learning.

Quantization

Fast Convex Quadratic Optimization Solvers with Adaptive Sketching-based Preconditioners

1 code implementation29 Apr 2021 Jonathan Lacotte, Mert Pilanci

We propose an adaptive mechanism to control the sketch size according to the progress made in each step of the iterative solver.

Efficient Randomized Subspace Embeddings for Distributed Optimization under a Communication Budget

1 code implementation13 Mar 2021 Rajarshi Saha, Mert Pilanci, Andrea J. Goldsmith

As a consequence, quantizing these embeddings followed by an inverse transform to the original space yields a source coding method with optimal covering efficiency while utilizing just $R$ bits per dimension.

Distributed Optimization Quantization
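
A minimal illustration of the rotate, quantize, inverse-transform pipeline the snippet describes, using a random orthonormal rotation and a uniform R-bit quantizer; both choices are illustrative, and the paper's embedding and coding scheme may differ.

```python
import numpy as np

def rotate_quantize_decode(x, R_bits=4, rng=None):
    """Randomly rotate x, quantize each coordinate to R_bits bits, then invert the rotation."""
    rng = rng or np.random.default_rng(0)
    d = x.size
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)))   # random orthonormal rotation
    y = Q @ x                                      # rotated coordinates are well spread out
    levels = 2 ** R_bits - 1
    lo, hi = y.min(), y.max()                      # scalar range sent as side information
    y_q = np.round((y - lo) / (hi - lo) * levels) / levels * (hi - lo) + lo
    return Q.T @ y_q                               # decode back in the original space

rng = np.random.default_rng(2)
x = rng.normal(size=256)
x_hat = rotate_quantize_decode(x, R_bits=4, rng=rng)
print("relative quantization error:", np.linalg.norm(x - x_hat) / np.linalg.norm(x))
```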

Neural Spectrahedra and Semidefinite Lifts: Global Convex Optimization of Polynomial Activation Neural Networks in Fully Polynomial-Time

no code implementations7 Jan 2021 Burak Bartan, Mert Pilanci

In this paper, we develop exact convex optimization formulations for two-layer neural networks with second degree polynomial activations based on semidefinite programming.

Adaptive and Oblivious Randomized Subspace Methods for High-Dimensional Optimization: Sharp Analysis and Lower Bounds

no code implementations13 Dec 2020 Jonathan Lacotte, Mert Pilanci

We propose novel randomized optimization methods for high-dimensional convex problems based on restrictions of variables to random subspaces.

Convex Regularization Behind Neural Reconstruction

no code implementations ICLR 2021 Arda Sahiner, Morteza Mardani, Batu Ozturkler, Mert Pilanci, John Pauly

Neural networks have shown tremendous potential for reconstructing high-resolution images in inverse problems.

Denoising

Optimal Iterative Sketching Methods with the Subsampled Randomized Hadamard Transform

no code implementations NeurIPS 2020 Jonathan Lacotte, Sifan Liu, Edgar Dobriban, Mert Pilanci

These show that the convergence rates for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon Gaussian random projections.

Dimensionality Reduction
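
For reference, a subsampled randomized Hadamard transform (SRHT) sketch of a tall matrix is typically formed by random sign flips, a Hadamard transform, and uniform row subsampling. The sketch below is a plain illustration (explicit Hadamard matrix, number of rows assumed to be a power of two), not the paper's code.

```python
import numpy as np
from scipy.linalg import hadamard

def srht_sketch(A, m, rng=None):
    """Subsampled randomized Hadamard transform: S A with S = sqrt(n/m) * P H D.
    Assumes n (rows of A) is a power of two; uses an explicit Hadamard matrix for clarity
    (a fast Walsh-Hadamard transform would be used in practice)."""
    rng = rng or np.random.default_rng(0)
    n = A.shape[0]
    D = rng.choice([-1.0, 1.0], size=n)            # random sign flips
    H = hadamard(n) / np.sqrt(n)                   # orthonormal Hadamard matrix
    rows = rng.choice(n, size=m, replace=False)    # uniform row subsampling
    return np.sqrt(n / m) * (H @ (D[:, None] * A))[rows]

rng = np.random.default_rng(3)
A = rng.normal(size=(1024, 20))
SA = srht_sketch(A, m=128, rng=rng)
# The sketch approximately preserves the Gram matrix: (SA)^T (SA) ~= A^T A.
print(np.linalg.norm(SA.T @ SA - A.T @ A) / np.linalg.norm(A.T @ A))
```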

Linear Predictive Coding for Acute Stress Prediction from Computer Mouse Movements

no code implementations26 Oct 2020 Lawrence H. Kim, Rahul Goel, Jia Liang, Mert Pilanci, Pablo E. Paredes

This work demonstrates that the damping frequency and damping ratio from LPC are significantly correlated with those from an MSD model, thus confirming the validity of using LPC to infer muscle stiffness and damping.

Binary Classification

Debiasing Distributed Second Order Optimization with Surrogate Sketching and Scaled Regularization

no code implementations NeurIPS 2020 Michał Dereziński, Burak Bartan, Mert Pilanci, Michael W. Mahoney

In distributed second order optimization, a standard strategy is to average many local estimates, each of which is based on a small sketch or batch of the data.

Point Processes Second-order methods

Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time

no code implementations ICLR 2021 Tolga Ergen, Mert Pilanci

We study training of Convolutional Neural Networks (CNNs) with ReLU activations and introduce exact convex optimization formulations with a polynomial complexity with respect to the number of data samples, the number of neurons, and data dimension.

Lower Bounds and a Near-Optimal Shrinkage Estimator for Least Squares using Random Projections

no code implementations15 Jun 2020 Srivatsan Sridhar, Mert Pilanci, Ayfer Özgür

An upper bound on the expected error of this estimator is derived, which is smaller than the error of the classical Gaussian sketch solution for any given data.

The Hidden Convex Optimization Landscape of Two-Layer ReLU Neural Networks: an Exact Characterization of the Optimal Solutions

no code implementations10 Jun 2020 Yifei Wang, Jonathan Lacotte, Mert Pilanci

As additional consequences of our convex perspective, (i) we establish that Clarke stationary points found by stochastic gradient descent correspond to the global optimum of a subsampled convex problem; (ii) we provide a polynomial-time algorithm for checking if a neural network is a global minimum of the training loss; (iii) we provide an explicit construction of a continuous path between any neural network and the global minimum of its sublevel set; and (iv) we characterize the minimal size of the hidden layer so that the neural network optimization landscape has no spurious valleys.

Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization

no code implementations NeurIPS 2020 Jonathan Lacotte, Mert Pilanci

Our method starts with an initial embedding dimension equal to 1 and, over iterations, increases the embedding dimension up to at most the effective dimension.

Global Multiclass Classification and Dataset Construction via Heterogeneous Local Experts

no code implementations21 May 2020 Surin Ahn, Ayfer Ozgur, Mert Pilanci

In the domains of dataset construction and crowdsourcing, a notable challenge is to aggregate labels from a heterogeneous set of labelers, each of whom is potentially an expert in some subset of tasks (and less reliable in others).

Classification Federated Learning +1

Separating the Effects of Batch Normalization on CNN Training Speed and Stability Using Classical Adaptive Filter Theory

no code implementations25 Feb 2020 Elaina Chai, Mert Pilanci, Boris Murmann

Batch Normalization (BatchNorm) is commonly used in Convolutional Neural Networks (CNNs) to improve training speed and stability.

Convex Geometry and Duality of Over-parameterized Neural Networks

no code implementations25 Feb 2020 Tolga Ergen, Mert Pilanci

Our analysis also shows that optimal network parameters can be characterized as interpretable closed-form formulas in some practically relevant special cases.

Neural Networks are Convex Regularizers: Exact Polynomial-time Convex Optimization Formulations for Two-layer Networks

no code implementations ICML 2020 Mert Pilanci, Tolga Ergen

We develop exact representations of training two-layer neural networks with rectified linear units (ReLUs) in terms of a single convex program whose number of variables is polynomial in the number of training samples and the number of hidden neurons.
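
As a rough illustration of this family of convex reformulations, the cvxpy sketch below solves a group-Lasso-style convex program over a random subsample of ReLU activation patterns; the exact equivalence in the paper enumerates all activation patterns, so this is only an approximation, and all names and sizes are hypothetical.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, d, P, beta = 40, 3, 20, 1e-3
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

# Sample candidate ReLU activation patterns D_i = diag(1[X g_i >= 0]).
G = rng.normal(size=(d, P))
masks = (X @ G >= 0).astype(float)                 # n x P matrix of 0/1 patterns

V = cp.Variable((d, P))                            # neurons contributing positively
W = cp.Variable((d, P))                            # neurons contributing negatively
pred = cp.sum(cp.multiply(masks, X @ (V - W)), axis=1)
reg = cp.sum(cp.norm(V, axis=0)) + cp.sum(cp.norm(W, axis=0))   # group-Lasso penalty
constraints = []
for i in range(P):
    Di = np.diag(masks[:, i])
    constraints += [(2 * Di - np.eye(n)) @ X @ V[:, i] >= 0,
                    (2 * Di - np.eye(n)) @ X @ W[:, i] >= 0]

prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(pred - y) + beta * reg), constraints)
prob.solve()
print("convex objective value:", prob.value)
```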

Revealing the Structure of Deep Neural Networks via Convex Duality

no code implementations22 Feb 2020 Tolga Ergen, Mert Pilanci

We show that a set of optimal hidden layer weights for a norm regularized DNN training problem can be explicitly found as the extreme points of a convex set.

Optimal Randomized First-Order Methods for Least-Squares Problems

no code implementations ICML 2020 Jonathan Lacotte, Mert Pilanci

Then, we propose a new algorithm by optimizing the computational complexity over the choice of the sketching dimension.

Distributed Averaging Methods for Randomized Second Order Optimization

no code implementations16 Feb 2020 Burak Bartan, Mert Pilanci

We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a significant bottleneck.

Distributed Optimization

Global Convergence of Frank Wolfe on One Hidden Layer Networks

no code implementations6 Feb 2020 Alexandre d'Aspremont, Mert Pilanci

The classical Frank Wolfe algorithm then converges with rate $O(1/T)$ where $T$ is both the number of neurons and the number of calls to the oracle.
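
For reference, the classical Frank-Wolfe iteration referred to here looks as follows; the toy instance below uses a generic probability-simplex oracle rather than the paper's neuron-selection oracle.

```python
import numpy as np

def frank_wolfe(grad, lmo, x0, T=200):
    """Classical Frank-Wolfe: linearize, call the linear minimization oracle, take a convex step."""
    x = x0.copy()
    for t in range(T):
        s = lmo(grad(x))                   # argmin_{s in C} <grad f(x), s>
        gamma = 2.0 / (t + 2.0)            # standard step size; gives O(1/T) suboptimality
        x = (1 - gamma) * x + gamma * s
    return x

# Toy instance: minimize ||A x - b||^2 over the probability simplex.
rng = np.random.default_rng(4)
A, b = rng.normal(size=(50, 10)), rng.normal(size=50)
grad = lambda x: 2 * A.T @ (A @ x - b)

def simplex_lmo(g):
    """Linear minimization over the simplex: pick the vertex with the most negative gradient."""
    s = np.zeros_like(g)
    s[np.argmin(g)] = 1.0
    return s

x_fw = frank_wolfe(grad, simplex_lmo, np.ones(10) / 10)
print("objective:", np.sum((A @ x_fw - b) ** 2))
```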

Optimal Iterative Sketching with the Subsampled Randomized Hadamard Transform

no code implementations3 Feb 2020 Jonathan Lacotte, Sifan Liu, Edgar Dobriban, Mert Pilanci

These show that the convergence rates for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon Gaussian random projections.

Dimensionality Reduction

Regularized Momentum Iterative Hessian Sketch for Large Scale Linear System of Equations

no code implementations7 Dec 2019 Ibrahim Kurban Ozaslan, Mert Pilanci, Orhan Arikan

In this article, Momentum Iterative Hessian Sketch (M-IHS) techniques, a group of solvers for large scale linear Least Squares (LS) problems, are proposed and analyzed in detail.

Optimization and Control Computational Complexity

Distributed Black-Box Optimization via Error Correcting Codes

no code implementations13 Jul 2019 Burak Bartan, Mert Pilanci

We introduce a novel distributed derivative-free optimization framework that is resilient to stragglers.

High-Dimensional Optimization in Adaptive Random Subspaces

no code implementations NeurIPS 2019 Jonathan Lacotte, Mert Pilanci, Marco Pavone

We propose a new randomized optimization method for high-dimensional problems which can be seen as a generalization of coordinate descent to random subspaces.

Vocal Bursts Intensity Prediction

Straggler Resilient Serverless Computing Based on Polar Codes

no code implementations21 Jan 2019 Burak Bartan, Mert Pilanci

We propose a serverless computing mechanism for distributed computation based on polar codes.

Convex Relaxations of Convolutional Neural Nets

no code implementations31 Dec 2018 Burak Bartan, Mert Pilanci

We propose convex relaxations for convolutional neural nets with one hidden layer where the output weights are fixed.

Newton Sketch: A Linear-time Optimization Algorithm with Linear-Quadratic Convergence

no code implementations9 May 2015 Mert Pilanci, Martin J. Wainwright

We also describe extensions of our methods to programs involving convex constraints that are equipped with self-concordant barriers.
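
A minimal illustration of a Newton-sketch step, here for unregularized logistic regression with a Gaussian sketch of the Hessian square root; the unit step, the sketch choice, and the absence of constraints are simplifications relative to the paper.

```python
import numpy as np

def newton_sketch_logreg(X, y, m=100, iters=20, rng=None):
    """Newton sketch for unregularized logistic regression: sketch the Hessian square root
    W^{1/2} X with a Gaussian matrix S, then solve the small sketched Newton system."""
    rng = rng or np.random.default_rng(0)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        g = X.T @ (p - y) / n                         # exact gradient
        sqrtW = np.sqrt(p * (1 - p) / n)[:, None]     # Hessian square-root weights
        S = rng.normal(size=(m, n)) / np.sqrt(m)      # Gaussian sketch
        B = S @ (sqrtW * X)                           # sketched Hessian square root (m x d)
        w -= np.linalg.solve(B.T @ B + 1e-8 * np.eye(d), g)   # unit-step sketched Newton step
    return w

rng = np.random.default_rng(5)
X = rng.normal(size=(2000, 10))
w_true = rng.normal(size=10)
y = (rng.random(2000) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)
print("estimate:", np.round(newton_sketch_logreg(X, y)[:3], 2), "truth:", np.round(w_true[:3], 2))
```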

Randomized sketches for kernels: Fast and optimal non-parametric regression

no code implementations25 Jan 2015 Yun Yang, Mert Pilanci, Martin J. Wainwright

Kernel ridge regression (KRR) is a standard method for performing non-parametric regression over reproducing kernel Hilbert spaces.

regression
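
One standard sketched-KRR variant restricts the coefficient vector to the row space of a random sketch and solves the resulting small system; the numpy sketch below illustrates that idea with an RBF kernel and a Gaussian sketch, both of which are illustrative choices rather than the paper's exact setup.

```python
import numpy as np

def sketched_krr(X, y, lam=1e-2, m=50, gamma=1.0, rng=None):
    """Sketched kernel ridge regression: restrict alpha = S^T alpha_tilde and solve the
    m x m system (S K K S^T + n*lam * S K S^T) alpha_tilde = S K y."""
    rng = rng or np.random.default_rng(0)
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))   # RBF kernel matrix
    S = rng.normal(size=(m, n)) / np.sqrt(m)                         # Gaussian sketch
    SK = S @ K
    alpha_t = np.linalg.solve(SK @ K @ S.T + n * lam * SK @ S.T, SK @ y)
    return K, S.T @ alpha_t                                          # full-length coefficients

rng = np.random.default_rng(6)
X = rng.uniform(-1, 1, size=(300, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=300)
K, alpha = sketched_krr(X, y)
print("train RMSE:", np.sqrt(np.mean((K @ alpha - y) ** 2)))
```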

Iterative Hessian sketch: Fast and accurate solution approximation for constrained least-squares

no code implementations3 Nov 2014 Mert Pilanci, Martin J. Wainwright

We study randomized sketching methods for approximately solving least-squares problem with a general convex constraint.
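
A minimal unconstrained variant of the iterative Hessian sketch for least squares: each iteration draws a fresh sketch of A, combines the sketched Hessian with the exact gradient, and refines the iterate; constrained problems add the constraint to the per-iteration subproblem.

```python
import numpy as np

def iterative_hessian_sketch(A, b, m=200, iters=10, rng=None):
    """Iterative Hessian sketch for min ||A x - b||^2 (unconstrained variant): each step
    combines a freshly sketched Hessian with the exact gradient to refine the iterate."""
    rng = rng or np.random.default_rng(0)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(iters):
        S = rng.normal(size=(m, n)) / np.sqrt(m)     # fresh Gaussian sketch each iteration
        SA = S @ A
        g = A.T @ (b - A @ x)                        # exact (unsketched) gradient direction
        x = x + np.linalg.solve(SA.T @ SA, g)        # sketched-Hessian correction
    return x

rng = np.random.default_rng(7)
A = rng.normal(size=(5000, 50))
x_true = rng.normal(size=50)
b = A @ x_true + 0.01 * rng.normal(size=5000)
x_ihs = iterative_hessian_sketch(A, b)
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
print("distance to exact LS solution:", np.linalg.norm(x_ihs - x_ls))
```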

Randomized Sketches of Convex Programs with Sharp Guarantees

no code implementations29 Apr 2014 Mert Pilanci, Martin J. Wainwright

We analyze random projection (RP)-based approximations of convex programs, in which the original optimization problem is approximated by the solution of a lower-dimensional problem.

Dimensionality Reduction

Recovery of Sparse Probability Measures via Convex Programming

no code implementations NeurIPS 2012 Mert Pilanci, Laurent E. Ghaoui, Venkat Chandrasekaran

We propose a direct relaxation of the minimum cardinality problem and show that it can be efficiently solved using convex programming.

Clustering
