Search Results for author: Murat A. Erdogdu

Found 40 papers, 5 papers with code

Sampling from the Mean-Field Stationary Distribution

no code implementations12 Feb 2024 Yunbum Kook, Matthew S. Zhang, Sinho Chewi, Murat A. Erdogdu, Mufan Bill Li

We study the complexity of sampling from the stationary distribution of a mean-field SDE, or equivalently, the complexity of minimizing a functional over the space of probability measures which includes an interaction term.
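
A standard way to approximate the stationary distribution of a mean-field SDE in practice is to simulate a finite system of interacting particles with Euler-Maruyama and use their empirical distribution. The sketch below is a minimal illustration, assuming a quadratic confinement potential and a quadratic pairwise interaction; it is not the algorithm analyzed in the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, h, steps = 200, 2, 1e-2, 2_000    # particles, dimension, step size, iterations

    # hypothetical choices: confinement V(x) = |x|^2/2, pairwise interaction W(x-y) = |x-y|^2/2
    X = rng.standard_normal((n, d))
    for _ in range(steps):
        drift = X + (X - X.mean(axis=0))     # grad V(x_i) + (1/n) sum_j grad W(x_i - x_j)
        X = X - h * drift + np.sqrt(2 * h) * rng.standard_normal((n, d))
    # the empirical distribution of the rows of X approximates the mean-field stationary law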

Beyond Labeling Oracles: What does it mean to steal ML models?

no code implementations3 Oct 2023 Avital Shafran, Ilia Shumailov, Murat A. Erdogdu, Nicolas Papernot

We discover that prior knowledge of the attacker, i.e., access to in-distribution data, dominates other factors like the attack policy the adversary follows to choose which queries to make to the victim model API.

Model extraction

Towards a Complete Analysis of Langevin Monte Carlo: Beyond Poincaré Inequality

no code implementations7 Mar 2023 Alireza Mousavi-Hosseini, Tyler Farghly, Ye He, Krishnakumar Balasubramanian, Murat A. Erdogdu

We do so by establishing upper and lower bounds for Langevin diffusions and LMC under weak Poincaré inequalities that are satisfied by a large class of densities including polynomially-decaying heavy-tailed densities (i.e., Cauchy-type).

Mean-Square Analysis of Discretized Itô Diffusions for Heavy-tailed Sampling

no code implementations1 Mar 2023 Ye He, Tyler Farghly, Krishnakumar Balasubramanian, Murat A. Erdogdu

We analyze the complexity of sampling from a class of heavy-tailed distributions by discretizing a natural class of Itô diffusions associated with weighted Poincaré inequalities.
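
As a toy instance of such a discretization, the snippet below Euler-discretizes a one-dimensional Itô diffusion with a state-dependent diffusion coefficient whose invariant density is a Student-t with $\nu$ degrees of freedom; the drift and diffusion below are a simple worked example, not the specific class studied in the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    nu, h, steps = 4.0, 1e-3, 200_000
    x, samples = 0.0, []
    for k in range(steps):
        a = 1.0 + x ** 2 / nu                  # diffusion coefficient grows with |x|
        b = -(nu - 1.0) / nu * x               # drift chosen so the invariant law is Student-t(nu)
        x = x + h * b + np.sqrt(2.0 * a * h) * rng.standard_normal()
        if k > steps // 2:
            samples.append(x)
    samples = np.array(samples)
    print(np.mean(np.abs(samples) > 3.0))      # heavy tails: far more mass beyond 3 than a Gaussian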

Improved Discretization Analysis for Underdamped Langevin Monte Carlo

no code implementations16 Feb 2023 Matthew Zhang, Sinho Chewi, Mufan Bill Li, Krishnakumar Balasubramanian, Murat A. Erdogdu

As a byproduct, we also obtain the first KL divergence guarantees for ULMC without Hessian smoothness under strong log-concavity, which is based on a new result on the log-Sobolev constant along the underdamped Langevin diffusion.
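
For reference, underdamped Langevin Monte Carlo simulates a position-velocity pair; a plain Euler-Maruyama sketch for a standard Gaussian target follows (the paper studies sharper discretizations, so this is only meant to show the structure of the dynamics).

    import numpy as np

    rng = np.random.default_rng(0)
    d, gamma, h, steps = 10, 2.0, 1e-2, 5_000

    def grad_V(x):                              # potential of a standard Gaussian target
        return x

    x, v = np.zeros(d), np.zeros(d)
    for _ in range(steps):
        x = x + h * v
        v = v - h * (gamma * v + grad_V(x)) + np.sqrt(2.0 * gamma * h) * rng.standard_normal(d)
    # x is an approximate sample from exp(-V); better integrators reduce the discretization bias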

Neural Networks Efficiently Learn Low-Dimensional Representations with SGD

no code implementations29 Sep 2022 Alireza Mousavi-Hosseini, Sejun Park, Manuela Girotti, Ioannis Mitliagkas, Murat A. Erdogdu

We further demonstrate that SGD-trained ReLU NNs can learn a single-index target of the form $y=f(\langle\boldsymbol{u},\boldsymbol{x}\rangle) + \epsilon$ by recovering the principal direction, with a sample complexity linear in $d$ (up to log factors), where $f$ is a monotonic function with at most polynomial growth, and $\epsilon$ is the noise.
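
A minimal sketch of this single-index setup, assuming an illustrative monotone link, network width, and step size (none taken from the paper): generate $y=f(\langle u,x\rangle)+\epsilon$, run online SGD on a two-layer ReLU network, and measure how well the first-layer rows align with $u$.

    import numpy as np

    rng = np.random.default_rng(0)
    d, width, lr, n_steps = 50, 64, 0.05, 20_000
    u = rng.standard_normal(d); u /= np.linalg.norm(u)
    f = lambda z: np.tanh(z)                                  # monotone link with bounded growth

    W = rng.standard_normal((width, d)) / np.sqrt(d)          # trained first layer
    a = rng.choice([-1.0, 1.0], size=width) / np.sqrt(width)  # second layer, kept fixed here

    for _ in range(n_steps):
        x = rng.standard_normal(d)
        y = f(u @ x) + 0.1 * rng.standard_normal()
        pre = W @ x
        pred = a @ np.maximum(pre, 0.0)
        grad_W = np.outer((pred - y) * a * (pre > 0), x)      # gradient of 0.5*(pred - y)^2 w.r.t. W
        W -= lr * grad_W

    alignment = np.abs(W @ u) / np.linalg.norm(W, axis=1)
    print(alignment.max())                                    # approaches 1 as the principal direction is recovered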

$p$-DkNN: Out-of-Distribution Detection Through Statistical Testing of Deep Representations

no code implementations25 Jul 2022 Adam Dziedzic, Stephan Rabanser, Mohammad Yaghini, Armin Ale, Murat A. Erdogdu, Nicolas Papernot

We introduce $p$-DkNN, a novel inference procedure that takes a trained deep neural network and analyzes the similarity structures of its intermediate hidden representations to compute $p$-values associated with the end-to-end model prediction.

Autonomous Driving Out-of-Distribution Detection +1
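
One way to realize such per-layer statistical testing is with conformal-style p-values: score a test input at every layer by its distance to training representations, calibrate each score against a held-out set, and aggregate across layers. The sketch below is an illustrative reconstruction with assumed interfaces (per-layer representation lists, calibration scores) and a Bonferroni-style aggregation; it is not the paper's exact procedure.

    import numpy as np

    def layer_score(test_rep, train_reps, k=5):
        # nonconformity score: mean distance to the k nearest training representations
        dists = np.linalg.norm(train_reps - test_rep, axis=1)
        return np.sort(dists)[:k].mean()

    def dknn_style_pvalue(test_reps, train_reps_per_layer, calib_scores_per_layer):
        # test_reps: list of per-layer representations of one input (hypothetical interface)
        pvals = []
        for rep, train_reps, calib in zip(test_reps, train_reps_per_layer, calib_scores_per_layer):
            s = layer_score(rep, train_reps)
            pvals.append((1 + np.sum(calib >= s)) / (1 + len(calib)))  # conformal p-value
        return min(1.0, len(pvals) * min(pvals))    # Bonferroni-style aggregation across layers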

High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation

no code implementations3 May 2022 Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang

We study the first gradient descent step on the first-layer parameters $\boldsymbol{W}$ in a two-layer neural network: $f(\boldsymbol{x}) = \frac{1}{\sqrt{N}}\boldsymbol{a}^\top\sigma(\boldsymbol{W}^\top\boldsymbol{x})$, where $\boldsymbol{W}\in\mathbb{R}^{d\times N}, \boldsymbol{a}\in\mathbb{R}^{N}$ are randomly initialized, and the training objective is the empirical MSE loss: $\frac{1}{n}\sum_{i=1}^n (f(\boldsymbol{x}_i)-y_i)^2$.
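
The first gradient step on $\boldsymbol{W}$ can be written out directly from this objective; the sketch below computes it in NumPy, with an illustrative single-index teacher supplying the labels $y_i$ (the teacher, activation, and sizes are assumptions, not the paper's).

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, N, eta = 1_000, 100, 200, 1.0
    X = rng.standard_normal((n, d))
    beta = rng.standard_normal(d) / np.sqrt(d)
    y = np.tanh(X @ beta)                                 # illustrative teacher labels

    W = rng.standard_normal((d, N)) / np.sqrt(d)          # random first layer
    a = rng.standard_normal(N)                            # random second layer
    sigma, dsigma = np.tanh, lambda z: 1 - np.tanh(z) ** 2

    pre = X @ W                                           # shape (n, N)
    f = (sigma(pre) @ a) / np.sqrt(N)
    # gradient of (1/n) * sum_i (f(x_i) - y_i)^2 with respect to W
    grad_W = (2 / (n * np.sqrt(N))) * X.T @ (dsigma(pre) * ((f - y)[:, None] * a[None, :]))
    W_after_one_step = W - eta * grad_W                   # feature matrix after one gradient step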

Towards a Theory of Non-Log-Concave Sampling: First-Order Stationarity Guarantees for Langevin Monte Carlo

no code implementations10 Feb 2022 Krishnakumar Balasubramanian, Sinho Chewi, Murat A. Erdogdu, Adil Salim, Matthew Zhang

For the task of sampling from a density $\pi \propto \exp(-V)$ on $\mathbb{R}^d$, where $V$ is possibly non-convex but $L$-gradient Lipschitz, we prove that averaged Langevin Monte Carlo outputs a sample with $\varepsilon$-relative Fisher information after $O( L^2 d^2/\varepsilon^2)$ iterations.
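
For concreteness, Langevin Monte Carlo on a potential $V$ runs the Euler-Maruyama recursion below, and the averaged variant returns a uniformly random iterate instead of the last one. The potential, step size, and iteration count here are illustrative choices, not tuned to the stated rate.

    import numpy as np

    rng = np.random.default_rng(0)
    d, h, K = 5, 1e-2, 10_000

    def grad_V(x):                       # illustrative non-convex, gradient-Lipschitz potential
        return x - 2.0 * np.sin(x)

    x, iterates = np.zeros(d), []
    for _ in range(K):
        x = x - h * grad_V(x) + np.sqrt(2 * h) * rng.standard_normal(d)
        iterates.append(x.copy())
    sample = iterates[rng.integers(K)]   # averaged LMC output: an iterate chosen uniformly at random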

Heavy-tailed Sampling via Transformed Unadjusted Langevin Algorithm

no code implementations20 Jan 2022 Ye He, Krishnakumar Balasubramanian, Murat A. Erdogdu

We analyze the oracle complexity of sampling from polynomially decaying heavy-tailed target densities based on running the Unadjusted Langevin Algorithm on certain transformed versions of the target density.
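
The idea is to push the heavy-tailed target through an invertible map so that the transformed density is light-tailed, run the Unadjusted Langevin Algorithm there, and map the iterates back. The one-dimensional sketch below uses a standard Cauchy target and the map $x=\sinh(y)$ as an illustrative choice; the paper's specific transformations and guarantees are not reproduced here.

    import numpy as np

    rng = np.random.default_rng(0)
    h, steps = 1e-2, 100_000

    # target: standard Cauchy, pi(x) ∝ 1/(1+x^2); transform x = sinh(y)
    # transformed density: pi_Y(y) ∝ cosh(y)/(1+sinh(y)^2) = 1/cosh(y), so grad log pi_Y(y) = -tanh(y)
    y, xs = 0.0, []
    for _ in range(steps):
        y = y - h * np.tanh(y) + np.sqrt(2 * h) * rng.standard_normal()
        xs.append(np.sinh(y))            # map the light-tailed iterate back to the heavy-tailed space
    xs = np.array(xs[steps // 2:])
    print(np.median(np.abs(xs)))         # ≈ 1 for a standard Cauchy target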

Analysis of Langevin Monte Carlo from Poincaré to Log-Sobolev

no code implementations23 Dec 2021 Sinho Chewi, Murat A. Erdogdu, Mufan Bill Li, Ruoqi Shen, Matthew Zhang

Classically, the continuous-time Langevin diffusion converges exponentially fast to its stationary distribution $\pi$ under the sole assumption that $\pi$ satisfies a Poincaré inequality.

Convergence and Optimality of Policy Gradient Methods in Weakly Smooth Settings

no code implementations30 Oct 2021 Matthew S. Zhang, Murat A. Erdogdu, Animesh Garg

Policy gradient methods have been frequently applied to problems in control and reinforcement learning with great success, yet existing convergence analysis still relies on non-intuitive, impractical and often opaque conditions.

Policy Gradient Methods reinforcement-learning +1

On Empirical Risk Minimization with Dependent and Heavy-Tailed Data

no code implementations NeurIPS 2021 Abhishek Roy, Krishnakumar Balasubramanian, Murat A. Erdogdu

In this work, we establish risk bounds for the Empirical Risk Minimization (ERM) with both dependent and heavy-tailed data-generating processes.

Learning Theory

Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms

no code implementations NeurIPS 2021 Alexander Camuto, George Deligiannidis, Murat A. Erdogdu, Mert Gürbüzbalaban, Umut Şimşekli, Lingjiong Zhu

As our main contribution, we prove that the generalization error of a stochastic optimization algorithm can be bounded based on the 'complexity' of the fractal structure that underlies its invariant measure.

Generalization Bounds Learning Theory +1

Heavy Tails in SGD and Compressibility of Overparametrized Neural Networks

1 code implementation NeurIPS 2021 Melih Barsbey, Milad Sefidgaran, Murat A. Erdogdu, Gaël Richard, Umut Şimşekli

Neural network compression techniques have become increasingly popular as they can drastically reduce the storage and computation requirements for very large networks.

Generalization Bounds Neural Network Compression

Convergence Rates of Stochastic Gradient Descent under Infinite Noise Variance

no code implementations NeurIPS 2021 Hongjian Wang, Mert Gürbüzbalaban, Lingjiong Zhu, Umut Şimşekli, Murat A. Erdogdu

In this paper, we provide convergence guarantees for SGD under a state-dependent and heavy-tailed noise with a potentially infinite variance, for a class of strongly convex objectives.

On the Ergodicity, Bias and Asymptotic Normality of Randomized Midpoint Sampling Method

no code implementations NeurIPS 2020 Ye He, Krishnakumar Balasubramanian, Murat A. Erdogdu

The randomized midpoint method, proposed by [SL19], has emerged as an optimal discretization procedure for simulating continuous-time Langevin diffusions.

Numerical Integration
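
In simplified form, and for the overdamped diffusion rather than the underdamped one treated in [SL19], the randomized midpoint idea evaluates the gradient at a uniformly random point inside each step instead of at the left endpoint. The sketch below ignores the careful coupling of the Brownian increments that the actual method uses, so it only conveys the structure.

    import numpy as np

    rng = np.random.default_rng(0)
    d, h, steps = 5, 5e-2, 5_000

    def grad_V(x):                        # illustrative strongly convex potential
        return x

    x = np.zeros(d)
    for _ in range(steps):
        alpha = rng.uniform()             # random location inside the step
        x_mid = x - alpha * h * grad_V(x) + np.sqrt(2 * alpha * h) * rng.standard_normal(d)
        x = x - h * grad_V(x_mid) + np.sqrt(2 * h) * rng.standard_normal(d)
        # note: the true scheme shares the same Brownian path between the midpoint and the full step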

Riemannian Langevin Algorithm for Solving Semidefinite Programs

no code implementations21 Oct 2020 Mufan Bill Li, Murat A. Erdogdu

We propose a Langevin diffusion-based algorithm for non-convex optimization and sampling on a product manifold of spheres.
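
One way to picture such a scheme, under the assumption of a simple row-wise retraction in place of the paper's exact geometric update: take a Langevin-type ascent step on the Burer-Monteiro objective $\langle A,\sigma\sigma^\top\rangle$ and renormalize each row back onto its sphere.

    import numpy as np

    rng = np.random.default_rng(0)
    n, k, h, beta, steps = 30, 4, 1e-3, 50.0, 5_000
    A = rng.standard_normal((n, n)); A = (A + A.T) / 2    # illustrative SDP data matrix

    S = rng.standard_normal((n, k))                       # rows live on unit spheres
    S /= np.linalg.norm(S, axis=1, keepdims=True)
    for _ in range(steps):
        G = 2 * A @ S                                     # Euclidean gradient of <A, S S^T>
        S = S + h * beta * G + np.sqrt(2 * h) * rng.standard_normal((n, k))
        S /= np.linalg.norm(S, axis=1, keepdims=True)     # retract each row back to its sphere
    print(np.sum(A * (S @ S.T)))                          # attained objective value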

Convergence of Langevin Monte Carlo in Chi-Squared and Renyi Divergence

no code implementations22 Jul 2020 Murat A. Erdogdu, Rasa Hosseinzadeh, Matthew S. Zhang

We prove that, initialized with a Gaussian random vector that has sufficiently small variance, iterating the LMC algorithm for $\widetilde{\mathcal{O}}(\lambda^2 d\epsilon^{-1})$ steps is sufficient to reach an $\epsilon$-neighborhood of the target in both Chi-squared and Renyi divergence, where $\lambda$ is the logarithmic Sobolev constant of $\nu_*$.

Hausdorff Dimension, Heavy Tails, and Generalization in Neural Networks

1 code implementation NeurIPS 2020 Umut Şimşekli, Ozan Sener, George Deligiannidis, Murat A. Erdogdu

Despite its success in a wide range of applications, characterizing the generalization properties of stochastic gradient descent (SGD) in non-convex deep learning problems is still an important challenge.

Generalization Bounds

An Analysis of Constant Step Size SGD in the Non-convex Regime: Asymptotic Normality and Bias

no code implementations NeurIPS 2021 Lu Yu, Krishnakumar Balasubramanian, Stanislav Volgushev, Murat A. Erdogdu

Structured non-convex learning problems, for which critical points have favorable statistical properties, arise frequently in statistical machine learning.

On the Convergence of Langevin Monte Carlo: The Interplay between Tail Growth and Smoothness

no code implementations27 May 2020 Murat A. Erdogdu, Rasa Hosseinzadeh

This convergence rate, in terms of $\epsilon$ dependency, is not directly influenced by the tail growth rate $\alpha$ of the potential function as long as its growth is at least linear, and it only relies on the order of smoothness $\beta$.

Towards Characterizing the High-dimensional Bias of Kernel-based Particle Inference Algorithms

no code implementations AABI Symposium 2019 Jimmy Ba, Murat A. Erdogdu, Marzyeh Ghassemi, Taiji Suzuki, Shengyang Sun, Denny Wu, Tianzong Zhang

Particle-based inference algorithms are a promising approach for efficiently generating samples from an intractable target distribution by iteratively updating a set of particles.

LEMMA

Stochastic Runge-Kutta Accelerates Langevin Monte Carlo and Beyond

no code implementations NeurIPS 2019 Xuechen Li, Denny Wu, Lester Mackey, Murat A. Erdogdu

In this paper, we establish the convergence rate of sampling algorithms obtained by discretizing smooth Itô diffusions exhibiting fast Wasserstein-$2$ contraction, based on local deviation properties of the integration scheme.

Numerical Integration

Normal Approximation for Stochastic Gradient Descent via Non-Asymptotic Rates of Martingale CLT

no code implementations3 Apr 2019 Andreas Anastasiou, Krishnakumar Balasubramanian, Murat A. Erdogdu

A crucial intermediate step is proving a non-asymptotic martingale central limit theorem (CLT), i.e., establishing the rates of convergence of a multivariate martingale difference sequence to a normal random vector, which might be of independent interest.

valid

Global Non-convex Optimization with Discretized Diffusions

no code implementations NeurIPS 2018 Murat A. Erdogdu, Lester Mackey, Ohad Shamir

An Euler discretization of the Langevin diffusion is known to converge to the global minimizers of certain convex and non-convex optimization problems.

Convergence Rate of Block-Coordinate Maximization Burer-Monteiro Method for Solving Large SDPs

no code implementations12 Jul 2018 Murat A. Erdogdu, Asuman Ozdaglar, Pablo A. Parrilo, Nuri Denizcan Vanli

Furthermore, by incorporating the Lanczos method into the block-coordinate maximization, we propose an algorithm that is guaranteed to return a solution that provides a $1-O(1/r)$ approximation to the original SDP without any assumptions, where $r$ is the rank of the factorization.

Community Detection
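
In the Burer-Monteiro factorization $X=\sigma\sigma^\top$ with unit-norm rows, the block-coordinate maximization step for $\max\,\langle A,X\rangle$ has a closed form: each row is replaced by the normalized weighted combination of the other rows. A minimal sketch of that basic update (without the Lanczos augmentation mentioned above):

    import numpy as np

    rng = np.random.default_rng(0)
    n, r, sweeps = 50, 8, 100
    A = rng.standard_normal((n, n)); A = (A + A.T) / 2
    np.fill_diagonal(A, 0.0)                    # diagonal terms do not affect the update

    sigma = rng.standard_normal((n, r))
    sigma /= np.linalg.norm(sigma, axis=1, keepdims=True)
    for _ in range(sweeps):
        for i in range(n):
            g = A[i] @ sigma                    # weighted sum of the other rows
            nrm = np.linalg.norm(g)
            if nrm > 0:
                sigma[i] = g / nrm              # exact maximizer of the i-th block on the sphere
    print(np.sum(A * (sigma @ sigma.T)))        # SDP objective <A, sigma sigma^T>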

Robust Estimation of Neural Signals in Calcium Imaging

no code implementations NeurIPS 2017 Hakan Inan, Murat A. Erdogdu, Mark Schnitzer

We use our proposed robust loss in a matrix factorization framework to extract the neurons and their temporal activity in calcium imaging datasets.

Inference in Graphical Models via Semidefinite Programming Hierarchies

no code implementations NeurIPS 2017 Murat A. Erdogdu, Yash Deshpande, Andrea Montanari

We demonstrate that the resulting algorithm can solve problems with tens of thousands of variables within minutes, and outperforms BP and GBP on practical problems such as image denoising and Ising spin glasses.

Combinatorial Optimization Computational Efficiency +1

Scaled Least Squares Estimator for GLMs in Large-Scale Problems

no code implementations NeurIPS 2016 Murat A. Erdogdu, Lee H. Dicker, Mohsen Bayati

We study the problem of efficiently estimating the coefficients of generalized linear models (GLMs) in the large-scale setting where the number of observations $n$ is much larger than the number of predictors $p$, i.e., $n \gg p \gg 1$.
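
The observation exploited in this regime is that the GLM coefficients are approximately proportional to the ordinary least squares coefficients, so one can solve a single least-squares problem and then fit only a scalar. The sketch below estimates that scalar by a one-dimensional minimization of the logistic loss along the OLS direction; this is an illustrative instantiation under that assumption, not necessarily the paper's estimator of the proportionality constant.

    import numpy as np
    from scipy.optimize import minimize_scalar

    rng = np.random.default_rng(0)
    n, p = 20_000, 50
    X = rng.standard_normal((n, p))
    beta_true = rng.standard_normal(p) / np.sqrt(p)
    y = rng.binomial(1, 1 / (1 + np.exp(-X @ beta_true)))    # logistic-regression data

    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)          # one cheap least-squares solve

    def loss_along_ols(c):                                    # 1-D logistic loss along the OLS direction
        z = c * (X @ beta_ols)
        return np.mean(np.log1p(np.exp(-z)) + (1 - y) * z)

    c_hat = minimize_scalar(loss_along_ols, bounds=(0.0, 20.0), method="bounded").x
    beta_sls = c_hat * beta_ols                               # scaled least squares estimate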

Scalable Approximations for Generalized Linear Problems

no code implementations21 Nov 2016 Murat A. Erdogdu, Mohsen Bayati, Lee H. Dicker

Using this relation, we design an algorithm that achieves the same accuracy as the empirical risk minimizer through iterations that attain up to a cubic convergence rate, and that are cheaper than any batch optimization algorithm by at least a factor of $\mathcal{O}(p)$.

Binary Classification General Classification +2

Newton-Stein Method: A Second Order Method for GLMs via Stein's Lemma

no code implementations NeurIPS 2015 Murat A. Erdogdu

We consider the problem of efficiently computing the maximum likelihood estimator in Generalized Linear Models (GLMs) when the number of observations is much larger than the number of coefficients ($n \gg p \gg 1$).

LEMMA Second-order methods

Newton-Stein Method: An optimization method for GLMs via Stein's Lemma

no code implementations28 Nov 2015 Murat A. Erdogdu

We consider the problem of efficiently computing the maximum likelihood estimator in Generalized Linear Models (GLMs) when the number of observations is much larger than the number of coefficients ($n \gg p \gg 1$).

LEMMA Second-order methods

Convergence rates of sub-sampled Newton methods

no code implementations NeurIPS 2015 Murat A. Erdogdu, Andrea Montanari

In this regime, algorithms which utilize sub-sampling techniques are known to be effective.
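
A sub-sampled Newton step replaces the full Hessian with one computed on a random subset of the observations while keeping the full gradient. A minimal sketch for $\ell_2$-regularized logistic regression (the sizes, subsample rule, and regularizer are illustrative assumptions):

    import numpy as np

    rng = np.random.default_rng(0)
    n, p, m, lam, iters = 100_000, 50, 2_000, 1e-3, 20       # m = Hessian subsample size
    X = rng.standard_normal((n, p))
    beta_true = rng.standard_normal(p) / np.sqrt(p)
    y = rng.binomial(1, 1 / (1 + np.exp(-X @ beta_true)))

    sigmoid = lambda z: 1 / (1 + np.exp(-z))
    beta = np.zeros(p)
    for _ in range(iters):
        mu = sigmoid(X @ beta)
        grad = X.T @ (mu - y) / n + lam * beta               # full gradient
        idx = rng.choice(n, size=m, replace=False)           # Hessian from a small random subsample
        Xs, ws = X[idx], mu[idx] * (1 - mu[idx])
        H = Xs.T @ (Xs * ws[:, None]) / m + lam * np.eye(p)
        beta -= np.linalg.solve(H, grad)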

SEISMIC: A Self-Exciting Point Process Model for Predicting Tweet Popularity

1 code implementation8 Jun 2015 Qingyuan Zhao, Murat A. Erdogdu, Hera Y. He, Anand Rajaraman, Jure Leskovec

Social networking websites allow users to create and share content.

Social and Information Networks Physics and Society Applications 60G55, 62P25 H.2.8

Estimating LASSO Risk and Noise Level

no code implementations NeurIPS 2013 Mohsen Bayati, Murat A. Erdogdu, Andrea Montanari

In this context, we develop new estimators for the $\ell_2$ estimation risk $\|\hat{\theta}-\theta_0\|_2$ and the variance of the noise.

Denoising
