Search Results for author: Sivaraman Balakrishnan

Found 45 papers, 1 paper with code

Heavy-tailed Streaming Statistical Estimation

no code implementations 25 Aug 2021 Che-Ping Tsai, Adarsh Prasad, Sivaraman Balakrishnan, Pradeep Ravikumar

We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.

Stochastic Optimization
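A common device in this line of work is gradient clipping. The following is an illustrative sketch only, not the paper's algorithm: the function name, the fixed clipping radius, and the $1/t$ step size are our own assumptions.

```python
import numpy as np

def clipped_streaming_mean(samples, clip_radius):
    """One-pass mean estimate for heavy-tailed data: each per-sample
    gradient (x - theta) is clipped to a fixed radius before the
    update, bounding the influence of any single extreme sample."""
    theta = np.zeros(np.shape(samples[0]), dtype=float)
    for t, x in enumerate(samples, start=1):
        g = np.asarray(x, dtype=float) - theta  # gradient of 0.5*||x - theta||^2
        norm = np.linalg.norm(g)
        if norm > clip_radius:
            g *= clip_radius / norm             # clip the update direction
        theta += g / t                          # Robbins-Monro step size 1/t
    return theta
```

Clipping bounds each update's influence, so a few extreme samples cannot drag the running estimate far from the true mean.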

Plugin Estimation of Smooth Optimal Transport Maps

no code implementations 26 Jul 2021 Tudor Manole, Sivaraman Balakrishnan, Jonathan Niles-Weed, Larry Wasserman

We analyze a number of natural estimators for the optimal transport map between two distributions and show that they are minimax optimal.

Minimax Optimal Regression over Sobolev Spaces via Laplacian Regularization on Neighborhood Graphs

no code implementations 3 Jun 2021 Alden Green, Sivaraman Balakrishnan, Ryan J. Tibshirani

In this paper we study the statistical properties of Laplacian smoothing, a graph-based approach to nonparametric regression.

RATT: Leveraging Unlabeled Data to Guarantee Generalization

no code implementations 1 May 2021 Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary C. Lipton

To assess generalization, machine learning scientists typically either (i) bound the generalization gap and then (after training) plug in the empirical risk to obtain a bound on the true risk; or (ii) validate empirically on holdout data.

Generalization Bounds

On Proximal Policy Optimization's Heavy-tailed Gradients

no code implementations 20 Feb 2021 Saurabh Garg, Joshua Zhanson, Emilio Parisotto, Adarsh Prasad, J. Zico Kolter, Zachary C. Lipton, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Pradeep Ravikumar

In this paper, we present a detailed empirical study to characterize the heavy-tailed nature of the gradients of the PPO surrogate reward function.

Continuous Control

Efficient Estimators for Heavy-Tailed Machine Learning

no code implementations 1 Jan 2021 Vishwak Srinivasan, Adarsh Prasad, Sivaraman Balakrishnan, Pradeep Kumar Ravikumar

A dramatic improvement in data collection technologies has aided in procuring massive amounts of unstructured and heterogeneous datasets.

On Learning Ising Models under Huber's Contamination Model

no code implementations NeurIPS 2020 Adarsh Prasad, Vishwak Srinivasan, Sivaraman Balakrishnan, Pradeep Ravikumar

We study the problem of learning Ising models in a setting where some of the samples from the underlying distribution can be arbitrarily corrupted.

Two-Sample Testing on Ranked Preference Data and the Role of Modeling Assumptions

no code implementations 21 Jun 2020 Charvi Rastogi, Sivaraman Balakrishnan, Nihar B. Shah, Aarti Singh

We also provide testing algorithms and associated sample complexity bounds for the problem of two-sample testing with partial (or total) ranking data. Furthermore, we empirically evaluate our results via extensive simulations as well as two real-world datasets consisting of pairwise comparisons.

Two-sample testing

A Unified View of Label Shift Estimation

no code implementations NeurIPS 2020 Saurabh Garg, Yifan Wu, Sivaraman Balakrishnan, Zachary C. Lipton

Our contributions include (i) consistency conditions for MLLS, which include calibration of the classifier and a confusion matrix invertibility condition that BBSE also requires; (ii) a unified framework, casting BBSE as roughly equivalent to MLLS for a particular choice of calibration method; and (iii) a decomposition of MLLS's finite-sample error into terms reflecting miscalibration and estimation error.
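The BBSE baseline mentioned above reduces to solving a small linear system. A minimal sketch, assuming hard-label predictions and an invertible confusion matrix (the function and variable names here are hypothetical, not the authors' code):

```python
import numpy as np

def bbse_weights(source_preds, source_labels, target_preds, num_classes):
    """Black Box Shift Estimation: estimate the target/source label
    ratios w by solving C w = mu, where C[i, j] is the source joint
    frequency of (prediction i, label j) and mu[i] is the target
    frequency of prediction i."""
    C = np.zeros((num_classes, num_classes))
    for yhat, y in zip(source_preds, source_labels):
        C[yhat, y] += 1.0 / len(source_labels)
    mu = np.bincount(target_preds, minlength=num_classes) / len(target_preds)
    return np.linalg.solve(C, mu)
```

MLLS instead fits the label-shift weights by maximum likelihood over the classifier's predicted probabilities, which is where the calibration condition in (i) enters.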

Universal Inference

no code implementations 24 Dec 2019 Larry Wasserman, Aaditya Ramdas, Sivaraman Balakrishnan

Constructing tests and confidence sets for such models is notoriously difficult.

Minimax Confidence Intervals for the Sliced Wasserstein Distance

2 code implementations 17 Sep 2019 Tudor Manole, Sivaraman Balakrishnan, Larry Wasserman

To motivate the choice of these classes, we also study minimax rates of estimating a distribution under the Sliced Wasserstein distance.

Path Length Bounds for Gradient Descent and Flow

no code implementations 2 Aug 2019 Chirag Gupta, Sivaraman Balakrishnan, Aaditya Ramdas

We derive bounds on the path length $\zeta$ of gradient descent (GD) and gradient flow (GF) curves for various classes of smooth convex and nonconvex functions.
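The quantity being bounded can be computed directly for any concrete run. A small sketch on our own example (a two-dimensional quadratic, with a step size below $1/L$ for $L$ the largest eigenvalue; this is an illustration, not the paper's analysis):

```python
import numpy as np

def gd_path_length(grad, x0, step, iters):
    """Run gradient descent and return the path length
    zeta = sum_t ||x_{t+1} - x_t|| of the iterate sequence."""
    x = np.asarray(x0, dtype=float)
    zeta = 0.0
    for _ in range(iters):
        x_new = x - step * grad(x)
        zeta += np.linalg.norm(x_new - x)
        x = x_new
    return zeta

# Convex quadratic f(x) = 0.5 * x^T A x, minimized at the origin;
# step = 0.05 is below 1/L for L = 10, the largest eigenvalue of A.
A = np.diag([1.0, 10.0])
zeta = gd_path_length(lambda x: A @ x, x0=[1.0, 1.0], step=0.05, iters=1000)
```

For this instance the path length must lie between the straight-line distance $\sqrt{2}$ from the start to the optimum and the sum of the per-coordinate path lengths.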

A Unified Approach to Robust Mean Estimation

no code implementations 1 Jul 2019 Adarsh Prasad, Sivaraman Balakrishnan, Pradeep Ravikumar

Building on this connection, we provide a simple variant of recent computationally-efficient algorithms for mean estimation in Huber's model; given our connection, this entails that the same efficient sample-pruning-based estimators are simultaneously robust to heavy-tailed noise and Huber contamination.

How Many Samples are Needed to Estimate a Convolutional Neural Network?

no code implementations NeurIPS 2018 Simon S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan R. Salakhutdinov, Aarti Singh

We show that for an $m$-dimensional convolutional filter with linear activation acting on a $d$-dimensional input, the sample complexity of achieving population prediction error of $\epsilon$ is $\widetilde{O}(m/\epsilon^2)$, whereas the sample complexity for its FNN counterpart is lower bounded by $\Omega(d/\epsilon^2)$ samples.

Nonparametric Regression with Comparisons: Escaping the Curse of Dimensionality with Ordinal Information

no code implementations ICML 2018 Yichong Xu, Hariank Muthakana, Sivaraman Balakrishnan, Aarti Singh, Artur Dubrawski

Finally, we present experiments that show the efficacy of RR and investigate its robustness to various sources of noise and model-misspecification.

Regression with Comparisons: Escaping the Curse of Dimensionality with Ordinal Information

no code implementations ICML 2018 Yichong Xu, Sivaraman Balakrishnan, Aarti Singh, Artur Dubrawski

In supervised learning, we typically leverage a fully labeled dataset to design methods for function estimation or prediction.

Robust Nonparametric Regression under Huber's $\epsilon$-contamination Model

no code implementations 26 May 2018 Simon S. Du, Yining Wang, Sivaraman Balakrishnan, Pradeep Ravikumar, Aarti Singh

We first show that a simple local binning median step can effectively remove the adversarial noise, and that this median estimator is minimax optimal up to absolute constants over the Hölder function class with smoothness parameter smaller than or equal to 1.
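The local binning median idea described above is easy to illustrate. A minimal sketch under our own simplifications (fixed equal-width bins on $[0, 1]$ and a piecewise-constant fit; not the authors' estimator):

```python
import numpy as np

def binned_median_regression(x, y, num_bins):
    """Piecewise-constant fit on [0, 1]: partition into equal-width
    bins and take the median of the responses in each bin, which is
    robust to a minority of arbitrarily corrupted responses."""
    edges = np.linspace(0.0, 1.0, num_bins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, num_bins - 1)
    medians = np.array([np.median(y[idx == b]) for b in range(num_bins)])
    def predict(x_new):
        b = np.clip(np.digitize(x_new, edges) - 1, 0, num_bins - 1)
        return medians[b]
    return predict
```

Because the median of each bin ignores up to half of the responses, a small fraction of arbitrarily corrupted $y$ values leaves the fit essentially unchanged.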

How Many Samples are Needed to Estimate a Convolutional or Recurrent Neural Network?

no code implementations NeurIPS 2018 Simon S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Aarti Singh

It is widely believed that the practical success of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) owes to the fact that CNNs and RNNs use a more compact parametric representation than their Fully-Connected Neural Network (FNN) counterparts, and consequently require fewer training examples to accurately estimate their parameters.

Local White Matter Architecture Defines Functional Brain Dynamics

no code implementations 22 Apr 2018 Yo Joong Choe, Sivaraman Balakrishnan, Aarti Singh, Jean M. Vettel, Timothy Verstynen

If communication efficiency is fundamentally constrained by the integrity along the entire length of a white matter bundle, then variability in the functional dynamics of brain networks should be associated with variability in the local connectome.

Variable Selection

Optimization of Smooth Functions with Noisy Observations: Local Minimax Rates

no code implementations NeurIPS 2018 Yining Wang, Sivaraman Balakrishnan, Aarti Singh

In this setup, an algorithm is allowed to adaptively query the underlying function at different locations and receives noisy evaluations of function values at the queried points (i.e., the algorithm has access to zeroth-order information).

Global Optimization

Robust Estimation via Robust Gradient Estimation

no code implementations 19 Feb 2018 Adarsh Prasad, Arun Sai Suggala, Sivaraman Balakrishnan, Pradeep Ravikumar

We provide a new computationally-efficient class of estimators for risk minimization.

Hypothesis Testing for High-Dimensional Multinomials: A Selective Review

no code implementations 17 Dec 2017 Sivaraman Balakrishnan, Larry Wasserman

The statistical analysis of discrete data has been the subject of extensive statistical research dating back to the work of Pearson.

Two-sample testing

Stochastic Zeroth-order Optimization in High Dimensions

no code implementations 29 Oct 2017 Yining Wang, Simon Du, Sivaraman Balakrishnan, Aarti Singh

We consider the problem of optimizing a high-dimensional convex function using stochastic zeroth-order queries.

Feature Selection

Low Permutation-rank Matrices: Structural Properties and Noisy Completion

no code implementations 1 Sep 2017 Nihar B. Shah, Sivaraman Balakrishnan, Martin J. Wainwright

We consider the problem of noisy matrix completion, in which the goal is to reconstruct a structured matrix whose entries are partially observed in noise.

Matrix Completion

Hypothesis Testing For Densities and High-Dimensional Multinomials: Sharp Local Minimax Rates

no code implementations 30 Jun 2017 Sivaraman Balakrishnan, Larry Wasserman

In contrast to existing results, we show that the minimax rate and critical testing radius in these settings depend strongly, and in a precise way, on the null distribution being tested and this motivates the study of the (local) minimax rate as a function of the null distribution.

Two-sample testing

Computationally Efficient Robust Estimation of Sparse Functionals

no code implementations 24 Feb 2017 Simon S. Du, Sivaraman Balakrishnan, Aarti Singh

Many conventional statistical procedures are extremely sensitive to seemingly minor deviations from modeling assumptions.

Rate Optimal Estimation and Confidence Intervals for High-dimensional Regression with Missing Covariates

no code implementations 9 Feb 2017 Yining Wang, Jialei Wang, Sivaraman Balakrishnan, Aarti Singh

We consider the problems of estimation and of constructing component-wise confidence intervals in a sparse high-dimensional linear regression model when some covariates of the design matrix are missing completely at random.

Local Maxima in the Likelihood of Gaussian Mixture Models: Structural Results and Algorithmic Consequences

no code implementations NeurIPS 2016 Chi Jin, Yuchen Zhang, Sivaraman Balakrishnan, Martin J. Wainwright, Michael Jordan

Our first main result shows that the population likelihood function has bad local maxima even in the special case of equally-weighted mixtures of well-separated and spherical Gaussians.

A Permutation-based Model for Crowd Labeling: Optimal Estimation and Robustness

no code implementations 30 Jun 2016 Nihar B. Shah, Sivaraman Balakrishnan, Martin J. Wainwright

The task of aggregating and denoising crowd-labeled data has gained increased significance with the advent of crowdsourcing platforms and massive datasets.

Denoising

Arbitrage-Free Combinatorial Market Making via Integer Programming

no code implementations 9 Jun 2016 Christian Kroer, Miroslav Dudík, Sébastien Lahaie, Sivaraman Balakrishnan

We present a new combinatorial market maker that operates arbitrage-free combinatorial prediction markets specified by integer programs.

Statistical Inference for Cluster Trees

no code implementations NeurIPS 2016 Jisu Kim, Yen-Chi Chen, Sivaraman Balakrishnan, Alessandro Rinaldo, Larry Wasserman

A cluster tree provides a highly-interpretable summary of a density function by representing the hierarchy of its high-density clusters.

Feeling the Bern: Adaptive Estimators for Bernoulli Probabilities of Pairwise Comparisons

no code implementations 22 Mar 2016 Nihar B. Shah, Sivaraman Balakrishnan, Martin J. Wainwright

Second, we show that a regularized least squares estimator can achieve a poly-logarithmic adaptivity index, thereby demonstrating a $\sqrt{n}$-gap between optimal and computationally achievable adaptivity.

Statistical and Computational Guarantees for the Baum-Welch Algorithm

no code implementations 27 Dec 2015 Fanny Yang, Sivaraman Balakrishnan, Martin J. Wainwright

By exploiting this characterization, we provide non-asymptotic finite sample guarantees on the Baum-Welch updates, guaranteeing geometric convergence to a small ball of radius on the order of the minimax rate around a global optimum.

Speech Recognition
Time Series

Stochastically Transitive Models for Pairwise Comparisons: Statistical and Computational Issues

no code implementations 19 Oct 2015 Nihar B. Shah, Sivaraman Balakrishnan, Adityanand Guntuboyina, Martin J. Wainwright

On the other hand, unlike in the BTL and Thurstone models, computing the minimax-optimal estimator in the stochastically transitive model is non-trivial, and we explore various computationally tractable alternatives.

Estimation from Pairwise Comparisons: Sharp Minimax Bounds with Topology Dependence

no code implementations 6 May 2015 Nihar B. Shah, Sivaraman Balakrishnan, Joseph Bradley, Abhay Parekh, Kannan Ramchandran, Martin J. Wainwright

Data in the form of pairwise comparisons arises in many domains, including preference elicitation, sporting competitions, and peer grading among others.

Statistical guarantees for the EM algorithm: From population to sample-based analysis

no code implementations 9 Aug 2014 Sivaraman Balakrishnan, Martin J. Wainwright, Bin Yu

Leveraging this characterization, we then provide non-asymptotic guarantees on the EM and gradient EM algorithms when applied to a finite set of samples.

When is it Better to Compare than to Score?

no code implementations 25 Jun 2014 Nihar B. Shah, Sivaraman Balakrishnan, Joseph Bradley, Abhay Parekh, Kannan Ramchandran, Martin Wainwright

When eliciting judgements from humans for an unknown quantity, one often has the choice of making direct-scoring (cardinal) or comparative (ordinal) measurements.

Tight Lower Bounds for Homology Inference

no code implementations 29 Jul 2013 Sivaraman Balakrishnan, Alessandro Rinaldo, Aarti Singh, Larry Wasserman

In this note we use a different construction based on the direct analysis of the likelihood ratio test to show that the upper bound of Niyogi, Smale and Weinberger is in fact tight, thus establishing rate optimal asymptotic minimax bounds for the problem.

Cluster Trees on Manifolds

no code implementations NeurIPS 2013 Sivaraman Balakrishnan, Srivatsan Narayanan, Alessandro Rinaldo, Aarti Singh, Larry Wasserman

In this paper we investigate the problem of estimating the cluster tree for a density $f$ supported on or near a smooth $d$-dimensional manifold $M$ isometrically embedded in $\mathbb{R}^D$.

Confidence sets for persistence diagrams

no code implementations 28 Mar 2013 Brittany Terese Fasy, Fabrizio Lecci, Alessandro Rinaldo, Larry Wasserman, Sivaraman Balakrishnan, Aarti Singh

Persistent homology is a method for probing topological properties of point clouds and functions.

Recovering Block-structured Activations Using Compressive Measurements

no code implementations 15 Sep 2012 Sivaraman Balakrishnan, Mladen Kolar, Alessandro Rinaldo, Aarti Singh

We consider the problems of detection and localization of a contiguous block of weak activation in a large matrix, from a small number of noisy, possibly adaptive, compressive (linear) measurements.

Minimax Localization of Structural Information in Large Noisy Matrices

no code implementations NeurIPS 2011 Mladen Kolar, Sivaraman Balakrishnan, Alessandro Rinaldo, Aarti Singh

We consider the problem of identifying a sparse set of relevant columns and rows in a large data matrix with highly corrupted entries.

Two-sample testing

Noise Thresholds for Spectral Clustering

no code implementations NeurIPS 2011 Sivaraman Balakrishnan, Min Xu, Akshay Krishnamurthy, Aarti Singh

Although spectral clustering has enjoyed considerable empirical success in machine learning, its theoretical properties are not yet fully developed.
