Search Results for author: Sivaraman Balakrishnan

Found 56 papers, 10 papers with code

Double Cross-fit Doubly Robust Estimators: Beyond Series Regression

1 code implementation 22 Mar 2024 Alec McClean, Sivaraman Balakrishnan, Edward H. Kennedy, Larry Wasserman

Then, assuming the nuisance functions are Hölder smooth, but without assuming knowledge of the true smoothness level or the covariate density, we establish that DCDR estimators with several linear smoothers are semiparametric efficient under minimal conditions and achieve fast convergence rates in the non-$\sqrt{n}$ regime.

Causal Inference regression

Semi-Supervised U-statistics

no code implementations 29 Feb 2024 Ilmun Kim, Larry Wasserman, Sivaraman Balakrishnan, Matey Neykov

Semi-supervised datasets are ubiquitous across diverse domains where obtaining fully labeled data is costly or time-consuming.

Complementary Benefits of Contrastive Learning and Self-Training Under Distribution Shift

no code implementations NeurIPS 2023 Saurabh Garg, Amrith Setlur, Zachary Chase Lipton, Sivaraman Balakrishnan, Virginia Smith, Aditi Raghunathan

Self-training and contrastive learning have emerged as leading techniques for incorporating unlabeled data, both under distribution shift (unsupervised domain adaptation) and when it is absent (semi-supervised learning).

Contrastive Learning Unsupervised Domain Adaptation

The Fundamental Limits of Structure-Agnostic Functional Estimation

no code implementations 6 May 2023 Sivaraman Balakrishnan, Edward H. Kennedy, Larry Wasserman

These first-order methods are, however, provably suboptimal in a minimax sense for functional estimation when the nuisance functions live in Hölder-type function spaces.

Causal Inference

RLSbench: Domain Adaptation Under Relaxed Label Shift

1 code implementation 6 Feb 2023 Saurabh Garg, Nick Erickson, James Sharpnack, Alex Smola, Sivaraman Balakrishnan, Zachary C. Lipton

Despite the emergence of principled methods for domain adaptation under label shift, their sensitivity to shifts in class conditional distributions is precariously underexplored.

Domain Adaptation

Domain Adaptation under Missingness Shift

1 code implementation 3 Nov 2022 Helen Zhou, Sivaraman Balakrishnan, Zachary C. Lipton

Rates of missing data often depend on record-keeping policies and thus may change across times and locations, even when the underlying features are comparatively stable.

Domain Adaptation

Domain Adaptation under Open Set Label Shift

1 code implementation 26 Jul 2022 Saurabh Garg, Sivaraman Balakrishnan, Zachary C. Lipton

We introduce the problem of domain adaptation under Open Set Label Shift (OSLS) where the label distribution can change arbitrarily and a new class may arrive during deployment, but the class-conditional distributions p(x|y) are domain-invariant.

Domain Adaptation

Leveraging Unlabeled Data to Predict Out-of-Distribution Performance

1 code implementation ICLR 2022 Saurabh Garg, Sivaraman Balakrishnan, Zachary C. Lipton, Behnam Neyshabur, Hanie Sedghi

Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions that may cause performance drops.

Minimax Optimal Regression over Sobolev Spaces via Laplacian Eigenmaps on Neighborhood Graphs

no code implementations 14 Nov 2021 Alden Green, Sivaraman Balakrishnan, Ryan J. Tibshirani

We also show that PCR-LE is manifold adaptive: that is, we consider the situation where the design is supported on a manifold of small intrinsic dimension $m$, and give upper bounds establishing that PCR-LE achieves the faster minimax estimation ($n^{-2s/(2s + m)}$) and testing ($n^{-4s/(4s + m)}$) rates of convergence.

regression

Mixture Proportion Estimation and PU Learning: A Modern Approach

2 code implementations NeurIPS 2021 Saurabh Garg, Yifan Wu, Alex Smola, Sivaraman Balakrishnan, Zachary C. Lipton

Formally, this task is broken down into two subtasks: (i) Mixture Proportion Estimation (MPE) -- determining the fraction of positive examples in the unlabeled data; and (ii) PU-learning -- given such an estimate, learning the desired positive-versus-negative classifier.

Heavy-tailed Streaming Statistical Estimation

no code implementations 25 Aug 2021 Che-Ping Tsai, Adarsh Prasad, Sivaraman Balakrishnan, Pradeep Ravikumar

We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.

regression Stochastic Optimization

Plugin Estimation of Smooth Optimal Transport Maps

1 code implementation 26 Jul 2021 Tudor Manole, Sivaraman Balakrishnan, Jonathan Niles-Weed, Larry Wasserman

Our work also provides new bounds on the risk of corresponding plugin estimators for the quadratic Wasserstein distance, and we show how this problem relates to that of estimating optimal transport maps using stability arguments for smooth and strongly convex Brenier potentials.

Minimax Optimal Regression over Sobolev Spaces via Laplacian Regularization on Neighborhood Graphs

no code implementations 3 Jun 2021 Alden Green, Sivaraman Balakrishnan, Ryan J. Tibshirani

In this paper we study the statistical properties of Laplacian smoothing, a graph-based approach to nonparametric regression.

regression

Mixture Proportion Estimation and PU Learning: A Modern Approach

1 code implementation NeurIPS 2021 Saurabh Garg, Yifan Wu, Alex Smola, Sivaraman Balakrishnan, Zachary Chase Lipton

Formally, this task is broken down into two subtasks: (i) Mixture Proportion Estimation (MPE)---determining the fraction of positive examples in the unlabeled data; and (ii) PU-learning---given such an estimate, learning the desired positive-versus-negative classifier.

RATT: Leveraging Unlabeled Data to Guarantee Generalization

1 code implementation 1 May 2021 Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary C. Lipton

To assess generalization, machine learning scientists typically either (i) bound the generalization gap and then (after training) plug in the empirical risk to obtain a bound on the true risk; or (ii) validate empirically on holdout data.

Generalization Bounds Holdout Set +1

On Proximal Policy Optimization's Heavy-tailed Gradients

no code implementations 20 Feb 2021 Saurabh Garg, Joshua Zhanson, Emilio Parisotto, Adarsh Prasad, J. Zico Kolter, Zachary C. Lipton, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Pradeep Ravikumar

In this paper, we present a detailed empirical study to characterize the heavy-tailed nature of the gradients of the PPO surrogate reward function.

Continuous Control

Efficient Estimators for Heavy-Tailed Machine Learning

no code implementations 1 Jan 2021 Vishwak Srinivasan, Adarsh Prasad, Sivaraman Balakrishnan, Pradeep Kumar Ravikumar

A dramatic improvement in data collection technologies has aided in procuring massive amounts of unstructured and heterogeneous datasets.

BIG-bench Machine Learning

On Learning Ising Models under Huber's Contamination Model

no code implementations NeurIPS 2020 Adarsh Prasad, Vishwak Srinivasan, Sivaraman Balakrishnan, Pradeep Ravikumar

We study the problem of learning Ising models in a setting where some of the samples from the underlying distribution can be arbitrarily corrupted.

Two-Sample Testing on Ranked Preference Data and the Role of Modeling Assumptions

no code implementations 21 Jun 2020 Charvi Rastogi, Sivaraman Balakrishnan, Nihar B. Shah, Aarti Singh

We also provide testing algorithms and associated sample complexity bounds for the problem of two-sample testing with partial (or total) ranking data. Furthermore, we empirically evaluate our results via extensive simulations as well as two real-world datasets consisting of pairwise comparisons.

Two-sample testing

A Unified View of Label Shift Estimation

no code implementations NeurIPS 2020 Saurabh Garg, Yifan Wu, Sivaraman Balakrishnan, Zachary C. Lipton

Our contributions include (i) consistency conditions for MLLS, which include calibration of the classifier and a confusion matrix invertibility condition that BBSE also requires; (ii) a unified framework, casting BBSE as roughly equivalent to MLLS for a particular choice of calibration method; and (iii) a decomposition of MLLS's finite-sample error into terms reflecting miscalibration and estimation error.

Universal Inference

no code implementations 24 Dec 2019 Larry Wasserman, Aaditya Ramdas, Sivaraman Balakrishnan

Constructing tests and confidence sets for such models is notoriously difficult.

valid

Minimax Confidence Intervals for the Sliced Wasserstein Distance

2 code implementations 17 Sep 2019 Tudor Manole, Sivaraman Balakrishnan, Larry Wasserman

To motivate the choice of these classes, we also study minimax rates of estimating a distribution under the Sliced Wasserstein distance.

Uncertainty Quantification

Path Length Bounds for Gradient Descent and Flow

no code implementations 2 Aug 2019 Chirag Gupta, Sivaraman Balakrishnan, Aaditya Ramdas

We derive bounds on the path length $\zeta$ of gradient descent (GD) and gradient flow (GF) curves for various classes of smooth convex and nonconvex functions.
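
As a rough illustration of the quantity in this abstract, the sketch below measures the path length $\zeta$ of a gradient descent run as the sum of distances between consecutive iterates. This discrete definition, the function names, the quadratic test function, and the step size are all assumptions made for the example; they are not taken from the paper.

```python
import math

def gd_path_length(grad, x0, step, iters):
    """Run plain gradient descent; return (final point, total path length).

    Path length here means the sum of Euclidean distances between
    consecutive iterates (an assumed discrete definition).
    """
    x = list(x0)
    length = 0.0
    for _ in range(iters):
        g = grad(x)
        x_next = [xi - step * gi for xi, gi in zip(x, g)]
        length += math.dist(x, x_next)
        x = x_next
    return x, length

# Hypothetical test function f(x) = 0.5 * (x1^2 + 10 * x2^2),
# whose gradient is (x1, 10 * x2).
def quad_grad(x):
    return [x[0], 10.0 * x[1]]

start = [1.0, 1.0]
x_final, zeta = gd_path_length(quad_grad, start, step=0.05, iters=500)
```

Because the second coordinate shrinks much faster than the first, the iterates follow a curved route, so `zeta` exceeds the straight-line distance from `start` to `x_final`.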

A Unified Approach to Robust Mean Estimation

no code implementations 1 Jul 2019 Adarsh Prasad, Sivaraman Balakrishnan, Pradeep Ravikumar

Building on this connection, we provide a simple variant of recent computationally-efficient algorithms for mean estimation in Huber's model, which, given our connection, entails that the same efficient sample-pruning based estimator is simultaneously robust to heavy-tailed noise and Huber contamination.

How Many Samples are Needed to Estimate a Convolutional Neural Network?

no code implementations NeurIPS 2018 Simon S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan R. Salakhutdinov, Aarti Singh

We show that for an $m$-dimensional convolutional filter with linear activation acting on a $d$-dimensional input, the sample complexity of achieving population prediction error of $\epsilon$ is $\widetilde{O}(m/\epsilon^2)$, whereas the sample complexity for its FNN counterpart is lower bounded by $\Omega(d/\epsilon^2)$ samples.

LEMMA

Nonparametric Regression with Comparisons: Escaping the Curse of Dimensionality with Ordinal Information

no code implementations ICML 2018 Yichong Xu, Hariank Muthakana, Sivaraman Balakrishnan, Aarti Singh, Artur Dubrawski

Finally, we present experiments that show the efficacy of RR and investigate its robustness to various sources of noise and model-misspecification.

regression

Regression with Comparisons: Escaping the Curse of Dimensionality with Ordinal Information

no code implementations ICML 2018 Yichong Xu, Sivaraman Balakrishnan, Aarti Singh, Artur Dubrawski

In supervised learning, we typically leverage a fully labeled dataset to design methods for function estimation or prediction.

regression

Robust Nonparametric Regression under Huber's $ε$-contamination Model

no code implementations 26 May 2018 Simon S. Du, Yining Wang, Sivaraman Balakrishnan, Pradeep Ravikumar, Aarti Singh

We first show that a simple local binning median step can effectively remove the adversary noise and this median estimator is minimax optimal up to absolute constants over the Hölder function class with smoothness parameters smaller than or equal to 1.

regression
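
A minimal sketch of what a local binning median step could look like for the entry above, assuming a one-dimensional design on [0, 1), equal-width bins, and a piecewise-constant fit. The function names, the bin count, the target function f(x) = x, and the contamination value are hypothetical choices for illustration, not the authors' procedure or tuning.

```python
import random
import statistics

def binned_median_fit(xs, ys, num_bins):
    """Return the median response in each equal-width bin of [0, 1).

    Empty bins are reported as None.
    """
    bins = [[] for _ in range(num_bins)]
    for x, y in zip(xs, ys):
        idx = min(int(x * num_bins), num_bins - 1)
        bins[idx].append(y)
    return [statistics.median(b) if b else None for b in bins]

def binned_median_predict(medians, x):
    """Predict with the median of the bin that contains x."""
    idx = min(int(x * len(medians)), len(medians) - 1)
    return medians[idx]

# Simulated data: clean responses are x + Gaussian noise; an eps fraction
# of responses is replaced by a large outlier (assumed contamination).
rng = random.Random(0)
n, eps = 2000, 0.1
xs = [rng.random() for _ in range(n)]
ys = []
for x in xs:
    if rng.random() < eps:
        ys.append(1000.0)  # contaminated response
    else:
        ys.append(x + rng.gauss(0.0, 0.1))

medians = binned_median_fit(xs, ys, num_bins=10)
```

In this simulation the overall mean of `ys` is dragged far from the clean signal by the outliers, while each bin median stays close to the bin's center, which is the behavior the abstract attributes to the median step.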

How Many Samples are Needed to Estimate a Convolutional or Recurrent Neural Network?

no code implementations NeurIPS 2018 Simon S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Aarti Singh

It is widely believed that the practical success of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) owes to the fact that CNNs and RNNs use a more compact parametric representation than their Fully-Connected Neural Network (FNN) counterparts, and consequently require fewer training examples to accurately estimate their parameters.

LEMMA

Local White Matter Architecture Defines Functional Brain Dynamics

no code implementations 22 Apr 2018 Yo Joong Choe, Sivaraman Balakrishnan, Aarti Singh, Jean M. Vettel, Timothy Verstynen

If communication efficiency is fundamentally constrained by the integrity along the entire length of a white matter bundle, then variability in the functional dynamics of brain networks should be associated with variability in the local connectome.

Variable Selection

Optimization of Smooth Functions with Noisy Observations: Local Minimax Rates

no code implementations NeurIPS 2018 Yining Wang, Sivaraman Balakrishnan, Aarti Singh

In this setup, an algorithm is allowed to adaptively query the underlying function at different locations and receives noisy evaluations of function values at the queried points (i.e., the algorithm has access to zeroth-order information).

Robust Estimation via Robust Gradient Estimation

no code implementations 19 Feb 2018 Adarsh Prasad, Arun Sai Suggala, Sivaraman Balakrishnan, Pradeep Ravikumar

We provide a new computationally-efficient class of estimators for risk minimization.

regression

Hypothesis Testing for High-Dimensional Multinomials: A Selective Review

no code implementations 17 Dec 2017 Sivaraman Balakrishnan, Larry Wasserman

The statistical analysis of discrete data has been the subject of extensive statistical research dating back to the work of Pearson.

Two-sample testing Vocal Bursts Intensity Prediction

Stochastic Zeroth-order Optimization in High Dimensions

no code implementations 29 Oct 2017 Yining Wang, Simon Du, Sivaraman Balakrishnan, Aarti Singh

We consider the problem of optimizing a high-dimensional convex function using stochastic zeroth-order queries.

feature selection Vocal Bursts Intensity Prediction

Low Permutation-rank Matrices: Structural Properties and Noisy Completion

no code implementations 1 Sep 2017 Nihar B. Shah, Sivaraman Balakrishnan, Martin J. Wainwright

We consider the problem of noisy matrix completion, in which the goal is to reconstruct a structured matrix whose entries are partially observed in noise.

Matrix Completion

Hypothesis Testing For Densities and High-Dimensional Multinomials: Sharp Local Minimax Rates

no code implementations 30 Jun 2017 Sivaraman Balakrishnan, Larry Wasserman

In contrast to existing results, we show that the minimax rate and critical testing radius in these settings depend strongly, and in a precise way, on the null distribution being tested and this motivates the study of the (local) minimax rate as a function of the null distribution.

Two-sample testing

Computationally Efficient Robust Estimation of Sparse Functionals

no code implementations 24 Feb 2017 Simon S. Du, Sivaraman Balakrishnan, Aarti Singh

Many conventional statistical procedures are extremely sensitive to seemingly minor deviations from modeling assumptions.

regression

Rate Optimal Estimation and Confidence Intervals for High-dimensional Regression with Missing Covariates

no code implementations 9 Feb 2017 Yining Wang, Jialei Wang, Sivaraman Balakrishnan, Aarti Singh

We consider the problems of estimation and of constructing component-wise confidence intervals in a sparse high-dimensional linear regression model when some covariates of the design matrix are missing completely at random.

Missing Values regression

Local Maxima in the Likelihood of Gaussian Mixture Models: Structural Results and Algorithmic Consequences

no code implementations NeurIPS 2016 Chi Jin, Yuchen Zhang, Sivaraman Balakrishnan, Martin J. Wainwright, Michael Jordan

Our first main result shows that the population likelihood function has bad local maxima even in the special case of equally-weighted mixtures of well-separated and spherical Gaussians.

Open-Ended Question Answering

A Permutation-based Model for Crowd Labeling: Optimal Estimation and Robustness

no code implementations 30 Jun 2016 Nihar B. Shah, Sivaraman Balakrishnan, Martin J. Wainwright

The task of aggregating and denoising crowd-labeled data has gained increased significance with the advent of crowdsourcing platforms and massive datasets.

Denoising

Arbitrage-Free Combinatorial Market Making via Integer Programming

no code implementations 9 Jun 2016 Christian Kroer, Miroslav Dudík, Sébastien Lahaie, Sivaraman Balakrishnan

We present a new combinatorial market maker that operates arbitrage-free combinatorial prediction markets specified by integer programs.

Statistical Inference for Cluster Trees

no code implementations NeurIPS 2016 Jisu Kim, Yen-Chi Chen, Sivaraman Balakrishnan, Alessandro Rinaldo, Larry Wasserman

A cluster tree provides a highly-interpretable summary of a density function by representing the hierarchy of its high-density clusters.

Feeling the Bern: Adaptive Estimators for Bernoulli Probabilities of Pairwise Comparisons

no code implementations 22 Mar 2016 Nihar B. Shah, Sivaraman Balakrishnan, Martin J. Wainwright

Second, we show that a regularized least squares estimator can achieve a poly-logarithmic adaptivity index, thereby demonstrating a $\sqrt{n}$-gap between optimal and computationally achievable adaptivity.

Statistical and Computational Guarantees for the Baum-Welch Algorithm

no code implementations 27 Dec 2015 Fanny Yang, Sivaraman Balakrishnan, Martin J. Wainwright

By exploiting this characterization, we provide non-asymptotic finite sample guarantees on the Baum-Welch updates, guaranteeing geometric convergence to a small ball of radius on the order of the minimax rate around a global optimum.

Econometrics speech-recognition +3

Stochastically Transitive Models for Pairwise Comparisons: Statistical and Computational Issues

no code implementations 19 Oct 2015 Nihar B. Shah, Sivaraman Balakrishnan, Adityanand Guntuboyina, Martin J. Wainwright

On the other hand, unlike in the BTL and Thurstone models, computing the minimax-optimal estimator in the stochastically transitive model is non-trivial, and we explore various computationally tractable alternatives.

Estimation from Pairwise Comparisons: Sharp Minimax Bounds with Topology Dependence

no code implementations 6 May 2015 Nihar B. Shah, Sivaraman Balakrishnan, Joseph Bradley, Abhay Parekh, Kannan Ramchandran, Martin J. Wainwright

Data in the form of pairwise comparisons arises in many domains, including preference elicitation, sporting competitions, and peer grading among others.

Statistical guarantees for the EM algorithm: From population to sample-based analysis

no code implementations 9 Aug 2014 Sivaraman Balakrishnan, Martin J. Wainwright, Bin Yu

Leveraging this characterization, we then provide non-asymptotic guarantees on the EM and gradient EM algorithms when applied to a finite set of samples.

When is it Better to Compare than to Score?

no code implementations 25 Jun 2014 Nihar B. Shah, Sivaraman Balakrishnan, Joseph Bradley, Abhay Parekh, Kannan Ramchandran, Martin Wainwright

When eliciting judgements from humans for an unknown quantity, one often has the choice of making direct-scoring (cardinal) or comparative (ordinal) measurements.

Tight Lower Bounds for Homology Inference

no code implementations 29 Jul 2013 Sivaraman Balakrishnan, Alessandro Rinaldo, Aarti Singh, Larry Wasserman

In this note we use a different construction based on the direct analysis of the likelihood ratio test to show that the upper bound of Niyogi, Smale and Weinberger is in fact tight, thus establishing rate optimal asymptotic minimax bounds for the problem.

LEMMA

Cluster Trees on Manifolds

no code implementations NeurIPS 2013 Sivaraman Balakrishnan, Srivatsan Narayanan, Alessandro Rinaldo, Aarti Singh, Larry Wasserman

In this paper we investigate the problem of estimating the cluster tree for a density $f$ supported on or near a smooth $d$-dimensional manifold $M$ isometrically embedded in $\mathbb{R}^D$.

Clustering

Confidence sets for persistence diagrams

no code implementations 28 Mar 2013 Brittany Terese Fasy, Fabrizio Lecci, Alessandro Rinaldo, Larry Wasserman, Sivaraman Balakrishnan, Aarti Singh

Persistent homology is a method for probing topological properties of point clouds and functions.

Recovering Block-structured Activations Using Compressive Measurements

no code implementations 15 Sep 2012 Sivaraman Balakrishnan, Mladen Kolar, Alessandro Rinaldo, Aarti Singh

We consider the problems of detection and localization of a contiguous block of weak activation in a large matrix, from a small number of noisy, possibly adaptive, compressive (linear) measurements.

Minimax Localization of Structural Information in Large Noisy Matrices

no code implementations NeurIPS 2011 Mladen Kolar, Sivaraman Balakrishnan, Alessandro Rinaldo, Aarti Singh

We consider the problem of identifying a sparse set of relevant columns and rows in a large data matrix with highly corrupted entries.

Clustering Two-sample testing

Noise Thresholds for Spectral Clustering

no code implementations NeurIPS 2011 Sivaraman Balakrishnan, Min Xu, Akshay Krishnamurthy, Aarti Singh

Although spectral clustering has enjoyed considerable empirical success in machine learning, its theoretical properties are not yet fully developed.

Clustering
