1 code implementation • 22 Mar 2024 • Alec McClean, Sivaraman Balakrishnan, Edward H. Kennedy, Larry Wasserman
Then, assuming the nuisance functions are Hölder smooth, but without assuming knowledge of the true smoothness level or the covariate density, we establish that DCDR estimators with several linear smoothers are semiparametric efficient under minimal conditions and achieve fast convergence rates in the non-$\sqrt{n}$ regime.
no code implementations • 29 Feb 2024 • Ilmun Kim, Larry Wasserman, Sivaraman Balakrishnan, Matey Neykov
Semi-supervised datasets are ubiquitous across diverse domains where obtaining fully labeled data is costly or time-consuming.
no code implementations • NeurIPS 2023 • Saurabh Garg, Amrith Setlur, Zachary Chase Lipton, Sivaraman Balakrishnan, Virginia Smith, Aditi Raghunathan
Self-training and contrastive learning have emerged as leading techniques for incorporating unlabeled data, both under distribution shift (unsupervised domain adaptation) and when it is absent (semi-supervised learning).
no code implementations • 6 May 2023 • Sivaraman Balakrishnan, Edward H. Kennedy, Larry Wasserman
These first-order methods are, however, provably suboptimal in a minimax sense for functional estimation when the nuisance functions live in Hölder-type function spaces.
1 code implementation • 6 Feb 2023 • Saurabh Garg, Nick Erickson, James Sharpnack, Alex Smola, Sivaraman Balakrishnan, Zachary C. Lipton
Despite the emergence of principled methods for domain adaptation under label shift, their sensitivity to shifts in class conditional distributions is precariously underexplored.
1 code implementation • 3 Nov 2022 • Helen Zhou, Sivaraman Balakrishnan, Zachary C. Lipton
Rates of missing data often depend on record-keeping policies and thus may change across times and locations, even when the underlying features are comparatively stable.
1 code implementation • 26 Jul 2022 • Saurabh Garg, Sivaraman Balakrishnan, Zachary C. Lipton
We introduce the problem of domain adaptation under Open Set Label Shift (OSLS) where the label distribution can change arbitrarily and a new class may arrive during deployment, but the class-conditional distributions p(x|y) are domain-invariant.
1 code implementation • ICLR 2022 • Saurabh Garg, Sivaraman Balakrishnan, Zachary C. Lipton, Behnam Neyshabur, Hanie Sedghi
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions that may cause performance drops.
no code implementations • 14 Nov 2021 • Alden Green, Sivaraman Balakrishnan, Ryan J. Tibshirani
We also show that PCR-LE is \emph{manifold adaptive}: that is, we consider the situation where the design is supported on a manifold of small intrinsic dimension $m$, and give upper bounds establishing that PCR-LE achieves the faster minimax estimation ($n^{-2s/(2s + m)}$) and testing ($n^{-4s/(4s + m)}$) rates of convergence.
2 code implementations • NeurIPS 2021 • Saurabh Garg, Yifan Wu, Alex Smola, Sivaraman Balakrishnan, Zachary C. Lipton
Formally, this task is broken down into two subtasks: (i) Mixture Proportion Estimation (MPE) -- determining the fraction of positive examples in the unlabeled data; and (ii) PU-learning -- given such an estimate, learning the desired positive-versus-negative classifier.
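A minimal sketch of this two-step pipeline, assuming scikit-learn: the crude Elkan-Noto-style proportion heuristic and the quantile thresholding below are illustrative stand-ins, not the estimators proposed in the paper.

```python
# Two-step PU pipeline sketch (illustrative, not the paper's method).
import numpy as np
from sklearn.linear_model import LogisticRegression

def pu_two_step(X_pos, X_unl):
    # Step 0: train a positive-vs-unlabeled classifier.
    X = np.vstack([X_pos, X_unl])
    s = np.concatenate([np.ones(len(X_pos)), np.zeros(len(X_unl))])
    clf = LogisticRegression(max_iter=1000).fit(X, s)

    # (i) Mixture Proportion Estimation: crude Elkan-Noto style estimate of
    #     alpha, the fraction of positives among the unlabeled points.
    c = clf.predict_proba(X_pos)[:, 1].mean()          # estimate of P(s=1 | y=1)
    alpha = clf.predict_proba(X_unl)[:, 1].mean() / c

    # (ii) PU-learning: threshold unlabeled scores so that a fraction alpha
    #      of the unlabeled data is declared positive.
    scores = clf.predict_proba(X_unl)[:, 1]
    threshold = np.quantile(scores, 1 - np.clip(alpha, 0, 1))
    y_hat_unl = (scores >= threshold).astype(int)
    return alpha, y_hat_unl
```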
no code implementations • 25 Aug 2021 • Che-Ping Tsai, Adarsh Prasad, Sivaraman Balakrishnan, Pradeep Ravikumar
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
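A minimal sketch of one generic approach to this streaming setting: a clipped stochastic update for the mean, which prevents any single heavy-tailed sample from moving the estimate too far. The clipping threshold and step size are illustrative choices, not the algorithm analyzed in the paper.

```python
# Clipped streaming mean estimation sketch (illustrative only).
import numpy as np

def clipped_streaming_mean(sample_stream, clip_at=5.0):
    theta = None
    for t, x in enumerate(sample_stream, start=1):
        x = np.asarray(x, dtype=float)
        if theta is None:
            theta = x.copy()
            continue
        step = x - theta                     # unclipped, step size 1/t gives the running mean
        norm = np.linalg.norm(step)
        if norm > clip_at:
            step = step * (clip_at / norm)   # clip so one outlier has bounded influence
        theta = theta + step / t
    return theta
```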
1 code implementation • 26 Jul 2021 • Tudor Manole, Sivaraman Balakrishnan, Jonathan Niles-Weed, Larry Wasserman
Our work also provides new bounds on the risk of corresponding plugin estimators for the quadratic Wasserstein distance, and we show how this problem relates to that of estimating optimal transport maps using stability arguments for smooth and strongly convex Brenier potentials.
no code implementations • 3 Jun 2021 • Alden Green, Sivaraman Balakrishnan, Ryan J. Tibshirani
In this paper we study the statistical properties of Laplacian smoothing, a graph-based approach to nonparametric regression.
1 code implementation • 1 May 2021 • Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary C. Lipton
To assess generalization, machine learning scientists typically either (i) bound the generalization gap and then (after training) plug in the empirical risk to obtain a bound on the true risk; or (ii) validate empirically on holdout data.
no code implementations • 20 Feb 2021 • Saurabh Garg, Joshua Zhanson, Emilio Parisotto, Adarsh Prasad, J. Zico Kolter, Zachary C. Lipton, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Pradeep Ravikumar
In this paper, we present a detailed empirical study to characterize the heavy-tailed nature of the gradients of the PPO surrogate reward function.
no code implementations • 1 Jan 2021 • Vishwak Srinivasan, Adarsh Prasad, Sivaraman Balakrishnan, Pradeep Kumar Ravikumar
Dramatic improvements in data collection technologies have made it possible to procure massive amounts of unstructured and heterogeneous data.
no code implementations • NeurIPS 2020 • Adarsh Prasad, Vishwak Srinivasan, Sivaraman Balakrishnan, Pradeep Ravikumar
We study the problem of learning Ising models in a setting where some of the samples from the underlying distribution can be arbitrarily corrupted.
no code implementations • 21 Jun 2020 • Charvi Rastogi, Sivaraman Balakrishnan, Nihar B. Shah, Aarti Singh
We also provide testing algorithms and associated sample complexity bounds for the problem of two-sample testing with partial (or total) ranking data. Furthermore, we empirically evaluate our results via extensive simulations as well as two real-world datasets consisting of pairwise comparisons.
no code implementations • NeurIPS 2020 • Saurabh Garg, Yifan Wu, Sivaraman Balakrishnan, Zachary C. Lipton
Our contributions include (i) consistency conditions for MLLS, which include calibration of the classifier and a confusion matrix invertibility condition that BBSE also requires; (ii) a unified framework, casting BBSE as roughly equivalent to MLLS for a particular choice of calibration method; and (iii) a decomposition of MLLS's finite-sample error into terms reflecting miscalibration and estimation error.
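A minimal sketch of the BBSE linear-solve idea referenced above, assuming a trained black-box classifier `clf` with a scikit-learn-style `predict` method and integer labels 0..K-1; MLLS instead fits the target label distribution by maximum likelihood over calibrated probabilities.

```python
# BBSE-style label-shift weight estimation sketch.
import numpy as np

def bbse_weights(clf, X_val, y_val, X_target, num_classes):
    # Confusion matrix C[i, j] = P_hat(predicted = i, true = j) on held-out source data.
    preds_val = clf.predict(X_val)
    C = np.zeros((num_classes, num_classes))
    for p, y in zip(preds_val, y_val):
        C[p, y] += 1.0 / len(y_val)

    # Distribution of predicted labels on the unlabeled target data.
    preds_tgt = clf.predict(X_target)
    mu = np.bincount(preds_tgt, minlength=num_classes) / len(preds_tgt)

    # Importance weights w = C^{-1} mu; invertibility of C is exactly the
    # condition mentioned above.
    w = np.linalg.solve(C, mu)
    return np.clip(w, 0.0, None)   # negative solutions are typically clipped
```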
no code implementations • 24 Dec 2019 • Larry Wasserman, Aaditya Ramdas, Sivaraman Balakrishnan
Constructing tests and confidence sets for such models is notoriously difficult.
no code implementations • 7 Oct 2019 • Purvasha Chakravarti, Sivaraman Balakrishnan, Larry Wasserman
We consider clustering based on significance tests for Gaussian Mixture Models (GMMs).
2 code implementations • 17 Sep 2019 • Tudor Manole, Sivaraman Balakrishnan, Larry Wasserman
To motivate the choice of these classes, we also study minimax rates of estimating a distribution under the Sliced Wasserstein distance.
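A minimal sketch of the standard Monte Carlo estimate of the Sliced Wasserstein distance between two empirical distributions, assuming numpy and equal sample sizes: project onto random directions and average one-dimensional Wasserstein distances computed from sorted projections.

```python
# Monte Carlo Sliced Wasserstein distance sketch.
import numpy as np

def sliced_wasserstein(X, Y, num_projections=200, p=2, rng=None):
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    total = 0.0
    for _ in range(num_projections):
        theta = rng.normal(size=d)
        theta /= np.linalg.norm(theta)      # uniform direction on the sphere
        x_proj = np.sort(X @ theta)
        y_proj = np.sort(Y @ theta)
        # 1-D p-Wasserstein distance via the quantile (sorted-sample) coupling.
        total += np.mean(np.abs(x_proj - y_proj) ** p) ** (1.0 / p)
    return total / num_projections
```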
no code implementations • 2 Aug 2019 • Chirag Gupta, Sivaraman Balakrishnan, Aaditya Ramdas
We derive bounds on the path length $\zeta$ of gradient descent (GD) and gradient flow (GF) curves for various classes of smooth convex and nonconvex functions.
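A minimal sketch of the quantity being bounded: run gradient descent and accumulate the Euclidean lengths of its steps. The quadratic objective in the usage example is only illustrative.

```python
# Gradient descent path length sketch.
import numpy as np

def gd_path_length(grad, x0, step_size=0.1, num_iters=1000):
    x = np.asarray(x0, dtype=float)
    zeta = 0.0
    for _ in range(num_iters):
        x_next = x - step_size * grad(x)
        zeta += np.linalg.norm(x_next - x)   # accumulate the length of this step
        x = x_next
    return zeta, x

# Example: smooth, strongly convex quadratic f(x) = 0.5 * ||A x - b||^2.
A = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, -1.0])
zeta, x_final = gd_path_length(lambda x: A.T @ (A @ x - b), x0=np.zeros(2))
```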
no code implementations • 1 Jul 2019 • Adarsh Prasad, Sivaraman Balakrishnan, Pradeep Ravikumar
Building on this connection, we provide a simple variant of recent computationally-efficient algorithms for mean estimation in Huber's model, which, given our connection, entails that the same efficient sample-pruning based estimator is simultaneously robust to heavy-tailed noise and Huber contamination.
no code implementations • NeurIPS 2018 • Simon S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan R. Salakhutdinov, Aarti Singh
We show that for an $m$-dimensional convolutional filter with linear activation acting on a $d$-dimensional input, the sample complexity of achieving population prediction error of $\epsilon$ is $\widetilde{O}(m/\epsilon^2)$, whereas the sample complexity for its FNN counterpart is lower bounded by $\Omega(d/\epsilon^2)$ samples.
no code implementations • ICML 2018 • Yichong Xu, Hariank Muthakana, Sivaraman Balakrishnan, Aarti Singh, Artur Dubrawski
Finally, we present experiments that show the efficacy of RR and investigate its robustness to various sources of noise and model-misspecification.
no code implementations • ICML 2018 • Yichong Xu, Sivaraman Balakrishnan, Aarti Singh, Artur Dubrawski
In supervised learning, we typically leverage a fully labeled dataset to design methods for function estimation or prediction.
no code implementations • 26 May 2018 • Simon S. Du, Yining Wang, Sivaraman Balakrishnan, Pradeep Ravikumar, Aarti Singh
We first show that a simple local binning median step can effectively remove the adversarial noise and this median estimator is minimax optimal up to absolute constants over the Hölder function class with smoothness parameters smaller than or equal to 1.
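A minimal sketch of such a local binning median step for one-dimensional covariates on [0, 1]: partition the interval into bins and estimate the regression function in each bin by the median of the responses, which resists a fraction of adversarially corrupted observations. The number of bins is a generic tuning parameter here, not the paper's choice.

```python
# Local binning median regression sketch.
import numpy as np

def binned_median_regression(x, y, num_bins=50):
    edges = np.linspace(0.0, 1.0, num_bins + 1)
    bin_ids = np.clip(np.digitize(x, edges) - 1, 0, num_bins - 1)
    estimates = np.full(num_bins, np.nan)
    for b in range(num_bins):
        ys = y[bin_ids == b]
        if len(ys) > 0:
            estimates[b] = np.median(ys)   # median, not mean, resists corrupted points
    centers = (edges[:-1] + edges[1:]) / 2.0
    return centers, estimates
```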
no code implementations • NeurIPS 2018 • Simon S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Aarti Singh
It is widely believed that the practical success of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) owes to the fact that CNNs and RNNs use a more compact parametric representation than their Fully-Connected Neural Network (FNN) counterparts, and consequently require fewer training examples to accurately estimate their parameters.
no code implementations • 22 Apr 2018 • Yo Joong Choe, Sivaraman Balakrishnan, Aarti Singh, Jean M. Vettel, Timothy Verstynen
If communication efficiency is fundamentally constrained by the integrity along the entire length of a white matter bundle, then variability in the functional dynamics of brain networks should be associated with variability in the local connectome.
no code implementations • NeurIPS 2018 • Yining Wang, Sivaraman Balakrishnan, Aarti Singh
In this setup, an algorithm is allowed to adaptively query the underlying function at different locations and receives noisy evaluations of function values at the queried points (i.e., the algorithm has access to zeroth-order information).
no code implementations • 19 Feb 2018 • Adarsh Prasad, Arun Sai Suggala, Sivaraman Balakrishnan, Pradeep Ravikumar
We provide a new computationally-efficient class of estimators for risk minimization.
no code implementations • 17 Dec 2017 • Sivaraman Balakrishnan, Larry Wasserman
The statistical analysis of discrete data has been the subject of extensive statistical research dating back to the work of Pearson.
no code implementations • 29 Oct 2017 • Yining Wang, Simon Du, Sivaraman Balakrishnan, Aarti Singh
We consider the problem of optimizing a high-dimensional convex function using stochastic zeroth-order queries.
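A minimal sketch of a generic zeroth-order method for this setting: a two-point Gaussian-smoothing gradient estimate built purely from function evaluations. Illustrative only; the paper's algorithms exploit additional structure (e.g. sparsity) to cope with high dimensions.

```python
# Two-point zeroth-order gradient descent sketch.
import numpy as np

def zeroth_order_gd(f, x0, step_size=0.05, smoothing=1e-3, num_iters=2000, rng=None):
    rng = np.random.default_rng(rng)
    x = np.asarray(x0, dtype=float)
    for _ in range(num_iters):
        u = rng.normal(size=x.shape)   # random direction
        # Two-point estimate of the directional derivative using only function values.
        g = (f(x + smoothing * u) - f(x - smoothing * u)) / (2 * smoothing) * u
        x = x - step_size * g
    return x
```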
no code implementations • 1 Sep 2017 • Nihar B. Shah, Sivaraman Balakrishnan, Martin J. Wainwright
We consider the problem of noisy matrix completion, in which the goal is to reconstruct a structured matrix whose entries are partially observed in noise.
no code implementations • 30 Jun 2017 • Sivaraman Balakrishnan, Larry Wasserman
In contrast to existing results, we show that the minimax rate and critical testing radius in these settings depend strongly, and in a precise way, on the null distribution being tested and this motivates the study of the (local) minimax rate as a function of the null distribution.
no code implementations • 24 Feb 2017 • Simon S. Du, Sivaraman Balakrishnan, Aarti Singh
Many conventional statistical procedures are extremely sensitive to seemingly minor deviations from modeling assumptions.
no code implementations • 9 Feb 2017 • Yining Wang, Jialei Wang, Sivaraman Balakrishnan, Aarti Singh
We consider the problems of estimation and of constructing component-wise confidence intervals in a sparse high-dimensional linear regression model when some covariates of the design matrix are missing completely at random.
no code implementations • NeurIPS 2016 • Chi Jin, Yuchen Zhang, Sivaraman Balakrishnan, Martin J. Wainwright, Michael Jordan
Our first main result shows that the population likelihood function has bad local maxima even in the special case of equally-weighted mixtures of well-separated and spherical Gaussians.
no code implementations • 30 Jun 2016 • Nihar B. Shah, Sivaraman Balakrishnan, Martin J. Wainwright
The task of aggregating and denoising crowd-labeled data has gained increased significance with the advent of crowdsourcing platforms and massive datasets.
no code implementations • 9 Jun 2016 • Christian Kroer, Miroslav Dudík, Sébastien Lahaie, Sivaraman Balakrishnan
We present a new combinatorial market maker that operates arbitrage-free combinatorial prediction markets specified by integer programs.
no code implementations • NeurIPS 2016 • Jisu Kim, Yen-Chi Chen, Sivaraman Balakrishnan, Alessandro Rinaldo, Larry Wasserman
A cluster tree provides a highly-interpretable summary of a density function by representing the hierarchy of its high-density clusters.
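A minimal sketch of the underlying idea for a one-dimensional density, assuming scipy: estimate the density and, for each level $\lambda$, find the connected components of the superlevel set $\{x : \hat{f}(x) \geq \lambda\}$. Tracking how components split as $\lambda$ increases yields the tree; the sketch below only counts components per level.

```python
# Superlevel-set cluster counting sketch (1-D, KDE-based).
import numpy as np
from scipy.stats import gaussian_kde

def superlevel_cluster_counts(samples, num_levels=20, grid_size=500):
    grid = np.linspace(samples.min(), samples.max(), grid_size)
    f_hat = gaussian_kde(samples)(grid)
    counts = {}
    for lam in np.linspace(0, f_hat.max(), num_levels, endpoint=False)[1:]:
        above = f_hat >= lam
        # A new connected component starts wherever `above` switches from False to True.
        starts = np.logical_and(above, ~np.roll(above, 1))
        starts[0] = above[0]
        counts[float(lam)] = int(starts.sum())
    return counts
```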
no code implementations • 22 Mar 2016 • Nihar B. Shah, Sivaraman Balakrishnan, Martin J. Wainwright
Second, we show that a regularized least squares estimator can achieve a poly-logarithmic adaptivity index, thereby demonstrating a $\sqrt{n}$-gap between optimal and computationally achievable adaptivity.
no code implementations • 27 Dec 2015 • Fanny Yang, Sivaraman Balakrishnan, Martin J. Wainwright
By exploiting this characterization, we provide non-asymptotic finite sample guarantees on the Baum-Welch updates, establishing geometric convergence to a small ball of radius on the order of the minimax rate around a global optimum.
no code implementations • 19 Oct 2015 • Nihar B. Shah, Sivaraman Balakrishnan, Adityanand Guntuboyina, Martin J. Wainwright
On the other hand, unlike in the BTL and Thurstone models, computing the minimax-optimal estimator in the stochastically transitive model is non-trivial, and we explore various computationally tractable alternatives.
no code implementations • 6 May 2015 • Nihar B. Shah, Sivaraman Balakrishnan, Joseph Bradley, Abhay Parekh, Kannan Ramchandran, Martin J. Wainwright
Data in the form of pairwise comparisons arises in many domains, including preference elicitation, sporting competitions, and peer grading among others.
no code implementations • 9 Aug 2014 • Sivaraman Balakrishnan, Martin J. Wainwright, Bin Yu
Leveraging this characterization, we then provide non-asymptotic guarantees on the EM and gradient EM algorithms when applied to a finite set of samples.
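A minimal sketch of the EM iteration for the canonical example in this line of work, a balanced two-component mixture $0.5\,N(\theta, I) + 0.5\,N(-\theta, I)$ with unit variance; initialization and stopping rules are simplified and illustrative.

```python
# Sample EM update for a symmetric two-component Gaussian mixture.
import numpy as np

def em_symmetric_gaussian_mixture(X, theta0, num_iters=100):
    theta = np.asarray(theta0, dtype=float)
    for _ in range(num_iters):
        # E-step: posterior probability each point came from the +theta component.
        logit = 2.0 * X @ theta            # log-odds under unit-variance components
        w = 1.0 / (1.0 + np.exp(-logit))
        # M-step: signed weighted average, theta = mean of (2*w - 1) * x.
        theta = np.mean(w[:, None] * X - (1 - w)[:, None] * X, axis=0)
    return theta
```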
no code implementations • 25 Jun 2014 • Nihar B. Shah, Sivaraman Balakrishnan, Joseph Bradley, Abhay Parekh, Kannan Ramchandran, Martin Wainwright
When eliciting judgements from humans for an unknown quantity, one often has the choice of making direct-scoring (cardinal) or comparative (ordinal) measurements.
no code implementations • 29 Jul 2013 • Sivaraman Balakrishnan, Alessandro Rinaldo, Aarti Singh, Larry Wasserman
In this note we use a different construction based on the direct analysis of the likelihood ratio test to show that the upper bound of Niyogi, Smale and Weinberger is in fact tight, thus establishing rate optimal asymptotic minimax bounds for the problem.
no code implementations • NeurIPS 2013 • Sivaraman Balakrishnan, Srivatsan Narayanan, Alessandro Rinaldo, Aarti Singh, Larry Wasserman
In this paper we investigate the problem of estimating the cluster tree for a density $f$ supported on or near a smooth $d$-dimensional manifold $M$ isometrically embedded in $\mathbb{R}^D$.
no code implementations • 28 Mar 2013 • Brittany Terese Fasy, Fabrizio Lecci, Alessandro Rinaldo, Larry Wasserman, Sivaraman Balakrishnan, Aarti Singh
Persistent homology is a method for probing topological properties of point clouds and functions.
no code implementations • NeurIPS 2012 • Arthur Gretton, Dino Sejdinovic, Heiko Strathmann, Sivaraman Balakrishnan, Massimiliano Pontil, Kenji Fukumizu, Bharath K. Sriperumbudur
A means of parameter selection for the two-sample test based on the MMD is proposed.
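A minimal sketch of the (biased, V-statistic) MMD$^2$ estimate with an RBF kernel, using the median heuristic as one simple bandwidth choice; the paper studies principled alternatives to this kind of heuristic kernel selection.

```python
# RBF-kernel MMD^2 estimate with median-heuristic bandwidth (sketch).
import numpy as np

def rbf_mmd2(X, Y, bandwidth=None):
    Z = np.vstack([X, Y])
    sq_dists = np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=-1)
    if bandwidth is None:
        # Median heuristic: median pairwise distance over the pooled sample.
        bandwidth = np.sqrt(np.median(sq_dists[sq_dists > 0]))
    K = np.exp(-sq_dists / (2 * bandwidth ** 2))
    n = len(X)
    Kxx, Kyy, Kxy = K[:n, :n], K[n:, n:], K[:n, n:]
    return Kxx.mean() + Kyy.mean() - 2 * Kxy.mean()
```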
no code implementations • 15 Sep 2012 • Sivaraman Balakrishnan, Mladen Kolar, Alessandro Rinaldo, Aarti Singh
We consider the problems of detection and localization of a contiguous block of weak activation in a large matrix, from a small number of noisy, possibly adaptive, compressive (linear) measurements.
no code implementations • NeurIPS 2011 • Mladen Kolar, Sivaraman Balakrishnan, Alessandro Rinaldo, Aarti Singh
We consider the problem of identifying a sparse set of relevant columns and rows in a large data matrix with highly corrupted entries.
no code implementations • NeurIPS 2011 • Sivaraman Balakrishnan, Min Xu, Akshay Krishnamurthy, Aarti Singh
Although spectral clustering has enjoyed considerable empirical success in machine learning, its theoretical properties are not yet fully developed.