no code implementations • 25 Mar 2024 • Ankit Pensia, Varun Jog, Po-Ling Loh
In this paper, we derive a formula that characterizes the sample complexity (up to multiplicative constants that are independent of $p$, $q$, and all error parameters) for: (i) all $0 \le \alpha, \beta \le 1/8$ in the prior-free setting; and (ii) all $\delta \le \alpha/4$ in the Bayesian setting.
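For context, the classical benchmark here (a well-known fact, not the paper's new formula) is that with constant error probabilities, the optimal sample size for simple binary hypothesis testing is governed by the squared Hellinger distance between $p$ and $q$:

```latex
% Classical benchmark, stated for constant error probabilities; the paper's
% contribution is a formula that stays tight across the asymmetric-error
% regimes (i) and (ii) above.
\[
  n^*(p, q) \;=\; \Theta\!\left(\frac{1}{H^2(p, q)}\right),
  \qquad
  H^2(p, q) \;=\; \frac{1}{2} \sum_{x} \left(\sqrt{p(x)} - \sqrt{q(x)}\right)^{2}.
\]
```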
no code implementations • 15 Mar 2024 • Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Ankit Pensia, Thanasis Pittas
Concretely, for Gaussian robust $k$-sparse mean estimation on $\mathbb{R}^d$ with corruption rate $\epsilon>0$, our algorithm has sample complexity $(k^2/\epsilon^2)\mathrm{polylog}(d/\epsilon)$, runs in time polynomial in its sample size, and approximates the target mean within $\ell_2$-error $O(\epsilon)$.
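For intuition, a minimal (and far from optimal) baseline for this problem robustifies each coordinate with a trimmed mean and then hard-thresholds to the $k$ largest entries; the sketch below is illustrative only and is not the paper's algorithm, whose error guarantee is much stronger:

```python
import numpy as np

def trimmed_mean(x, eps):
    """Per-coordinate trimmed mean: drop the eps-fraction of largest and
    smallest values in each coordinate, then average the rest."""
    n = x.shape[0]
    k = int(np.ceil(eps * n))
    s = np.sort(x, axis=0)
    return s[k:n - k].mean(axis=0) if n - 2 * k > 0 else s.mean(axis=0)

def naive_robust_sparse_mean(x, k, eps):
    """Naive baseline: robust per-coordinate estimate, then keep only the
    k coordinates of largest magnitude (hard thresholding)."""
    mu = trimmed_mean(x, eps)
    idx = np.argsort(-np.abs(mu))[:k]
    out = np.zeros_like(mu)
    out[idx] = mu[idx]
    return out
```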
no code implementations • 7 Mar 2024 • Ankit Pensia
We study the algorithmic problem of sparse mean estimation in the presence of adversarial outliers.
no code implementations • 6 Mar 2024 • Arun Jambulapati, Syamantak Kumar, Jerry Li, Shourya Pandey, Ankit Pensia, Kevin Tian
The $k$-principal component analysis ($k$-PCA) problem is a fundamental algorithmic primitive that is widely used in data analysis and dimensionality reduction applications.
no code implementations • NeurIPS 2023 • Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia, Thanasis Pittas
We study the fundamental problems of Gaussian mean estimation and linear regression with Gaussian covariates in the presence of Huber contamination.
no code implementations • 4 May 2023 • Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia, Thanasis Pittas
Our main contribution is to develop a nearly-linear time algorithm for robust PCA with near-optimal error guarantees.
no code implementations • 9 Jan 2023 • Ankit Pensia, Amir R. Asadi, Varun Jog, Po-Ling Loh
For the sample complexity of simple hypothesis testing under pure LDP constraints, we establish instance-optimal bounds for distributions with binary support; minimax-optimal bounds for general distributions; and (approximately) instance-optimal, computationally efficient algorithms for general distributions.
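For intuition, the canonical pure $\epsilon$-LDP primitive for binary data is randomized response; the sketch below is the standard textbook mechanism, not the paper's instance-optimal tests, which build considerably more structure on top of such primitives:

```python
import numpy as np

def randomized_response(bit, eps, rng=np.random.default_rng()):
    """Release one private bit under pure eps-LDP: report the true bit with
    probability e^eps / (1 + e^eps), and its flip otherwise."""
    p_truth = np.exp(eps) / (1.0 + np.exp(eps))
    return bit if rng.random() < p_truth else 1 - bit
```

A hypothesis test then aggregates the privatized bits and thresholds their empirical mean; the privacy noise is what inflates the sample complexity relative to the unconstrained setting.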
no code implementations • 29 Nov 2022 • Ilias Diakonikolas, Daniel M. Kane, Jasper C. H. Lee, Ankit Pensia
We study the fundamental task of outlier-robust mean estimation for heavy-tailed distributions in the presence of sparsity.
no code implementations • 25 Oct 2022 • Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia
Here we give an extremely simple algorithm for Gaussian mean testing with a one-page analysis.
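One natural reading of such a simple tester (a hedged sketch of a pairing-based statistic, not necessarily the paper's exact procedure): pair up fresh samples and count the signs of their inner products, which are fair coin flips under the null mean $0$ and skew positive when the mean is far from $0$:

```python
import numpy as np

def gaussian_mean_test(samples, threshold=3.0):
    """Toy tester for H0: N(0, I_d) vs H1: mean far from zero.

    Pairs fresh samples and counts signs of pairwise inner products.
    Under H0 each sign is a fair +/-1 coin flip, so the sum concentrates
    in an O(sqrt(n)) window; reject on larger deviations.
    """
    n = samples.shape[0] // 2
    x, y = samples[:n], samples[n:2 * n]
    signs = np.sign(np.einsum("ij,ij->i", x, y))
    return abs(signs.sum()) > threshold * np.sqrt(n)
```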
no code implementations • 10 Jun 2022 • Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Ankit Pensia, Thanasis Pittas
We study the problem of list-decodable sparse mean estimation.
no code implementations • 7 Jun 2022 • Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Ankit Pensia, Thanasis Pittas
In this work, we develop the first efficient algorithms for robust sparse mean estimation without a priori knowledge of the covariance.
no code implementations • 6 Jun 2022 • Ankit Pensia, Varun Jog, Po-Ling Loh
We show that the sample complexity of simple binary hypothesis testing under communication constraints is at most a logarithmic factor larger than in the unconstrained setting, and that this bound is tight.
no code implementations • 26 Apr 2022 • Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia, Thanasis Pittas
In this work, we develop the first efficient streaming algorithms for high-dimensional robust statistics with near-optimal memory requirements (up to logarithmic factors).
no code implementations • NeurIPS 2021 • Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia, Thanasis Pittas, Alistair Stewart
We study the problem of list-decodable linear regression, where an adversary can corrupt a majority of the examples.
1 code implementation • NeurIPS 2020 • Ankit Pensia, Shashank Rajput, Alliot Nagle, Harit Vishwakarma, Dimitris Papailiopoulos
We show that any target network of width $d$ and depth $l$ can be approximated by pruning a random network that is a factor $O(\log(dl))$ wider and twice as deep.
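The engine behind the $O(\log(dl))$ factor is a subset-sum phenomenon: a single target weight in $[-1/2, 1/2]$ can be approximated to error $\epsilon$ using a subset of only $O(\log(1/\epsilon))$ i.i.d. uniform random weights. A toy experiment (illustrative of that lemma, not the paper's full network construction) showing the error decay:

```python
import numpy as np
from itertools import combinations

def best_subset_error(weights, target):
    """Exhaustive search: how close can a subset of `weights` sum to `target`?"""
    err = abs(target)  # the empty subset sums to 0
    for r in range(1, len(weights) + 1):
        for combo in combinations(weights, r):
            err = min(err, abs(sum(combo) - target))
    return err

rng = np.random.default_rng(0)
for n in (4, 8, 12, 16):
    errs = [best_subset_error(rng.uniform(-1, 1, size=n),
                              rng.uniform(-0.5, 0.5))
            for _ in range(50)]
    # median error should shrink roughly geometrically as n grows
    print(f"n = {n:2d}   median error = {np.median(errs):.6f}")
```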
no code implementations • 27 Sep 2020 • Ankit Pensia, Varun Jog, Po-Ling Loh
We study the problem of linear regression where both covariates and responses are potentially (i) heavy-tailed and (ii) adversarially contaminated.
no code implementations • NeurIPS 2020 • Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia
We study the problem of outlier-robust high-dimensional mean estimation under a finite covariance assumption, and more broadly under finite low-degree moment assumptions.
no code implementations • 15 Oct 2019 • Ankit Pensia, Varun Jog, Po-Ling Loh
We propose a novel strategy for extracting features in supervised learning that can be used to construct a classifier that is more robust to small perturbations in the input space.
no code implementations • 6 Jul 2019 • Ankit Pensia, Varun Jog, Po-Ling Loh
In the multivariate setting, we generalize our theory to mean estimation for mixtures of radially symmetric distributions, and derive minimax lower bounds on the expected error of any estimator that is agnostic to the scales of individual data points.
no code implementations • 13 Apr 2019 • Rajat Panda, Ankit Pensia, Nikhil Mehta, Mingyuan Zhou, Piyush Rai
We present a probabilistic framework for multi-label learning based on a deep generative model for the binary label vector associated with each observation.
no code implementations • 12 Jan 2018 • Ankit Pensia, Varun Jog, Po-Ling Loh
In statistical learning theory, generalization error is used to quantify the degree to which a supervised machine learning algorithm may overfit to training data.
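A representative bound in this line of work is the Xu–Raginsky mutual-information bound, which results on noisy iterative algorithms (such as those in the paper above) build upon; for a loss that is $\sigma$-subgaussian under the data distribution,

```latex
% Xu--Raginsky generalization bound: for an algorithm mapping an n-sample
% training set S to a hypothesis W, with sigma-subgaussian loss,
\[
  \Bigl| \mathbb{E}\bigl[ L_\mu(W) - L_S(W) \bigr] \Bigr|
  \;\le\;
  \sqrt{\frac{2\sigma^{2}}{n}\, I(W; S)},
\]
% so algorithms that retain little information about their training data
% (small mutual information I(W;S)) cannot overfit by much.
```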