no code implementations • 30 Aug 2024 • Ilias Diakonikolas, Daniel M. Kane, Sihan Liu, Nikos Zarifis
We study the task of testable learning of general -- not necessarily homogeneous -- halfspaces with adversarial label noise with respect to the Gaussian distribution.
no code implementations • 31 Mar 2024 • Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Sihan Liu, Nikos Zarifis
We study the efficient learnability of low-degree polynomial threshold functions (PTFs) in the presence of a constant fraction of adversarial corruptions.
no code implementations • 15 Mar 2024 • Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Ankit Pensia, Thanasis Pittas
Concretely, for Gaussian robust $k$-sparse mean estimation on $\mathbb{R}^d$ with corruption rate $\epsilon>0$, our algorithm has sample complexity $(k^2/\epsilon^2)\mathrm{polylog}(d/\epsilon)$, runs in sample-polynomial time, and approximates the target mean within $\ell_2$-error $O(\epsilon)$.
no code implementations • 4 Mar 2024 • Ilias Diakonikolas, Daniel M. Kane, Thanasis Pittas, Nikos Zarifis
We study the problem of estimating the mean of an identity covariance Gaussian in the truncated setting, in the regime when the truncation set comes from a low-complexity family $\mathcal{C}$ of sets.
no code implementations • 27 Dec 2023 • Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
In contrast, algorithms that rely only on random examples inherently require $d^{\mathrm{poly}(1/\epsilon)}$ samples and runtime, even for the basic problem of agnostically learning a single ReLU or a halfspace.
no code implementations • 19 Dec 2023 • Ilias Diakonikolas, Daniel M. Kane, Jasper C. H. Lee, Thanasis Pittas
Furthermore, under a variant of the "no large sub-cluster" condition from prior work [BKK22], we show that our algorithm outputs an accurate clustering, not just a refinement, even for general-weight mixtures.
no code implementations • NeurIPS 2023 • Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia, Thanasis Pittas
We study the fundamental problems of Gaussian mean estimation and linear regression with Gaussian covariates in the presence of Huber contamination.
no code implementations • 22 Nov 2023 • Ilias Diakonikolas, Daniel M. Kane, Sihan Liu
Our main result is the first closeness tester for this problem with {\em sub-learning} sample complexity in any fixed dimension and a nearly-matching sample complexity lower bound.
no code implementations • 24 Oct 2023 • Daniel M. Kane, Ilias Diakonikolas, Hanshen Xiao, Sihan Liu
We note that if the algorithm is allowed to wait until time $T$ to report its estimate, this reduces to the well-studied problem of robust mean estimation.
no code implementations • 31 Jul 2023 • Yuqian Cheng, Daniel M. Kane, Zhicheng Zheng
We develop a new technique for proving distribution testing lower bounds for properties defined by inequalities involving the bin probabilities of the distribution in question.
no code implementations • 24 Jul 2023 • Ilias Diakonikolas, Daniel M. Kane
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
no code implementations • 28 Jun 2023 • Ilias Diakonikolas, Jelena Diakonikolas, Daniel M. Kane, Puqian Wang, Nikos Zarifis
Our main result is a lower bound for Statistical Query (SQ) algorithms and low-degree polynomial tests suggesting that the quadratic dependence on $1/\epsilon$ in the sample complexity is inherent for computationally efficient algorithms.
no code implementations • 22 Jun 2023 • Ilias Diakonikolas, Daniel M. Kane, Thanasis Pittas, Nikos Zarifis
In the special case where the separation is on the order of $k^{1/2}$, we additionally obtain fine-grained SQ lower bounds with the correct exponent.
no code implementations • 4 May 2023 • Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia, Thanasis Pittas
Our main contribution is to develop a nearly-linear time algorithm for robust PCA with near-optimal error guarantees.
no code implementations • 13 Feb 2023 • Ilias Diakonikolas, Daniel M. Kane, Lisheng Ren
We study the task of agnostically learning halfspaces under the Gaussian distribution.
no code implementations • 13 Feb 2023 • Max Hopkins, Daniel M. Kane, Shachar Lovett, Gaurav Mahajan
We study a foundational variant of Valiant and Vapnik and Chervonenkis' Probably Approximately Correct (PAC)-Learning in which the adversary is restricted to a known family of marginal distributions $\mathscr{P}$.
no code implementations • 21 Dec 2022 • Daniel M. Kane, Ilias Diakonikolas
We prove that, for $c>0$ a sufficiently small universal constant, a random set of $c d^2/\log^4(d)$ independent Gaussian random points in $\mathbb{R}^d$ lies on a common ellipsoid with high probability.
no code implementations • 6 Dec 2022 • Ilias Diakonikolas, Christos Tzamos, Daniel M. Kane
By leveraging our strongly polynomial Forster algorithm, we obtain the first strongly polynomial time algorithm for {\em distribution-free} PAC learning of halfspaces.
no code implementations • 29 Nov 2022 • Ilias Diakonikolas, Daniel M. Kane, Jasper C. H. Lee, Ankit Pensia
We study the fundamental task of outlier-robust mean estimation for heavy-tailed distributions in the presence of sparsity.
no code implementations • 25 Oct 2022 • Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia
Here we give an extremely simple algorithm for Gaussian mean testing with a one-page analysis.
no code implementations • 18 Oct 2022 • Ilias Diakonikolas, Daniel M. Kane, Lisheng Ren, Yuxin Sun
We study the problem of PAC learning a single neuron in the presence of Massart noise.
no code implementations • 28 Jul 2022 • Ilias Diakonikolas, Daniel M. Kane, Pasin Manurangsi, Lisheng Ren
We study the complexity of PAC learning halfspaces in the presence of Massart noise.
no code implementations • 14 Jul 2022 • Clément L. Canonne, Ilias Diakonikolas, Daniel M. Kane, Sihan Liu
We investigate the problem of testing whether a discrete probability distribution over an ordered domain is a histogram on a specified number of bins.
no code implementations • 10 Jun 2022 • Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Ankit Pensia, Thanasis Pittas
We study the problem of list-decodable sparse mean estimation.
no code implementations • 9 Jun 2022 • Ilias Diakonikolas, Daniel M. Kane, Yuxin Sun
We establish optimal Statistical Query (SQ) lower bounds for robustly learning certain families of discrete high-dimensional distributions.
no code implementations • 7 Jun 2022 • Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Ankit Pensia, Thanasis Pittas
In this work, we develop the first efficient algorithms for robust sparse mean estimation without a priori knowledge of the covariance.
no code implementations • 26 Apr 2022 • Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia, Thanasis Pittas
In this work, we develop the first efficient streaming algorithms for high-dimensional robust statistics with near-optimal memory requirements (up to logarithmic factors).
no code implementations • 16 Dec 2021 • Ilias Diakonikolas, Daniel M. Kane
Non-Gaussian Component Analysis (NGCA) is the following distribution learning problem: Given i.i.d.
no code implementations • 8 Nov 2021 • Max Hopkins, Daniel M. Kane, Shachar Lovett, Gaurav Mahajan
The equivalence of realizable and agnostic learnability is a fundamental phenomenon in learning theory.
1 code implementation • 23 Sep 2021 • Yu Cheng, Ilias Diakonikolas, Rong Ge, Shivam Gupta, Daniel M. Kane, Mahdi Soltanolkotabi
We explore the connection between outlier-robust high-dimensional statistics and non-convex optimization in the presence of sparsity constraints, with a focus on the fundamental tasks of robust sparse mean estimation and robust sparse PCA.
no code implementations • 19 Aug 2021 • Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
We study the general problem and establish the following: For $\eta <1/2$, we give a learning algorithm for general halfspaces with sample and computational complexity $d^{O_{\eta}(\log(1/\gamma))}\mathrm{poly}(1/\epsilon)$, where $\gamma =\max\{\epsilon, \min\{\mathbf{Pr}[f(\mathbf{x}) = 1], \mathbf{Pr}[f(\mathbf{x}) = -1]\} \}$ is the bias of the target halfspace $f$.
no code implementations • NeurIPS 2021 • Ilias Diakonikolas, Daniel M. Kane, Christos Tzamos
A Forster transform is an operation that turns a distribution into one with good anti-concentration properties.
no code implementations • NeurIPS 2021 • Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia, Thanasis Pittas, Alistair Stewart
We study the problem of list-decodable linear regression, where an adversary can corrupt a majority of the examples.
no code implementations • 16 Jun 2021 • Ilias Diakonikolas, Daniel M. Kane, Daniel Kongsgaard, Jerry Li, Kevin Tian
We leverage this result, together with additional techniques, to obtain the first almost-linear time algorithms for clustering mixtures of $k$ separated well-behaved distributions, nearly-matching the statistical guarantees of spectral methods.
no code implementations • 10 Feb 2021 • Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
We study the problem of agnostically learning halfspaces under the Gaussian distribution.
no code implementations • 8 Feb 2021 • Ilias Diakonikolas, Daniel M. Kane, Thanasis Pittas, Nikos Zarifis
We study the problem of agnostic learning under the Gaussian distribution.
no code implementations • 3 Feb 2021 • Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart, Yuxin Sun
We study the problem of learning Ising models satisfying Dobrushin's condition in the outlier-robust setting where a constant fraction of the samples are adversarially corrupted.
no code implementations • 31 Dec 2020 • Ilias Diakonikolas, Daniel M. Kane
This lower bound is best possible, as $O(d^2)$ samples suffice to even robustly {\em learn} the covariance.
no code implementations • 17 Dec 2020 • Ilias Diakonikolas, Daniel M. Kane
The best known $\mathrm{poly}(d, 1/\epsilon)$-time algorithms for this problem achieve error of $\eta+\epsilon$, which can be far from the optimal bound of $\mathrm{OPT}+\epsilon$, where $\mathrm{OPT} = \mathbf{E}_{x \sim D_x} [\eta(x)]$.
no code implementations • 14 Dec 2020 • Ilias Diakonikolas, Daniel M. Kane
Our result is constructive, yielding an algorithm to compute such an $\epsilon$-cover that runs in time $\mathrm{poly}(M)$.
no code implementations • 3 Dec 2020 • Ainesh Bakshi, Ilias Diakonikolas, He Jia, Daniel M. Kane, Pravesh K. Kothari, Santosh S. Vempala
We give a polynomial-time algorithm for the problem of robustly estimating a mixture of $k$ arbitrary Gaussians in $\mathbb{R}^d$, for any fixed $k$, in the presence of a constant fraction of arbitrary corruptions.
no code implementations • NeurIPS 2021 • Ilias Diakonikolas, Daniel M. Kane, Daniel Kongsgaard, Jerry Li, Kevin Tian
Our algorithm runs in time $\widetilde{O}(ndk)$ for all $k = O(\sqrt{d}) \cup \Omega(d)$, where $n$ is the size of the dataset.
no code implementations • 4 Oct 2020 • Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis
{\em We give the first polynomial-time algorithm for this fundamental learning problem.}
no code implementations • 14 Sep 2020 • Ilias Diakonikolas, Themis Gouleakis, Daniel M. Kane, John Peebles, Eric Price
To illustrate the generality of our methods, we give optimal algorithms for testing collections of distributions and testing closeness with unequal sized samples.
no code implementations • NeurIPS 2020 • Ilias Diakonikolas, Daniel M. Kane, Pasin Manurangsi
We study the computational complexity of adversarially robust proper learning of halfspaces in the distribution-independent agnostic PAC model, with a focus on $L_p$ perturbations.
no code implementations • NeurIPS 2020 • Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia
We study the problem of outlier robust high-dimensional mean estimation under a finite covariance assumption, and more broadly under finite low-degree moment assumptions.
no code implementations • 12 Jul 2020 • Daniel M. Kane
We resolve one of the major outstanding problems in robust statistics.
no code implementations • NeurIPS 2020 • Ilias Diakonikolas, Daniel M. Kane, Nikos Zarifis
We study the fundamental problems of agnostically learning halfspaces and ReLUs under Gaussian marginals.
no code implementations • 22 Jun 2020 • Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Nikos Zarifis
For the case of positive coefficients, we give the first polynomial-time algorithm for this learning problem for $k$ up to $\tilde{O}(\sqrt{\log d})$.
no code implementations • NeurIPS 2020 • Ilias Diakonikolas, Daniel M. Kane, Daniel Kongsgaard
We study the problem of {\em list-decodable mean estimation} for bounded covariance distributions.
no code implementations • 23 Apr 2020 • Max Hopkins, Daniel M. Kane, Shachar Lovett, Gaurav Mahajan
Given a finite set $X \subset \mathbb{R}^d$ and a binary linear classifier $c: \mathbb{R}^d \to \{0, 1\}$, how many queries of the form $c(x)$ are required to learn the label of every point in $X$?
no code implementations • 14 Nov 2019 • Ilias Diakonikolas, Daniel M. Kane
Learning in the presence of outliers is a fundamental problem in statistics.
no code implementations • NeurIPS 2019 • Ilias Diakonikolas, Daniel M. Kane, Pasin Manurangsi
We study the problem of {\em properly} learning large margin halfspaces in the agnostic PAC model.
no code implementations • NeurIPS 2020 • Max Hopkins, Daniel M. Kane, Shachar Lovett
While previous results show that active learning performs no better than its supervised alternative for important concept classes such as linear separators, we show that by adding weak distributional assumptions and allowing comparison queries, active learning requires exponentially fewer samples.
no code implementations • 11 Jun 2019 • Ilias Diakonikolas, Themis Gouleakis, Daniel M. Kane, Sankeerth Rao
We study distribution testing with communication and memory constraints in the following computational models: (1) The {\em one-pass streaming model} where the goal is to minimize the sample complexity of the protocol subject to a memory constraint, and (2) A {\em distributed model} where the data samples reside at multiple machines and the goal is to minimize the communication cost of the protocol.
no code implementations • 13 Feb 2019 • Surbhi Goel, Daniel M. Kane, Adam R. Klivans
We give the first efficient algorithm for learning the structure of an Ising model that tolerates independent failures; that is, each entry of the observed sample is missing with some unknown probability $p$. Our algorithm matches the essentially optimal runtime and sample complexity bounds of recent work for learning Ising models due to Klivans and Meka (2017).
no code implementations • 7 Nov 2018 • Ilias Diakonikolas, Daniel M. Kane
Our robust identifiability result gives the following algorithmic applications: First, we show that Boolean degree-$d$ PTFs can be efficiently approximately reconstructed from approximations to their degree-$d$ Chow parameters.
no code implementations • 10 Apr 2018 • Ilias Diakonikolas, Daniel M. Kane, John Peebles
We give the first identity tester for this problem with {\em sub-learning} sample complexity in any fixed dimension and a nearly-matching sample complexity lower bound.
1 code implementation • 7 Mar 2018 • Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Jacob Steinhardt, Alistair Stewart
In high dimensions, most machine learning methods are brittle to even a small fraction of structured outliers.
no code implementations • 20 Nov 2017 • Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart
We give a learning algorithm for mixtures of spherical Gaussians that succeeds under significantly weaker separation assumptions compared to prior work.
no code implementations • 16 Nov 2017 • Daniel M. Kane, Roi Livni, Shay Moran, Amir Yehudayoff
To naturally fit into the framework of learning theory, the players can send each other examples (as well as bits) where each example/bit costs one unit of communication.
no code implementations • NeurIPS 2018 • Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart
We study the problem of generalized uniformity testing [BC17] of a discrete probability distribution: Given samples from a probability distribution $p$ over an {\em unknown} discrete domain $\mathbf{\Omega}$, we want to distinguish, with probability at least $2/3$, between the case that $p$ is uniform on some {\em subset} of $\mathbf{\Omega}$ versus $\epsilon$-far, in total variation distance, from any such uniform distribution.
no code implementations • 5 Jul 2017 • Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart
We give the first polynomial-time PAC learning algorithms for these concept classes with dimension-independent error guarantees in the presence of nasty noise under the Gaussian distribution.
no code implementations • 4 May 2017 • Daniel M. Kane, Shachar Lovett, Shay Moran
We construct near-optimal linear decision trees for a variety of decision problems in combinatorics and discrete geometry.
no code implementations • 12 Apr 2017 • Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, Alistair Stewart
We give robust estimators that achieve estimation error $O(\varepsilon)$ in the total variation distance, which is optimal up to a universal constant that is independent of the dimension.
no code implementations • 11 Apr 2017 • Daniel M. Kane, Shachar Lovett, Shay Moran, Jiapeng Zhang
We identify a combinatorial dimension, called the \emph{inference dimension}, that captures the query complexity when each additional query is determined by $O(1)$ examples (such as comparison queries, each of which is determined by the two compared examples).
no code implementations • 6 Mar 2017 • Ilias Diakonikolas, Daniel M. Kane, Vladimir Nikishkin
Given a set of samples from two $k$-histogram distributions $p, q$ over $[n]$, we want to distinguish (with high probability) between the cases that $p = q$ and $\|p-q\|_1 \geq \epsilon$.
2 code implementations • ICML 2017 • Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, Alistair Stewart
Robust estimation is much more challenging in high dimensions than it is in one dimension: Most techniques either lead to intractable optimization problems or estimators that can tolerate only a tiny fraction of errors.
no code implementations • 10 Nov 2016 • Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart
For each of these problems, we show a {\em super-polynomial gap} between the (information-theoretic) sample complexity and the computational complexity of {\em any} Statistical Query algorithm for the problem.
no code implementations • 9 Jun 2016 • Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart
We study the {\em robust proper learning} of univariate log-concave distributions (over continuous and discrete domains).
no code implementations • 26 May 2016 • Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart
Prior to our work, no upper bound on the sample complexity of this learning problem was known for the case of $d>3$.
no code implementations • 24 Nov 2015 • Daniel M. Kane, Ryan Williams
We give tight average-case (gate and wire) complexity results for computing PARITY with depth-two threshold circuits; the answer turns out to be the same as for depth-two majority circuits.
no code implementations • 12 Nov 2015 • Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart
Given $\widetilde{O}(1/\epsilon^2)$ samples from an unknown PBD $\mathbf{p}$, our algorithm runs in time $(1/\epsilon)^{O(\log \log (1/\epsilon))}$, and outputs a hypothesis PBD that is $\epsilon$-close to $\mathbf{p}$ in total variation distance.
no code implementations • 11 Nov 2015 • Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart
An $(n, k)$-Poisson Multinomial Distribution (PMD) is a random variable of the form $X = \sum_{i=1}^n X_i$, where the $X_i$'s are independent random vectors supported on the set of standard basis vectors in $\mathbb{R}^k.$ In this paper, we obtain a refined structural understanding of PMDs by analyzing their Fourier transform.
no code implementations • 4 May 2015 • Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart
As one of our main structural contributions, we give an efficient algorithm to construct a sparse {\em proper} $\epsilon$-cover for ${\cal S}_{n, k},$ in total variation distance.
no code implementations • 23 Jul 2010 • Daniel M. Kane, Jelani Nelson, Ely Porat, David P. Woodruff
We give a space-optimal algorithm with update time $O(\log^2(1/\epsilon)\log\log(1/\epsilon))$ for $(1+\epsilon)$-approximating the $p$th frequency moment, $0 < p < 2$, of a length-$n$ vector updated in a data stream.
Data Structures and Algorithms