no code implementations • 22 Nov 2024 • Anindya De, Shivam Nadimpalli, Ryan O'Donnell, Rocco A. Servedio
We give a dimension-independent sparsification result for suprema of centered Gaussian processes: Let $T$ be any (possibly infinite) bounded set of vectors in $\mathbb{R}^n$, and let $\{{\boldsymbol{X}}_t\}_{t\in T}$ be the canonical Gaussian process on $T$.
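A minimal numerical sketch of the object in question (not the paper's construction): the canonical Gaussian process on a finite $T \subset \mathbb{R}^n$ is $X_t = \langle t, g\rangle$ with $g \sim N(0, I_n)$, and the quantity of interest is $\sup_{t \in T} X_t$.

```python
# Minimal sketch: the canonical Gaussian process on a finite T in R^n is
# X_t = <t, g> with g ~ N(0, I_n); we estimate E[sup_{t in T} X_t] by Monte Carlo.
import numpy as np

rng = np.random.default_rng(0)
n, m = 10, 200                                   # ambient dimension, |T|
T = rng.normal(size=(m, n))                      # a finite set of vectors in R^n
T /= np.linalg.norm(T, axis=1, keepdims=True)    # keep T bounded

def sup_of_process(T, num_draws=5000):
    """Monte Carlo estimate of E[ sup_{t in T} <t, g> ], g ~ N(0, I_n)."""
    g = rng.normal(size=(num_draws, T.shape[1]))
    return (g @ T.T).max(axis=1).mean()

print("E[sup over T]         ~", sup_of_process(T))
# Crude comparison: the sup over a small random subset S of T.  (The paper's
# dimension-independent sparsification is not simply random subsampling.)
S = T[rng.choice(m, size=20, replace=False)]
print("E[sup over subset S]  ~", sup_of_process(S))
```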
no code implementations • 8 Dec 2021 • Philip M. Long, Rocco A. Servedio
Van Rooyen et al. introduced a notion of convex loss functions being robust to random classification noise, and established that the "unhinged" loss function is robust in this sense.
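As a point of reference, the "unhinged" loss is usually written as $\ell(y, v) = 1 - yv$; a minimal sketch (with that assumed form and an arbitrary linear classifier) of the sense in which it is robust: under symmetric label noise at rate $\rho$, its expected value is an affine function of the clean risk, so the ranking of classifiers is unchanged.

```python
# Minimal sketch, assuming the unhinged loss is l(y, v) = 1 - y*v with
# y in {-1,+1}.  Flipping each label independently with probability rho makes
# the noisy risk an affine function of the clean risk: (1-2*rho)*clean + 2*rho.
import numpy as np

rng = np.random.default_rng(1)
n, d, rho = 20000, 5, 0.3
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = np.sign(X @ w_true)                      # clean labels in {-1,+1}
flip = rng.random(n) < rho
y_noisy = np.where(flip, -y, y)              # each label flipped w.p. rho

def unhinged_risk(w, X, y):
    return np.mean(1.0 - y * (X @ w))

w = rng.normal(size=d)                       # an arbitrary linear classifier
clean = unhinged_risk(w, X, y)
noisy = unhinged_risk(w, X, y_noisy)
print("noisy risk              :", noisy)
print("(1-2*rho)*clean + 2*rho :", (1 - 2 * rho) * clean + 2 * rho)  # ~ equal
```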
no code implementations • 3 Feb 2021 • Daniel Hsu, Clayton Sanford, Rocco A. Servedio, Emmanouil-Vasileios Vlatakis-Gkaragkounis
This paper considers the following question: how well can depth-two ReLU networks with randomly initialized bottom-level weights represent smooth functions?
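A minimal sketch of the setting (not the paper's analysis): a depth-two ReLU network whose bottom-level weights are random and frozen, with only the top-level linear coefficients fit, used to approximate a smooth one-dimensional target.

```python
# Minimal sketch: depth-two ReLU network with random, fixed bottom-level
# weights; only the top-level linear layer is fit (here by least squares).
import numpy as np

rng = np.random.default_rng(2)
k = 200                                    # number of hidden ReLU units
W = rng.normal(size=k)                     # random (frozen) bottom-level weights
b = rng.uniform(-1, 1, size=k)             # random biases

def features(x):
    # Phi[i, j] = ReLU(W[j] * x[i] + b[j])
    return np.maximum(0.0, np.outer(x, W) + b)

x_train = np.linspace(-1, 1, 400)
y_train = np.sin(3 * x_train)              # a smooth target function
a, *_ = np.linalg.lstsq(features(x_train), y_train, rcond=None)  # top weights

x_test = np.linspace(-1, 1, 101)
err = np.max(np.abs(features(x_test) @ a - np.sin(3 * x_test)))
print("sup-norm error on [-1,1]:", err)
```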
no code implementations • 22 Dec 2020 • Anindya De, Shivam Nadimpalli, Rocco A. Servedio
Most correlation inequalities for high-dimensional functions in the literature, such as the Fortuin-Kasteleyn-Ginibre (FKG) inequality and the celebrated Gaussian Correlation Inequality of Royen, are qualitative statements which establish that any two functions of a certain type have non-negative correlation.
Probability • Computational Complexity • Combinatorics
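A minimal sketch of the qualitative statement being referred to, in its simplest special case: under the uniform distribution on $\{0,1\}^n$, two monotone Boolean functions are non-negatively correlated (a special case of the FKG inequality).

```python
# Minimal sketch: numerically check Cov(f, g) >= 0 for two monotone Boolean
# functions under the uniform distribution on {0,1}^n (FKG, special case).
import itertools
import numpy as np

n = 4
points = np.array(list(itertools.product([0, 1], repeat=n)))

f = points[:, 0] | points[:, 1]               # monotone: x1 OR x2
g = (points.sum(axis=1) >= 3).astype(int)     # monotone: at least 3 ones

cov = np.mean(f * g) - np.mean(f) * np.mean(g)
print("Cov(f, g) =", cov, "(FKG guarantees this is >= 0)")
```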
no code implementations • 12 Jul 2019 • Frank Ban, Xi Chen, Rocco A. Servedio, Sandip Sinha
In this problem, there is an unknown distribution $\mathcal{D}$ over $s$ unknown source strings $x^1,\dots, x^s \in \{0, 1\}^n$, and each sample is independently generated by drawing some $x^i$ from $\mathcal{D}$ and returning an independent trace of $x^i$.
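A minimal sketch of the sampling model, assuming (as is standard in the trace-reconstruction literature) that a "trace" is the output of a deletion channel that deletes each bit independently with some probability $\delta$; the specific strings, mixture weights, and deletion rate below are illustrative only.

```python
# Minimal sketch of the sampling model: draw a source string x^i from D, then
# return a trace of it, where the trace deletes each bit independently w.p. delta.
import numpy as np

rng = np.random.default_rng(3)
n, s, delta = 16, 3, 0.2
sources = rng.integers(0, 2, size=(s, n))       # unknown strings x^1, ..., x^s
D = np.array([0.5, 0.3, 0.2])                   # unknown distribution over them

def draw_trace():
    x = sources[rng.choice(s, p=D)]             # draw a source x^i from D
    keep = rng.random(n) >= delta               # delete each bit w.p. delta
    return x[keep]                              # an independent trace of x^i

for _ in range(3):
    print(draw_trace())
```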
no code implementations • 2 Jul 2019 • Clément L. Canonne, Anindya De, Rocco A. Servedio
We give a range of efficient algorithms and hardness results for this problem, focusing on the case when $f$ is a low-degree polynomial threshold function (PTF).
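For concreteness, a minimal sketch of the object in question: a degree-2 polynomial threshold function (PTF) over $\{-1,1\}^n$, i.e. $f(x) = \mathrm{sign}(p(x))$ for a degree-2 polynomial $p$ (the particular polynomial below is an arbitrary illustration).

```python
# Minimal sketch: evaluating a degree-2 polynomial threshold function
# f(x) = sign(p(x)) at a point of the Boolean cube {-1,1}^n.
import numpy as np

rng = np.random.default_rng(4)
n = 6
A = rng.normal(size=(n, n))                 # quadratic coefficients
b = rng.normal(size=n)                      # linear coefficients
c = 0.1                                     # constant term

def ptf(x):
    p = x @ A @ x + b @ x + c               # degree-2 polynomial p(x)
    return 1 if p >= 0 else -1

x = rng.choice([-1, 1], size=n)
print("x =", x, " f(x) =", ptf(x))
```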
no code implementations • 9 Nov 2018 • Anindya De, Philip M. Long, Rocco A. Servedio
This implies that, for constant $d$, multivariate log-concave distributions can be learned in $\tilde{O}_d(1/\epsilon^{2d+2})$ time using $\tilde{O}_d(1/\epsilon^{d+2})$ samples, answering a question of [Diakonikolas, Kane and Stewart, 2016]. All of our results extend to a model of noise-tolerant density estimation using Huber's contamination model, in which the target distribution to be learned is a $(1-\epsilon,\epsilon)$ mixture of some unknown distribution in the class with some other arbitrary and unknown distribution, and the learning algorithm must output a hypothesis distribution with total variation distance error $O(\epsilon)$ from the target distribution.
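A minimal sketch of the noise model just described: each sample comes from the clean distribution with probability $1-\epsilon$ and from an arbitrary contaminating distribution with probability $\epsilon$ (Huber's contamination model); the particular distributions below are illustrative only.

```python
# Minimal sketch of Huber's contamination model: samples from a
# (1-eps, eps) mixture of a "clean" target with arbitrary contamination.
import numpy as np

rng = np.random.default_rng(5)
eps, m = 0.1, 10000

def sample_huber(m):
    from_clean = rng.random(m) >= eps
    clean = rng.normal(0.0, 1.0, size=m)          # e.g. a log-concave target
    dirty = rng.uniform(8.0, 10.0, size=m)        # arbitrary contamination
    return np.where(from_clean, clean, dirty)

data = sample_huber(m)
print("fraction of samples above 5:", np.mean(data > 5))   # ~ eps
```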
no code implementations • 18 Jul 2018 • Anindya De, Philip M. Long, Rocco A. Servedio
For the case $| \mathcal{A} | = 3$, we give an algorithm for learning $\mathcal{A}$-sums to accuracy $\epsilon$ that uses $\mathsf{poly}(1/\epsilon)$ samples and runs in time $\mathsf{poly}(1/\epsilon)$, independent of $N$ and of the elements of $\mathcal{A}$.
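A minimal sketch of the object being learned: an $\mathcal{A}$-sum $S = X_1 + \cdots + X_N$ where the $X_i$ are independent and each is supported on the same 3-element set $\mathcal{A}$ (the values and per-coordinate probabilities below are arbitrary).

```python
# Minimal sketch: sampling an A-sum S = X_1 + ... + X_N with independent X_i,
# each supported on a fixed 3-element set A.
import numpy as np

rng = np.random.default_rng(6)
A = np.array([0, 3, 7])                     # |A| = 3
N = 50
probs = rng.dirichlet(np.ones(3), size=N)   # each X_i has its own law over A

def sample_A_sum(num_samples):
    idx = np.array([[rng.choice(3, p=probs[i]) for i in range(N)]
                    for _ in range(num_samples)])
    return A[idx].sum(axis=1)

print(sample_A_sum(5))
```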
no code implementations • NeurIPS 2014 • Siu-On Chan, Ilias Diakonikolas, Rocco A. Servedio, Xiaorui Sun
The "approximation factor" $C$ in our result is inherent in the problem, as we prove that no algorithm with sample size bounded in terms of $k$ and $\epsilon$ can achieve $C<2$ regardless of what kind of hypothesis distribution it uses.
no code implementations • 30 Oct 2014 • Eric Blais, Clément L. Canonne, Igor C. Oliveira, Rocco A. Servedio, Li-Yang Tan
In this paper we study the structure of Boolean functions in terms of the minimum number of negations in any circuit computing them, a complexity measure that interpolates between monotone functions and the class of all functions.
no code implementations • 14 May 2013 • Siu-On Chan, Ilias Diakonikolas, Rocco A. Servedio, Xiaorui Sun
We give an algorithm that draws $\tilde{O}(t(d+1)/\epsilon^2)$ samples from $p$, runs in time $\mathrm{poly}(t, d, 1/\epsilon)$, and with high probability outputs a piecewise polynomial hypothesis distribution $h$ that is $(O(\tau)+\epsilon)$-close (in total variation distance) to $p$.
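A minimal sketch of the hypothesis class: a $t$-piecewise degree-$d$ polynomial density on $[0,1]$, i.e. $t$ intervals with a separate degree-$d$ polynomial on each (the breakpoints and coefficients below are arbitrary illustrations).

```python
# Minimal sketch: evaluating a t-piecewise degree-2 polynomial hypothesis on [0,1].
import numpy as np

breaks = np.array([0.0, 0.3, 0.7, 1.0])             # t = 3 pieces
coeffs = [np.array([0.5, 1.0, 0.0]),                # one degree-2 polynomial
          np.array([1.5, -0.5, 0.2]),               # per piece, coefficients
          np.array([0.8, 0.0, 1.0])]                # listed low degree -> high

def h(x):
    i = np.searchsorted(breaks, x, side="right") - 1
    i = min(max(i, 0), len(coeffs) - 1)
    return np.polyval(coeffs[i][::-1], x)           # polyval wants high -> low

xs = np.linspace(0, 1, 5)
print([round(h(x), 3) for x in xs])
```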
no code implementations • 7 Nov 2012 • Anindya De, Ilias Diakonikolas, Rocco A. Servedio
In such an inverse problem, the algorithm is given uniform random satisfying assignments of an unknown function $f$ belonging to a class $\mathcal{C}$ of Boolean functions, and the goal is to output a probability distribution $D$ which is $\epsilon$-close, in total variation distance, to the uniform distribution over $f^{-1}(1)$.
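A minimal sketch of the target object for a small explicit $f$ (a majority on 5 bits): the uniform distribution over $f^{-1}(1)$, together with the total variation distance used to measure how close a hypothesis $D$ is.

```python
# Minimal sketch: the uniform distribution over f^{-1}(1) for a small Boolean f,
# and the total variation distance from a hypothesis distribution D to it.
import itertools

n = 5
f = lambda x: int(sum(x) >= 3)                       # an example Boolean function
cube = list(itertools.product([0, 1], repeat=n))
sat = [x for x in cube if f(x) == 1]                 # f^{-1}(1)

uniform_sat = {x: 1.0 / len(sat) for x in sat}       # the target distribution
D = {x: 1.0 / len(cube) for x in cube}               # a (poor) hypothesis: uniform on the cube

tv = 0.5 * sum(abs(D.get(x, 0.0) - uniform_sat.get(x, 0.0)) for x in cube)
print("number of satisfying assignments:", len(sat))
print("TV(D, uniform over f^{-1}(1)) =", tv)
```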
no code implementations • 13 Jul 2011 • Constantinos Daskalakis, Ilias Diakonikolas, Rocco A. Servedio
The learning algorithm is given access to independent samples drawn from an unknown $k$-modal distribution $p$, and it must output a hypothesis distribution $\widehat{p}$ such that with high probability the total variation distance between $p$ and $\widehat{p}$ is at most $\epsilon$. Our main goal is to obtain \emph{computationally efficient} algorithms for this problem that use (close to) an information-theoretically optimal number of samples.
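A minimal sketch of the setting: a $k$-modal distribution over $\{1,\dots,n\}$ (here $k=2$, with an illustrative two-bump pmf), samples drawn from it, and the total variation distance between a naive empirical hypothesis and the truth.

```python
# Minimal sketch: a 2-modal distribution on {1,...,n}, i.i.d. samples from it,
# and the total variation distance of the empirical hypothesis from the truth.
import numpy as np

rng = np.random.default_rng(7)
n = 100
xs = np.arange(1, n + 1)
p = np.exp(-((xs - 25) ** 2) / 50.0) + np.exp(-((xs - 75) ** 2) / 50.0)  # two bumps
p /= p.sum()

m = 20000
samples = rng.choice(xs, size=m, p=p)
p_hat = np.bincount(samples, minlength=n + 1)[1:] / m   # naive empirical hypothesis

tv = 0.5 * np.abs(p - p_hat).sum()
print("TV(p, p_hat) =", tv)
```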
no code implementations • 13 Jul 2011 • Constantinos Daskalakis, Ilias Diakonikolas, Rocco A. Servedio
Our second main result is a \emph{proper} learning algorithm that learns to $\epsilon$-accuracy using $\tilde{O}(1/\epsilon^2)$ samples, and runs in time $(1/\epsilon)^{\mathrm{poly}(\log(1/\epsilon))} \cdot \log n$.