Search Results for author: Paul Valiant

Found 11 papers, 1 paper with code

Depth Separations in Neural Networks: Separating the Dimension from the Accuracy

no code implementations11 Feb 2024 Itay Safran, Daniel Reichman, Paul Valiant

We prove an exponential separation between depth 2 and depth 3 neural networks, when approximating an $\mathcal{O}(1)$-Lipschitz target function to constant accuracy, with respect to a distribution with support in $[0, 1]^{d}$, assuming exponentially bounded weights.
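
Schematically, and eliding the precise quantifiers and constants (this is a paraphrase of the abstract, not a verbatim theorem statement): there is a distribution supported in $[0,1]^d$ and an $\mathcal{O}(1)$-Lipschitz target such that constant accuracy is achievable by depth-3 networks of size polynomial in $d$, while any depth-2 network with exponentially bounded weights requires exponentially larger size,

$$ \text{size}_{\text{depth-3}}(d) = \mathrm{poly}(d) \qquad \text{vs.} \qquad \text{size}_{\text{depth-2}}(d) = 2^{\Omega(d)}. $$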

Optimality in Mean Estimation: Beyond Worst-Case, Beyond Sub-Gaussian, and Beyond $1+\alpha$ Moments

no code implementations21 Nov 2023 Trung Dang, Jasper C. H. Lee, Maoyuan Song, Paul Valiant

The state-of-the-art results for mean estimation in $\mathbb{R}$ are 1) the optimal sub-Gaussian mean estimator by [LV22], with the tight sub-Gaussian constant for all distributions with finite but unknown variance, and 2) the analysis of the median-of-means algorithm by [BCL13] and a lower bound by [DLLO16], characterizing the big-O optimal errors for distributions for which only a $1+\alpha$ moment exists for $\alpha \in (0, 1)$.
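
The median-of-means algorithm referenced above is simple enough to sketch. A minimal version, where the number of groups $k$ is a free parameter, typically on the order of $\log(1/\delta)$ for confidence $1-\delta$:

```python
import numpy as np

def median_of_means(x, k, rng=None):
    """Median-of-means: split the sample into k groups, average each group,
    and return the median of the group means. With k ~ log(1/delta), this is
    big-O optimal even when only a 1+alpha moment exists (the [BCL13]/[DLLO16]
    regime discussed above)."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.permutation(np.asarray(x, dtype=float))
    groups = np.array_split(x, k)
    return float(np.median([g.mean() for g in groups]))
```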

How Many Neurons Does it Take to Approximate the Maximum?

no code implementations18 Jul 2023 Itay Safran, Daniel Reichman, Paul Valiant

Our depth separation results are facilitated by a new lower bound for depth 2 networks approximating the maximum function over the uniform distribution, assuming an exponential upper bound on the size of the weights.
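
For intuition on the question in the title: with ReLU activations, the two-input maximum is exactly computable by a single hidden layer via the identity max(a, b) = b + relu(a - b), and a pairwise tournament extends this to $d$ inputs at depth $\mathcal{O}(\log d)$. A sketch of that classic construction (illustrative only; it is neither the paper's construction nor its lower-bound technique):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def max2(a, b):
    # Exact identity: max(a, b) = b + relu(a - b).
    return b + relu(a - b)

def max_tournament(xs):
    # Pairwise reduction: computes the maximum of d inputs using
    # O(d) relu units arranged in O(log d) layers.
    xs = list(xs)
    while len(xs) > 1:
        paired = [max2(xs[i], xs[i + 1]) for i in range(0, len(xs) - 1, 2)]
        if len(xs) % 2:
            paired.append(xs[-1])
        xs = paired
    return xs[0]

print(max_tournament([3.0, 1.0, 4.0, 1.0, 5.0]))  # 5.0
```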

Finite-Sample Maximum Likelihood Estimation of Location

no code implementations6 Jun 2022 Shivam Gupta, Jasper C. H. Lee, Eric Price, Paul Valiant

We consider 1-dimensional location estimation, where we estimate a parameter $\lambda$ from $n$ samples $\lambda + \eta_i$, with each $\eta_i$ drawn i.i.d. from a known distribution.
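
A minimal sketch of this setup, assuming the noise density is known (the standard Gaussian below is purely a placeholder) and maximizing the log-likelihood over a grid of candidate locations:

```python
import numpy as np

def mle_location(samples, log_density, grid):
    """Maximum likelihood estimate of a location parameter lambda.

    Each sample is lambda + eta_i with eta_i drawn i.i.d. from a known
    density f; the MLE maximizes sum_i log f(x_i - lambda) over lambda.
    """
    lls = [np.sum(log_density(samples - lam)) for lam in grid]
    return grid[int(np.argmax(lls))]

# Toy usage with standard Gaussian noise as a stand-in for the known density
# (log-density up to an additive constant, which does not affect the argmax).
rng = np.random.default_rng(0)
x = 2.5 + rng.standard_normal(1000)
grid = np.linspace(0.0, 5.0, 2001)
print(mle_location(x, lambda e: -0.5 * e**2, grid))  # ~2.5
```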

Optimal Sub-Gaussian Mean Estimation in $\mathbb{R}$

no code implementations17 Nov 2020 Jasper C. H. Lee, Paul Valiant

We revisit the problem of estimating the mean of a real-valued distribution, presenting a novel estimator with sub-Gaussian convergence: intuitively, "our estimator, on any distribution, is as accurate as the sample mean is for the Gaussian distribution of matching variance."
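
The sub-Gaussian guarantee in question has the following standard form (a transcription of the rate; the point of the paper is that the leading constant below is tight): with probability at least $1-\delta$, an estimate $\hat{\mu}$ built from $n$ i.i.d. samples of a distribution with mean $\mu$ and variance $\sigma^2$ satisfies

$$ |\hat{\mu} - \mu| \;\le\; (1+o(1))\,\sigma\sqrt{\frac{2\ln(1/\delta)}{n}}. $$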

Worst-Case Analysis for Randomly Collected Data

1 code implementation NeurIPS 2020 Justin Y. Chen, Gregory Valiant, Paul Valiant

Crucially, we assume that the sets $A$ and $B$ are drawn according to some known distribution $P$ over pairs of subsets of $[n]$.
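
For concreteness, a toy instantiation of this setup (the distribution $P$ below is hypothetical, and the naive overlap average is not the paper's estimator): the data vector is fixed, a pair of subsets $(A, B)$ is drawn from a known distribution over pairs of subsets of $[n]$, the values on $A$ are observed, and the goal is to estimate the average over $B$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
values = rng.normal(size=n)  # fixed data vector (worst-case-style input)

# Hypothetical P: A and B are independent uniform random subsets of [n].
A = rng.random(n) < 0.3   # indices we get to observe
B = rng.random(n) < 0.5   # indices whose mean we must estimate

# Naive baseline: average the observed values that happen to land in B.
observed_in_B = A & B
estimate = values[observed_in_B].mean() if observed_in_B.any() else 0.0
target = values[B].mean()
print(estimate, target)
```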

Implicit regularization for deep neural networks driven by an Ornstein-Uhlenbeck like process

no code implementations19 Apr 2019 Guy Blanc, Neha Gupta, Gregory Valiant, Paul Valiant

We characterize the behavior of the training dynamics near any parameter vector that achieves zero training error, in terms of an implicit regularization term corresponding to the sum, over the data points, of the squared $\ell_2$ norm of the gradient of the model with respect to the parameter vector, evaluated at each data point.
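
In symbols, the regularization term described above, for a model $f_\theta$ and training points $x_1, \dots, x_n$:

$$ R(\theta) \;=\; \sum_{i=1}^{n} \bigl\| \nabla_\theta f_\theta(x_i) \bigr\|_2^2. $$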

Uncertainty about Uncertainty: Optimal Adaptive Algorithms for Estimating Mixtures of Unknown Coins

no code implementations19 Apr 2019 Jasper C. H. Lee, Paul Valiant

Given a mixture between two populations of coins, "positive" coins that each have -- unknown and potentially different -- bias $\geq\frac{1}{2}+\Delta$ and "negative" coins with bias $\leq\frac{1}{2}-\Delta$, we consider the task of estimating the fraction $\rho$ of positive coins to within additive error $\epsilon$.
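
A naive non-adaptive baseline for this task, for contrast with the paper's adaptive algorithms: flip every coin the same fixed number of times and threshold its empirical bias at $\frac{1}{2}$ (the parameter values below are illustrative):

```python
import numpy as np

def estimate_rho_naive(biases, flips_per_coin, rng):
    """Estimate the fraction rho of 'positive' coins (bias >= 1/2 + Delta).

    Naive baseline: flip each coin a fixed number of times and classify by
    whether its empirical bias exceeds 1/2. The paper's algorithms instead
    adapt the number of flips per coin to save samples.
    """
    heads = rng.binomial(flips_per_coin, biases)
    return float(np.mean(heads / flips_per_coin > 0.5))

rng = np.random.default_rng(2)
# 1000 coins, rho = 0.3 positive at bias 0.7, the rest at bias 0.3 (Delta = 0.2).
biases = np.where(rng.random(1000) < 0.3, 0.7, 0.3)
print(estimate_rho_naive(biases, 50, rng))  # ~0.3
```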


Instance Optimal Learning

no code implementations21 Apr 2015 Gregory Valiant, Paul Valiant

One conceptual implication of this result is that for large samples, Bayesian assumptions on the "shape" or bounds on the tail probabilities of a distribution over discrete support are not helpful for the task of learning the distribution.

Estimating the Unseen: Improved Estimators for Entropy and other Properties

no code implementations NeurIPS 2013 Paul Valiant, Gregory Valiant

Recently, [Valiant and Valiant] showed that a class of distributional properties, which includes such practically relevant properties as entropy, the number of distinct elements, and distance metrics between pairs of distributions, can be estimated given a sublinear-sized sample.
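
For contrast, a minimal sketch of the naive plug-in entropy estimator that the "unseen" estimators improve upon: it simply evaluates entropy on the empirical distribution, and is known to be biased downward and to need a number of samples roughly linear in the support size, versus the sublinear sample complexity above.

```python
import numpy as np
from collections import Counter

def plugin_entropy(samples):
    """Naive plug-in (empirical) entropy estimate, in nats.

    Computes H(p_hat) for the empirical distribution p_hat of the sample;
    this is the baseline the improved 'unseen' estimators outperform.
    """
    counts = np.array(list(Counter(samples).values()), dtype=float)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))

print(plugin_entropy(list("abracadabra")))  # entropy of a toy sample
```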
