no code implementations • 15 Feb 2022 • Spencer Frei, Niladri S. Chatterji, Peter L. Bartlett

We consider data with binary labels that are generated by an XOR-like function of the input features.

no code implementations • 11 Feb 2022 • Spencer Frei, Niladri S. Chatterji, Peter L. Bartlett

Benign overfitting, the phenomenon where interpolating models generalize well in the presence of noisy data, was first observed in neural network models trained with gradient descent.

no code implementations • NeurIPS 2021 • Spencer Frei, Quanquan Gu

We further show that many existing guarantees for neural networks trained by gradient descent can be unified through proxy convexity and proxy PL inequalities.

no code implementations • 25 Jun 2021 • Spencer Frei, Difan Zou, Zixiang Chen, Quanquan Gu

We show that there exists a universal constant $C_{\mathrm{err}}>0$ such that if a pseudolabeler $\boldsymbol{\beta}_{\mathrm{pl}}$ can achieve classification error at most $C_{\mathrm{err}}$, then for any $\varepsilon>0$, an iterative self-training algorithm initialized at $\boldsymbol{\beta}_0 := \boldsymbol{\beta}_{\mathrm{pl}}$ using pseudolabels $\hat y = \mathrm{sgn}(\langle \boldsymbol{\beta}_t, \mathbf{x}\rangle)$ and using at most $\tilde O(d/\varepsilon^2)$ unlabeled examples suffices to learn the Bayes-optimal classifier up to $\varepsilon$ error, where $d$ is the ambient dimension.
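As a rough illustration of the iteration described above, the following sketch starts from a pseudolabeler $\boldsymbol{\beta}_{\mathrm{pl}}$, assigns each fresh unlabeled example the pseudolabel $\hat y = \mathrm{sgn}(\langle \boldsymbol{\beta}_t, \mathbf{x}\rangle)$, and takes a gradient step on a surrogate loss. The exponential loss, the step size, the batch size, the renormalization, and the function name `self_train` are all assumptions made for this example; the abstract does not specify them.

```python
import math
import random


def sgn(z):
    return 1.0 if z >= 0 else -1.0


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    n = math.sqrt(dot(v, v))  # assumes v is nonzero
    return [a / n for a in v]


def self_train(beta_pl, unlabeled, step=0.1, batch=50):
    """Hypothetical self-training loop initialized at the pseudolabeler beta_pl.

    Each iteration consumes a fresh batch of unlabeled points, labels them with
    the current iterate, and takes one gradient step on exp(-margin)
    (an assumed surrogate loss), treating the pseudolabels as fixed.
    """
    beta = normalize(beta_pl)
    for start in range(0, len(unlabeled), batch):
        chunk = unlabeled[start:start + batch]
        grad = [0.0] * len(beta)
        for x in chunk:
            y_hat = sgn(dot(beta, x))        # pseudolabel from the current iterate
            margin = y_hat * dot(beta, x)    # equals |<beta, x>|
            w = -math.exp(-margin)           # derivative of exp(-margin) w.r.t. margin
            for j, xj in enumerate(x):
                grad[j] += w * y_hat * xj / len(chunk)
        # Gradient step, then rescale to unit norm (an assumed choice).
        beta = normalize([b - step * g for b, g in zip(beta, grad)])
    return beta
```

Calling `self_train(beta_pl, xs)` with a list of feature vectors `xs` returns a unit-norm direction; no labels are used at any point, only the signs predicted by the current iterate.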

no code implementations • 19 Apr 2021 • Difan Zou, Spencer Frei, Quanquan Gu

To the best of our knowledge, this is the first work to show that adversarial training provably yields robust classifiers in the presence of noise.

1 code implementation • 4 Jan 2021 • Spencer Frei, Yuan Cao, Quanquan Gu

We consider a one-hidden-layer leaky ReLU network of arbitrary width trained by stochastic gradient descent (SGD) following an arbitrary initialization.

no code implementations • 1 Oct 2020 • Spencer Frei, Yuan Cao, Quanquan Gu

We analyze the properties of gradient descent on convex surrogates for the zero-one loss for the agnostic learning of linear halfspaces.

no code implementations • NeurIPS 2020 • Spencer Frei, Yuan Cao, Quanquan Gu

In the agnostic PAC learning setting, where no assumption is made on the relationship between the labels $y$ and the input $x$, if the optimal population risk is $\mathsf{OPT}$, we show that gradient descent achieves population risk $O(\mathsf{OPT})+\epsilon$ in polynomial time and sample complexity when the activation function $\sigma$ is strictly increasing.

no code implementations • NeurIPS 2019 • Spencer Frei, Yuan Cao, Quanquan Gu

The skip-connections used in residual networks have become a standard architecture choice in deep learning because they improve training stability and generalization performance, although there has been limited theoretical understanding of this improvement.
