Search Results for author: Neha Gupta

Found 9 papers, 0 papers with code

Implicit regularization for deep neural networks driven by an Ornstein-Uhlenbeck like process

no code implementations19 Apr 2019 Guy Blanc, Neha Gupta, Gregory Valiant, Paul Valiant

We characterize the behavior of the training dynamics near any parameter vector that achieves zero training error, in terms of an implicit regularization term corresponding to the sum, over the data points, of the squared $\ell_2$ norm of the gradient of the model with respect to the parameter vector, evaluated at each data point.
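The regularizer described above can be computed directly once per-example model gradients are available. Below is a minimal numerical sketch; the function names `implicit_reg` and `grad_linear` are illustrative, not from the paper, and the toy linear model is chosen only because its gradient with respect to the parameters is the input itself.

```python
import numpy as np

def implicit_reg(grad_f, X, theta):
    """Sum, over data points, of the squared l2 norm of the gradient
    of the model output with respect to the parameter vector theta."""
    return sum(float(np.dot(g, g)) for g in (grad_f(x, theta) for x in X))

# Toy example: a linear model f(x; theta) = theta . x, whose gradient
# with respect to theta is simply x.
grad_linear = lambda x, theta: np.asarray(x, dtype=float)

X = [[1.0, 0.0], [0.0, 2.0]]
theta = np.array([0.5, -1.0])
print(implicit_reg(grad_linear, X, theta))  # 1.0 + 4.0 = 5.0
```

For a nonlinear model one would substitute automatic differentiation for the closed-form gradient; the regularizer itself is unchanged.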

Active Local Learning

no code implementations31 Aug 2020 Arturs Backurs, Avrim Blum, Neha Gupta

In particular, the number of label queries should be independent of the complexity of $H$, and the function $h$ should be well-defined, independent of $x$.

Universal guarantees for decision tree induction via a higher-order splitting criterion

no code implementations NeurIPS 2020 Guy Blanc, Neha Gupta, Jane Lange, Li-Yang Tan

We propose a simple extension of top-down decision tree learning heuristics such as ID3, C4.5, and CART.
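The higher-order criterion itself is not specified in this snippet, but the heuristics it extends all follow the same pattern: greedily split on the feature that most reduces an impurity measure. A sketch of the standard information-gain criterion used by ID3 (the baseline being extended) looks like this:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label multiset, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """ID3's splitting criterion: reduction in label entropy after
    partitioning the data on a binary feature."""
    n = len(labels)
    parts = {0: [], 1: []}
    for f, y in zip(feature, labels):
        parts[f].append(y)
    weighted = sum(len(p) / n * entropy(p) for p in parts.values() if p)
    return entropy(labels) - weighted

# A feature that perfectly separates the labels recovers the full entropy.
print(information_gain([0, 0, 1, 1], [0, 0, 1, 1]))  # 1.0
```

Top-down induction repeatedly applies such a criterion at each node; the paper's contribution is a stronger (higher-order) score with universal guarantees, not shown here.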

Estimating decision tree learnability with polylogarithmic sample complexity

no code implementations NeurIPS 2020 Guy Blanc, Neha Gupta, Jane Lange, Li-Yang Tan

We show that top-down decision tree learning heuristics are amenable to highly efficient learnability estimation: for monotone target functions, the error of the decision tree hypothesis constructed by these heuristics can be estimated with polylogarithmically many labeled examples, exponentially smaller than the number necessary to run these heuristics, and indeed, exponentially smaller than the information-theoretic minimum required to learn a good decision tree.

Understanding the bias-variance tradeoff of Bregman divergences

no code implementations8 Feb 2022 Ben Adlam, Neha Gupta, Zelda Mariet, Jamie Smith

We show that, similarly to the label, the central prediction can be interpreted as the mean of a random variable, where the mean operates in a dual space defined by the loss function itself.
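That dual-space mean can be sketched concretely: map each prediction through the gradient of the loss's generator, average, and map back. The function names below are illustrative, and the squared-loss instantiation (where the gradient map is the identity, so the dual mean collapses to the arithmetic mean) is chosen only because it is easy to verify by hand.

```python
import numpy as np

def dual_mean(predictions, grad_F, grad_F_inv):
    """Central prediction as a mean in the dual space induced by the
    loss: push predictions through grad_F, average, and pull back."""
    duals = np.array([grad_F(p) for p in predictions])
    return grad_F_inv(duals.mean(axis=0))

# For squared loss, F(p) = ||p||^2 / 2, so grad_F is the identity and
# the dual mean reduces to the ordinary arithmetic mean.
identity = lambda p: p
preds = [np.array([1.0, 3.0]), np.array([3.0, 5.0])]
print(dual_mean(preds, identity, identity))  # [2. 4.]
```

For log loss the duals are log-probabilities, so the same construction yields a (renormalized) geometric mean of the predicted distributions.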

Ensembling over Classifiers: a Bias-Variance Perspective

no code implementations21 Jun 2022 Neha Gupta, Jamie Smith, Ben Adlam, Zelda Mariet

Empirically, standard ensembling reduces the bias, leading us to hypothesize that ensembles of classifiers may perform well in part because of this unexpected reduction. We conclude with an empirical analysis of recent deep learning methods that ensemble over hyperparameters, revealing that these techniques indeed favor bias reduction.

When Does Confidence-Based Cascade Deferral Suffice?

no code implementations NeurIPS 2023 Wittawat Jitkrittum, Neha Gupta, Aditya Krishna Menon, Harikrishna Narasimhan, Ankit Singh Rawat, Sanjiv Kumar

Cascades are a classical strategy to enable inference cost to vary adaptively across samples, wherein a sequence of classifiers is invoked in turn.
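The confidence-based deferral rule the title refers to can be sketched in a few lines: invoke models from cheapest to most expensive, and stop as soon as one's top-class probability clears a threshold. The names and toy models below are illustrative, not from the paper.

```python
def cascade_predict(models, x, threshold=0.9):
    """Confidence-based deferral: each model maps x to a dict of class
    probabilities; stop at the first model confident enough, otherwise
    fall through to the last (largest) model."""
    for model in models[:-1]:
        probs = model(x)
        label, conf = max(probs.items(), key=lambda kv: kv[1])
        if conf >= threshold:
            return label  # confident enough: do not defer
    probs = models[-1](x)
    return max(probs.items(), key=lambda kv: kv[1])[0]

# Toy cascade: a cheap, unsure model and a costly, confident one.
small = lambda x: {"cat": 0.55, "dog": 0.45}
large = lambda x: {"cat": 0.05, "dog": 0.95}
print(cascade_predict([small, large], x=None))  # 'dog' (small model defers)
```

The paper's question is when this simple rule suffices versus when deferral should account for more than the first model's own confidence.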

Language Model Cascades: Token-level uncertainty and beyond

no code implementations15 Apr 2024 Neha Gupta, Harikrishna Narasimhan, Wittawat Jitkrittum, Ankit Singh Rawat, Aditya Krishna Menon, Sanjiv Kumar

While the principles underpinning cascading are well-studied for classification tasks, with deferral based on predicted class uncertainty favored theoretically and practically, a similar understanding is lacking for generative LM tasks.
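Unlike a classifier, an LM emits one uncertainty per token, so a deferral rule must aggregate them into a sequence-level score. One simple baseline (not necessarily the paper's method) is the average negative token log-probability, i.e., the log-perplexity of the generated sequence:

```python
import math

def deferral_score(token_logprobs):
    """Sequence-level uncertainty as the average negative token
    log-probability (log-perplexity); higher means less confident."""
    return -sum(token_logprobs) / len(token_logprobs)

def should_defer(token_logprobs, threshold=1.0):
    """Defer to the larger model when the sequence score is high.
    The threshold value here is illustrative only."""
    return deferral_score(token_logprobs) > threshold

confident = [math.log(0.9)] * 10
shaky = [math.log(0.9)] * 5 + [math.log(0.02)] * 5
print(should_defer(confident), should_defer(shaky))  # False True
```

The "and beyond" in the title hints that such naive averages are only a starting point; other aggregations (e.g., quantiles of token uncertainty) weight rare, highly uncertain tokens differently.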

Language Modelling
