Search Results for author: Surbhi Goel

Found 32 papers, 2 papers with code

Complexity Matters: Dynamics of Feature Learning in the Presence of Spurious Correlations

1 code implementation • 5 Mar 2024 • Guanwen Qiu, Da Kuang, Surbhi Goel

Existing research often posits spurious features as "easier" to learn than core features in neural network optimization, but the impact of their relative simplicity remains under-explored.

The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains

no code implementations • 16 Feb 2024 • Benjamin L. Edelman, Ezra Edelman, Surbhi Goel, Eran Malach, Nikolaos Tsilivis

We examine how learning is affected by varying the prior distribution over Markov chains, and consider the generalization of our in-context learning of Markov chains (ICL-MC) task to $n$-grams for $n > 2$.

In-Context Learning
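
To make the ICL-MC task above concrete, here is a minimal sketch of the statistical-induction-head behavior it studies: sample a transition matrix from a Dirichlet prior, generate an in-context sequence, and predict the next token from in-context bigram counts. The state count, context length, and smoothing below are placeholder choices, not the paper's exact setup.

```python
# Sketch of an ICL-MC-style task: predict the next token of a sequence
# drawn from a random Markov chain using in-context bigram statistics.
import numpy as np

rng = np.random.default_rng(0)
k, T = 3, 256                                 # states, context length (placeholders)

P = rng.dirichlet(np.ones(k), size=k)         # prior over chains: Dirichlet rows

seq = [rng.integers(k)]                       # roll out one in-context sequence
for _ in range(T - 1):
    seq.append(rng.choice(k, p=P[seq[-1]]))

# "Statistical induction head": estimate P(next | current) from bigram
# counts in the context, with add-one smoothing.
counts = np.ones((k, k))
for a, b in zip(seq[:-1], seq[1:]):
    counts[a, b] += 1
pred = counts[seq[-1]] / counts[seq[-1]].sum()

print("true next-state distribution:", np.round(P[seq[-1]], 3))
print("bigram-count prediction:    ", np.round(pred, 3))
```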

Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck

no code implementations • 7 Sep 2023 • Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang

Finally, we show that the synthetic sparse parity task can be useful as a proxy for real problems requiring axis-aligned feature learning.

tabular-classification
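
Since the sparse parity task serves as the proxy problem in this entry, a minimal data generator may be useful; the dimension, sparsity, and sample count below are placeholder choices, not the paper's.

```python
# Sketch: (n, k)-sparse parity data -- the label is the XOR of k hidden
# coordinates of a uniformly random n-bit input.
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 50, 3, 10_000                          # placeholders
support = rng.choice(n, size=k, replace=False)   # hidden relevant coordinates

X = rng.integers(0, 2, size=(m, n))              # uniform random bits
y = X[:, support].sum(axis=1) % 2                # parity over the hidden subset
```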

Adversarial Resilience in Sequential Prediction via Abstention

no code implementations • NeurIPS 2023 • Surbhi Goel, Steve Hanneke, Shay Moran, Abhishek Shetty

We study the problem of sequential prediction in the stochastic setting with an adversary that is allowed to inject clean-label adversarial (or out-of-distribution) examples.

Learning Narrow One-Hidden-Layer ReLU Networks

no code implementations • 20 Apr 2023 • Sitan Chen, Zehao Dou, Surbhi Goel, Adam R. Klivans, Raghu Meka

We consider the well-studied problem of learning a linear combination of $k$ ReLU activations with respect to a Gaussian distribution on inputs in $d$ dimensions.
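
The data model in this entry has a simple generative form; the following sketch uses my own placeholder sizes and unit-norm hidden units, not the paper's parameters.

```python
# Sketch: samples (x, y) with y = sum_i a_i * ReLU(<w_i, x>) and
# x ~ N(0, I_d), i.e., a width-k one-hidden-layer ReLU network.
import numpy as np

rng = np.random.default_rng(0)
d, k, m = 20, 4, 1000                            # placeholders

W = rng.normal(size=(k, d))
W /= np.linalg.norm(W, axis=1, keepdims=True)    # unit-norm hidden units
a = rng.normal(size=k)                           # combination coefficients

X = rng.normal(size=(m, d))                      # Gaussian inputs
y = np.maximum(X @ W.T, 0.0) @ a                 # labels from the network
```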

Transformers Learn Shortcuts to Automata

no code implementations • 19 Oct 2022 • Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang

Algorithmic reasoning requires capabilities which are most naturally understood through recurrent models of computation, like the Turing machine.

Recurrent Convolutional Neural Networks Learn Succinct Learning Algorithms

no code implementations • 1 Sep 2022 • Surbhi Goel, Sham Kakade, Adam Tauman Kalai, Cyril Zhang

For example, on parity problems, the NN learns as well as Gaussian elimination, an efficient algorithm that can be succinctly described.
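
The Gaussian-elimination baseline mentioned above amounts to solving a linear system over GF(2): with enough examples, row reduction of $Xw = y \pmod 2$ recovers the hidden parity set. A minimal sketch, assuming parity labels of the form $\langle x, w \rangle \bmod 2$:

```python
# Sketch: recover a hidden parity from labeled examples by Gaussian
# elimination over GF(2), i.e., solve X w = y (mod 2).
import numpy as np

def solve_gf2(X, y):
    A = np.concatenate([X, y[:, None]], axis=1) % 2
    m, n = X.shape
    row = 0
    for col in range(n):                         # reduce to RREF mod 2
        pivot = next((r for r in range(row, m) if A[r, col]), None)
        if pivot is None:
            continue
        A[[row, pivot]] = A[[pivot, row]]
        for r in range(m):
            if r != row and A[r, col]:
                A[r] = (A[r] + A[row]) % 2
        row += 1
    w = np.zeros(n, dtype=int)
    for r in range(row):                         # free variables default to 0
        cols = np.flatnonzero(A[r, :n])
        w[cols[0]] = A[r, n]
    return w

# With enough random examples the solution is unique and matches w_true.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(40, 12))
w_true = rng.integers(0, 2, size=12)
print(np.array_equal(solve_gf2(X, X @ w_true % 2), w_true))
```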

Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit

no code implementations • 18 Jul 2022 • Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang

There is mounting evidence of emergent phenomena in the capabilities of deep learning methods as we scale up datasets, model sizes, and training times.

Understanding Contrastive Learning Requires Incorporating Inductive Biases

no code implementations • 28 Feb 2022 • Nikunj Saunshi, Jordan Ash, Surbhi Goel, Dipendra Misra, Cyril Zhang, Sanjeev Arora, Sham Kakade, Akshay Krishnamurthy

Contrastive learning is a popular form of self-supervised learning that encourages augmentations (views) of the same input to have more similar representations compared to augmentations of different inputs.

Contrastive Learning • Self-Supervised Learning
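
The "more similar representations for views of the same input" objective is usually instantiated as an InfoNCE-style loss; here is the standard formulation as a sketch (generic contrastive learning, not this paper's analysis).

```python
# Sketch: InfoNCE contrastive loss for a batch of paired augmentations,
# where z1[i] and z2[i] embed two views of the same input.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.T / temperature       # similarities of all pairs
    targets = torch.arange(z1.size(0))     # positives sit on the diagonal
    return F.cross_entropy(logits, targets)

loss = info_nce(torch.randn(8, 32), torch.randn(8, 32))
```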

Anti-Concentrated Confidence Bonuses for Scalable Exploration

no code implementations • ICLR 2022 • Jordan T. Ash, Cyril Zhang, Surbhi Goel, Akshay Krishnamurthy, Sham Kakade

Intrinsic rewards play a central role in handling the exploration-exploitation trade-off when designing sequential decision-making algorithms, in both foundational theory and state-of-the-art deep reinforcement learning.

Decision Making • reinforcement-learning • +1
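
For context, the classical intrinsic reward in the linear setting is the elliptical confidence bonus, which bonuses like the paper's are designed to approximate at scale. A textbook-style sketch (not the paper's anti-concentrated construction):

```python
# Sketch: elliptical bonus sqrt(phi^T A^{-1} phi), where A accumulates
# outer products of visited feature vectors.
import numpy as np

class EllipticalBonus:
    def __init__(self, d, reg=1.0):
        self.A = reg * np.eye(d)             # regularized design matrix

    def bonus(self, phi):
        return float(np.sqrt(phi @ np.linalg.solve(self.A, phi)))

    def update(self, phi):
        self.A += np.outer(phi, phi)         # shrinks bonuses along visited directions

b = EllipticalBonus(d=4)
phi = np.array([1.0, 0.0, 0.0, 0.0])
print(b.bonus(phi)); b.update(phi); print(b.bonus(phi))  # bonus decreases
```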

Inductive Biases and Variable Creation in Self-Attention Mechanisms

no code implementations • 19 Oct 2021 • Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Cyril Zhang

Self-attention, an architectural motif designed to model long-range interactions in sequential data, has driven numerous recent breakthroughs in natural language processing and beyond.
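
For reference, the motif in question is scaled dot-product self-attention; a single-head sketch of the standard definition (nothing here is specific to this paper):

```python
# Sketch: single-head scaled dot-product self-attention over a sequence.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (T, d) sequence; Wq, Wk, Wv: (d, d_head) projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])          # (T, T) pairwise interactions
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)   # row-wise softmax
    return weights @ V                              # mix values across positions
```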

Statistical Estimation from Dependent Data

no code implementations • 20 Jul 2021 • Yuval Dagan, Constantinos Daskalakis, Nishanth Dikkala, Surbhi Goel, Anthimos Vardis Kandiros

We consider a general statistical estimation problem wherein binary labels across different observations are not independent conditioned on their feature vectors, but dependent. This captures settings where, e.g., the observations are collected on a spatial domain, a temporal domain, or a social network, each of which induces dependencies.

regression • text-classification • +1

Gone Fishing: Neural Active Learning with Fisher Embeddings

1 code implementation • NeurIPS 2021 • Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Sham Kakade

There is an increasing need for effective active learning algorithms that are compatible with deep neural networks.

Active Learning

Acceleration via Fractal Learning Rate Schedules

no code implementations • 1 Mar 2021 • Naman Agarwal, Surbhi Goel, Cyril Zhang

In practical applications of iterative first-order optimization, the learning rate schedule remains notoriously difficult to understand and expensive to tune.
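
As background for this entry: for quadratics with curvature in $[\mu, L]$, the classical accelerated schedule takes step sizes equal to reciprocals of Chebyshev nodes on that interval, and the ordering of those steps is precisely what makes the schedule numerically delicate. A sketch of the unordered steps, under my recollection of the standard construction:

```python
# Sketch: Chebyshev step sizes for a quadratic with eigenvalues in
# [mu, L]; the schedule's stability depends on how these are ordered.
import numpy as np

def chebyshev_steps(mu, L, T):
    t = np.arange(1, T + 1)
    nodes = (L + mu) / 2 + (L - mu) / 2 * np.cos((2 * t - 1) * np.pi / (2 * T))
    return 1.0 / nodes                       # reciprocals of Chebyshev nodes

print(chebyshev_steps(0.1, 1.0, 8))
```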

Tight Hardness Results for Training Depth-2 ReLU Networks

no code implementations • 27 Nov 2020 • Surbhi Goel, Adam Klivans, Pasin Manurangsi, Daniel Reichman

We are also able to obtain lower bounds on the running time in terms of the desired additive error $\epsilon$.

From Boltzmann Machines to Neural Networks and Back Again

no code implementations • NeurIPS 2020 • Surbhi Goel, Adam Klivans, Frederic Koehler

Graphical models are powerful tools for modeling high-dimensional data, but learning graphical models in the presence of latent variables is well-known to be difficult.

Statistical-Query Lower Bounds via Functional Gradients

no code implementations • NeurIPS 2020 • Surbhi Goel, Aravind Gollakota, Adam Klivans

We give the first statistical-query lower bounds for agnostically learning any non-polynomial activation with respect to Gaussian marginals (e.g., ReLU, sigmoid, sign).

Approximation Schemes for ReLU Regression

no code implementations • 26 May 2020 • Ilias Diakonikolas, Surbhi Goel, Sushrut Karmalkar, Adam R. Klivans, Mahdi Soltanolkotabi

We consider the fundamental problem of ReLU regression, where the goal is to output the best fitting ReLU with respect to square loss given access to draws from some unknown distribution.

regression
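
As a bare-bones illustration of the problem statement above, one can fit a single ReLU under square loss with plain gradient descent; this is an illustrative baseline only, not the paper's approximation scheme.

```python
# Sketch: fit ReLU(<w, x>) to noisy labels by gradient descent on the
# square loss (illustrative baseline, with placeholder sizes).
import numpy as np

rng = np.random.default_rng(0)
d, m = 10, 2000
X = rng.normal(size=(m, d))
w_true = rng.normal(size=d)
y = np.maximum(X @ w_true, 0.0) + 0.1 * rng.normal(size=m)

w = 0.01 * rng.normal(size=d)     # small random init (w = 0 is a stationary point)
for _ in range(500):
    pred = np.maximum(X @ w, 0.0)
    grad = X.T @ ((pred - y) * (X @ w > 0)) / m   # chain rule through the ReLU
    w -= 0.1 * grad

print("square loss:", np.mean((np.maximum(X @ w, 0.0) - y) ** 2))
```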

Efficiently Learning Adversarially Robust Halfspaces with Noise

no code implementations • ICML 2020 • Omar Montasser, Surbhi Goel, Ilias Diakonikolas, Nathan Srebro

We study the problem of learning adversarially robust halfspaces in the distribution-independent setting.

Learning Restricted Boltzmann Machines with Arbitrary External Fields

no code implementations • 15 Jun 2019 • Surbhi Goel

We study the problem of learning graphical models with latent variables.

Learning Mixtures of Graphs from Epidemic Cascades

no code implementations • ICML 2020 • Jessica Hoffmann, Soumya Basu, Surbhi Goel, Constantine Caramanis

When the conditions are met, i.e., when the graphs are connected with at least three edges, we give an efficient algorithm for learning the weights of both graphs with optimal sample complexity (up to log factors).

Quantifying Perceptual Distortion of Adversarial Examples

no code implementations • 21 Feb 2019 • Matt Jordan, Naren Manoj, Surbhi Goel, Alexandros G. Dimakis

To demonstrate the value of quantifying the perceptual distortion of adversarial examples, we present and employ a unifying framework fusing different attack styles.

SSIM

Learning Ising Models with Independent Failures

no code implementations • 13 Feb 2019 • Surbhi Goel, Daniel M. Kane, Adam R. Klivans

We give the first efficient algorithm for learning the structure of an Ising model that tolerates independent failures; that is, each entry of the observed sample is missing with some unknown probability $p$. Our algorithm matches the essentially optimal runtime and sample complexity bounds of recent work for learning Ising models due to Klivans and Meka (2017).
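
The corruption model in this entry is easy to state in code: each coordinate of each sample goes missing independently with probability $p$. A sketch of applying the failures (generating true Ising samples is elided; random spins stand in below):

```python
# Sketch: independent failures -- each entry of each observed sample is
# dropped independently with probability p.
import numpy as np

rng = np.random.default_rng(0)
p = 0.2
samples = rng.choice([-1, 1], size=(100, 10))   # stand-in spins; real data would
                                                # come from an Ising model
mask = rng.random(samples.shape) < p
observed = np.where(mask, 0, samples)           # 0 marks a missing entry
```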

Improved Learning of One-hidden-layer Convolutional Neural Networks with Overlaps

no code implementations • ICLR 2019 • Simon S. Du, Surbhi Goel

We propose a new algorithm to learn a one-hidden-layer convolutional neural network where both the convolutional weights and the output weights are parameters to be learned.

regression

Learning One Convolutional Layer with Overlapping Patches

no code implementations • ICML 2018 • Surbhi Goel, Adam Klivans, Raghu Meka

We give the first provably efficient algorithm for learning a one hidden layer convolutional network with respect to a general class of (potentially overlapping) patches.

Learning Neural Networks with Two Nonlinear Layers in Polynomial Time

no code implementations • 18 Sep 2017 • Surbhi Goel, Adam Klivans

We give a polynomial-time algorithm for learning neural networks with one layer of sigmoids feeding into any Lipschitz, monotone activation function (e.g., sigmoid or ReLU).

Learning Theory • PAC learning • +1

Eigenvalue Decay Implies Polynomial-Time Learnability for Neural Networks

no code implementations • NeurIPS 2017 • Surbhi Goel, Adam Klivans

In this work we show that a natural distributional assumption corresponding to eigenvalue decay of the Gram matrix yields polynomial-time algorithms in the non-realizable setting for expressive classes of networks (e.g., feed-forward networks of ReLUs).
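
The assumption in question concerns the spectrum of the kernel Gram matrix; a quick way to see such decay empirically (with an RBF kernel and Gaussian data as my stand-ins):

```python
# Sketch: eigenvalue decay of an RBF-kernel Gram matrix on Gaussian data.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
G = np.exp(-sq / 2.0)                                 # RBF Gram matrix

eigs = np.sort(np.linalg.eigvalsh(G))[::-1]
print(eigs[:10] / eigs[0])                            # how fast does the spectrum decay?
```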

Reliably Learning the ReLU in Polynomial Time

no code implementations • 30 Nov 2016 • Surbhi Goel, Varun Kanade, Adam Klivans, Justin Thaler

These results are in contrast to known efficient algorithms for reliably learning linear threshold functions, where $\epsilon$ must be $\Omega(1)$ and strong assumptions are required on the marginal distribution.
