Search Results for author: Benjamin L. Edelman

Found 9 papers, 3 papers with code

The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains

no code implementations • 16 Feb 2024 • Benjamin L. Edelman, Ezra Edelman, Surbhi Goel, Eran Malach, Nikolaos Tsilivis

We examine how learning is affected by varying the prior distribution over Markov chains, and consider the generalization of our in-context learning of Markov chains (ICL-MC) task to $n$-grams for $n > 2$.

In-Context Learning
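A minimal sketch of what one instance of the ICL-MC task above might look like: sample a transition matrix from a Dirichlet prior and emit a token sequence from it. The state count, concentration parameter, and sequence length below are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def sample_markov_chain(k, alpha, rng):
    """Draw a k-state transition matrix, each row from a Dirichlet prior."""
    return rng.dirichlet(alpha * np.ones(k), size=k)

def sample_sequence(P, T, rng):
    """Emit a length-T state sequence from transition matrix P."""
    k = P.shape[0]
    seq = [rng.integers(k)]
    for _ in range(T - 1):
        seq.append(rng.choice(k, p=P[seq[-1]]))
    return np.array(seq)

# Each example is a fresh chain; an in-context learner must infer the
# transition statistics from the prefix, e.g. via bigram counts.
rng = np.random.default_rng(0)
P = sample_markov_chain(k=3, alpha=1.0, rng=rng)
print(sample_sequence(P, T=32, rng=rng))
```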

Distinguishing the Knowable from the Unknowable with Language Models

1 code implementation • 5 Feb 2024 • Gustaf Ahdritz, Tian Qin, Nikhil Vyas, Boaz Barak, Benjamin L. Edelman

We study the feasibility of identifying epistemic uncertainty (reflecting a lack of knowledge), as opposed to aleatoric uncertainty (reflecting entropy in the underlying distribution), in the outputs of large language models (LLMs) over free-form text.
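One reason the question is subtle: the raw entropy of the next-token distribution conflates the two kinds of uncertainty. A minimal sketch with toy stand-in distributions (no actual LM is queried here):

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy (nats) of a next-token distribution."""
    p = np.asarray(p, dtype=float)
    return float(-np.sum(p * np.log(p + eps)))

# Toy four-token distributions. Entropy alone cannot separate them:
aleatoric = [0.25, 0.25, 0.25, 0.25]  # genuinely random continuation
epistemic = [0.25, 0.25, 0.25, 0.25]  # a unique answer the model doesn't know
print(entropy(aleatoric), entropy(epistemic))  # identical values
```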

Feature emergence via margin maximization: case studies in algebraic tasks

no code implementations • 13 Nov 2023 • Depen Morwani, Benjamin L. Edelman, Costin-Andrei Oncescu, Rosie Zhao, Sham Kakade

Understanding the internal representations learned by neural networks is a cornerstone challenge in the science of machine learning.

Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models

1 code implementation • 7 Nov 2023 • Hanlin Zhang, Benjamin L. Edelman, Danilo Francati, Daniele Venturi, Giuseppe Ateniese, Boaz Barak

To prove this result, we introduce a generic efficient watermark attack; the attacker is not required to know the private key of the scheme or even which scheme is used.
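The abstract does not spell the attack out here, but one way such a key-free attack can be sketched is as a quality-preserving random walk: repeatedly perturb the watermarked text and keep candidates that an independent quality oracle accepts. `perturb` and `quality_ok` below are hypothetical toy stand-ins for a paraphrasing model and a quality judge.

```python
import random

def random_walk_attack(text, perturb, quality_ok, steps=200, seed=0):
    """Key-free watermark-removal sketch: a random walk over texts that
    only moves to candidates an independent quality oracle accepts."""
    rng = random.Random(seed)
    current = text
    for _ in range(steps):
        candidate = perturb(current, rng)   # e.g. paraphrase one span
        if quality_ok(text, candidate):     # e.g. judged by another model
            current = candidate             # drift away from the watermark
    return current

# Hypothetical toy oracles: swap adjacent words; accept any candidate
# that preserves the original multiset of words.
def perturb(t, rng):
    words = t.split()
    if len(words) < 2:
        return t
    i = rng.randrange(len(words) - 1)
    words[i], words[i + 1] = words[i + 1], words[i]
    return " ".join(words)

def quality_ok(original, candidate):
    return sorted(original.split()) == sorted(candidate.split())

print(random_walk_attack("the quick brown fox jumps over the lazy dog",
                         perturb, quality_ok))
```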

Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck

no code implementations • 7 Sep 2023 • Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang

Finally, we show that the synthetic sparse parity task can be useful as a proxy for real problems requiring axis-aligned feature learning.

Tabular Classification
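A minimal sketch of the sparse parity task mentioned above: the label is the XOR of a small hidden subset of coordinates, so the learner must discover which axis-aligned features matter. Sizes below are illustrative.

```python
import numpy as np

def sparse_parity_batch(n, k, batch, rng):
    """(n, k)-sparse parity: the label is the XOR of k fixed hidden
    coordinates out of n uniformly random bits."""
    X = rng.integers(0, 2, size=(batch, n))
    y = X[:, :k].sum(axis=1) % 2   # WLOG the support is the first k bits
    return X, y

rng = np.random.default_rng(0)
X, y = sparse_parity_batch(n=50, k=3, batch=8, rng=rng)
print(X.shape, y)
```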

Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit

no code implementations • 18 Jul 2022 • Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang

There is mounting evidence of emergent phenomena in the capabilities of deep learning methods as we scale up datasets, model sizes, and training times.
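A minimal sketch, under assumed sizes and hyperparameters, of the kind of experiment this line of work runs: a small MLP trained with SGD on sparse parity, logging test accuracy, which typically hovers near chance for many steps before rising sharply.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n, k = 30, 3  # illustrative sizes, not the paper's configuration

def batch(b):
    X = torch.randint(0, 2, (b, n))
    y = (X[:, :k].sum(dim=1) % 2).long()  # parity of the first k bits
    return (2.0 * X - 1.0), y             # map bits to +/-1 inputs

model = nn.Sequential(nn.Linear(n, 128), nn.ReLU(), nn.Linear(128, 2))
opt = torch.optim.SGD(model.parameters(), lr=0.1)

for step in range(20001):
    X, y = batch(64)
    loss = nn.functional.cross_entropy(model(X), y)
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 2000 == 0:
        with torch.no_grad():
            Xt, yt = batch(4096)
            acc = (model(Xt).argmax(dim=1) == yt).float().mean().item()
        print(f"step {step:6d}  test acc {acc:.3f}")  # plateau, then a jump
```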

Inductive Biases and Variable Creation in Self-Attention Mechanisms

no code implementations • 19 Oct 2021 • Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Cyril Zhang

Self-attention, an architectural motif designed to model long-range interactions in sequential data, has driven numerous recent breakthroughs in natural language processing and beyond.
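For concreteness, a minimal single-head self-attention layer in NumPy; the dimensions and weight initialization are arbitrary illustrative choices. The score matrix couples every pair of positions, which is the long-range interaction the abstract refers to.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention: the (T, T) score matrix lets every
    position interact with every other in a single step."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

rng = np.random.default_rng(0)
T, d = 10, 8  # arbitrary sequence length and width
Wq, Wk, Wv = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))
X = rng.normal(size=(T, d))
print(self_attention(X, Wq, Wk, Wv).shape)  # (10, 8)
```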

SGD on Neural Networks Learns Functions of Increasing Complexity

1 code implementation • NeurIPS 2019 • Preetum Nakkiran, Gal Kaplun, Dimitris Kalimeris, Tristan Yang, Benjamin L. Edelman, Fred Zhang, Boaz Barak

We perform an experimental study of the dynamics of Stochastic Gradient Descent (SGD) in learning deep neural networks for several real and synthetic classification tasks.
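One way to operationalize "increasing complexity" is to measure how often the network's predictions agree with a simple linear classifier. A minimal sketch of that agreement metric on a toy task; the data, architecture, and training budget are assumptions for illustration, and a fuller version would log agreement at checkpoints throughout training rather than only at the end.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
d = 20  # toy input dimension

# Toy binary task whose labels are a noisy linear function of the inputs,
# so a linear classifier captures the "simple" component of the target.
w_true = torch.randn(d)
def make(n):
    X = torch.randn(n, d)
    y = ((X @ w_true + 0.5 * torch.randn(n)) > 0).long()
    return X, y

Xtr, ytr = make(2000)
Xte, yte = make(2000)

net = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 2))
lin = nn.Linear(d, 2)
for model in (net, lin):
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    for _ in range(500):
        loss = nn.functional.cross_entropy(model(Xtr), ytr)
        opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    agree = (net(Xte).argmax(dim=1) == lin(Xte).argmax(dim=1)).float().mean()
print(f"agreement with the linear model: {agree.item():.3f}")
```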
