Search Results for author: Boaz Barak

Found 17 papers, 8 papers with code

Scaling Data-Constrained Language Models

1 code implementation NeurIPS 2023 Niklas Muennighoff, Alexander M. Rush, Boaz Barak, Teven Le Scao, Aleksandra Piktus, Nouamane Tazi, Sampo Pyysalo, Thomas Wolf, Colin Raffel

We find that with constrained data for a fixed compute budget, training with up to 4 epochs of repeated data yields negligible changes to loss compared to having unique data.

Named Tensor Notation

1 code implementation 25 Feb 2021 David Chiang, Alexander M. Rush, Boaz Barak

We propose a notation for tensors with named axes, which relieves the author, reader, and future implementers of machine learning models from the burden of keeping track of the order of axes and the purpose of each.

Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models

1 code implementation 7 Nov 2023 Hanlin Zhang, Benjamin L. Edelman, Danilo Francati, Daniele Venturi, Giuseppe Ateniese, Boaz Barak

To prove this result, we introduce a generic efficient watermark attack; the attacker is not required to know the private key of the scheme or even which scheme is used.

Deconstructing Distributions: A Pointwise Framework of Learning

1 code implementation 20 Feb 2022 Gal Kaplun, Nikhil Ghosh, Saurabh Garg, Boaz Barak, Preetum Nakkiran

In this work, we propose a new approach: we measure the performance of a collection of models when evaluated on a $\textit{single input point}$.

Deep Double Descent: Where Bigger Models and More Data Hurt

3 code implementations ICLR 2020 Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, Ilya Sutskever

We show that a variety of modern deep learning tasks exhibit a "double-descent" phenomenon where, as we increase model size, performance first gets worse and then gets better.

For self-supervised learning, Rationality implies generalization, provably

2 code implementations ICLR 2021 Yamini Bansal, Gal Kaplun, Boaz Barak

We prove a new upper bound on the generalization gap of classifiers that are obtained by first using self-supervision to learn a representation $r$ of the training data, and then fitting a simple (e.g., linear) classifier $g$ to the labels.

Representation Learning, Self-Supervised Learning

Distinguishing the Knowable from the Unknowable with Language Models

1 code implementation 5 Feb 2024 Gustaf Ahdritz, Tian Qin, Nikhil Vyas, Boaz Barak, Benjamin L. Edelman

We study the feasibility of identifying epistemic uncertainty (reflecting a lack of knowledge), as opposed to aleatoric uncertainty (reflecting entropy in the underlying distribution), in the outputs of large language models (LLMs) over free-form text.

SGD on Neural Networks Learns Functions of Increasing Complexity

1 code implementation NeurIPS 2019 Preetum Nakkiran, Gal Kaplun, Dimitris Kalimeris, Tristan Yang, Benjamin L. Edelman, Fred Zhang, Boaz Barak

We perform an experimental study of the dynamics of Stochastic Gradient Descent (SGD) in learning deep neural networks for several real and synthetic classification tasks.

(Nearly) Efficient Algorithms for the Graph Matching Problem on Correlated Random Graphs

no code implementations NeurIPS 2019 Boaz Barak, Chi-Ning Chou, Zhixian Lei, Tselil Schramm, Yueqi Sheng

Specifically, for every $\gamma>0$, we give an $n^{O(\log n)}$ time algorithm that, given a pair of $\gamma$-correlated $G(n, p)$ graphs $G_0, G_1$ with average degree between $n^{\varepsilon}$ and $n^{1/153}$ for $\varepsilon = o(1)$, recovers the "ground truth" permutation $\pi\in S_n$ that matches the vertices of $G_0$ to the vertices of $G_1$ in the way that minimizes the number of mismatched edges.

Graph Matching
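
The problem setup in the snippet above can be illustrated with a small sketch. It is not the paper's algorithm; it only generates a toy instance and evaluates the objective. The sampling model is an assumption (a base graph drawn from $G(n, p/\gamma)$, two independent subsamples keeping each edge with probability $\gamma$, and a hidden random relabeling of the second copy), and the names `correlated_graphs` and `mismatched_edges` are hypothetical.

```python
import itertools
import random


def correlated_graphs(n, p, gamma, rng):
    """Sample a toy pair of correlated graphs plus the hidden permutation.

    Assumed model (not taken from the listing): base graph from G(n, p/gamma),
    each copy keeps every base edge independently with probability gamma,
    and the second copy is relabeled by a uniformly random permutation pi.
    Requires p / gamma <= 1.
    """
    base = {
        frozenset(pair)
        for pair in itertools.combinations(range(n), 2)
        if rng.random() < p / gamma
    }
    g0 = {e for e in base if rng.random() < gamma}
    h = {e for e in base if rng.random() < gamma}
    pi = list(range(n))
    rng.shuffle(pi)
    # Relabel the second subsample: vertex v of the base graph becomes pi[v].
    g1 = {frozenset(pi[v] for v in e) for e in h}
    return g0, g1, pi


def mismatched_edges(g0, g1, pi):
    """Count edges present in exactly one of pi(G_0) and G_1."""
    mapped = {frozenset(pi[v] for v in e) for e in g0}
    return len(mapped ^ g1)


rng = random.Random(0)
g0, g1, pi = correlated_graphs(n=30, p=0.2, gamma=0.8, rng=rng)
identity = list(range(30))
print("mismatches under hidden permutation:", mismatched_edges(g0, g1, pi))
print("mismatches under identity:", mismatched_edges(g0, g1, identity))
```

The recovery task is to find a permutation with few mismatches given only `g0` and `g1`. Here the hidden `pi` is known only because the sketch generated it.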

Noisy Tensor Completion via the Sum-of-Squares Hierarchy

no code implementations 26 Jan 2015 Boaz Barak, Ankur Moitra

This is also the first algorithm for tensor completion that works in the overcomplete case when $r > n$, and in fact it works all the way up to $r = n^{3/2-\epsilon}$.

Matrix Completion

Dictionary Learning and Tensor Decomposition via the Sum-of-Squares Method

no code implementations 6 Jul 2014 Boaz Barak, Jonathan A. Kelner, David Steurer

We give a new approach to the dictionary learning (also known as "sparse coding") problem of recovering an unknown $n\times m$ matrix $A$ (for $m \geq n$) from examples of the form \[ y = Ax + e, \] where $x$ is a random vector in $\mathbb R^m$ with at most $\tau m$ nonzero coordinates, and $e$ is a random noise vector in $\mathbb R^n$ with bounded magnitude.

Dictionary Learning, Tensor Decomposition
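
As a rough illustration of the observation model $y = Ax + e$ quoted above, the following sketch draws a single example. It shows only the data-generating side, not the sum-of-squares recovery method. The choices of $\pm 1$ values on a random support and of uniform noise are assumptions made for illustration, and `sample_dictionary_example` is a hypothetical helper name.

```python
import random


def sample_dictionary_example(A, tau, noise_bound, rng):
    """Draw one example y = A x + e for an n x m dictionary A (list of rows).

    Assumptions for illustration only: x takes values +/-1 on a random
    support of size floor(tau * m), and each noise coordinate is uniform
    in [-noise_bound, noise_bound].
    """
    n, m = len(A), len(A[0])
    k = int(tau * m)  # at most tau * m nonzero coordinates
    x = [0.0] * m
    for j in rng.sample(range(m), k):
        x[j] = rng.choice([-1.0, 1.0])
    e = [rng.uniform(-noise_bound, noise_bound) for _ in range(n)]
    y = [sum(A[i][j] * x[j] for j in range(m)) + e[i] for i in range(n)]
    return x, e, y


rng = random.Random(0)
n, m = 4, 8  # overcomplete dictionary: m >= n
A = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(n)]
x, e, y = sample_dictionary_example(A, tau=0.25, noise_bound=0.01, rng=rng)
print("nonzeros in x:", sum(1 for v in x if v != 0.0))
print("y:", y)
```

The learner sees only many such vectors `y` and must recover `A`. The sketch keeps `x` and `e` so that a recovered dictionary could be checked against them.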

Sum-of-squares proofs and the quest toward optimal algorithms

no code implementations 21 Apr 2014 Boaz Barak, David Steurer

Two recent developments, the Unique Games Conjecture (UGC) and the Sum-of-Squares (SOS) method, surprisingly suggest that this tailoring is not necessary and that a single efficient algorithm could achieve best possible guarantees for a wide range of different problems.

Rounding Sum-of-Squares Relaxations

no code implementations 23 Dec 2013 Boaz Barak, Jonathan Kelner, David Steurer

Aside from being a natural relaxation, this is also motivated by a connection to the Small Set Expansion problem shown by Barak et al. (STOC 2012) and our results yield a certain improvement for that problem.

Open-Ended Question Answering

Revisiting Model Stitching to Compare Neural Representations

no code implementations NeurIPS 2021 Yamini Bansal, Preetum Nakkiran, Boaz Barak

We revisit and extend model stitching (Lenc & Vedaldi 2015) as a methodology to study the internal representations of neural networks.

Self-Supervised Learning

Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit

no code implementations 18 Jul 2022 Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang

There is mounting evidence of emergent phenomena in the capabilities of deep learning methods as we scale up datasets, model sizes, and training times.

On Provable Copyright Protection for Generative Models

no code implementations 21 Feb 2023 Nikhil Vyas, Sham Kakade, Boaz Barak

There is a growing concern that learned conditional generative models may output samples that are substantially similar to some copyrighted data $C$ that was in their training set.

Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning

no code implementations 14 Jun 2023 Nikhil Vyas, Depen Morwani, Rosie Zhao, Gal Kaplun, Sham Kakade, Boaz Barak

The success of SGD in deep learning has been ascribed by prior works to the implicit bias induced by high learning rate or small batch size ("SGD noise").
