Search Results for author: Boaz Barak

Found 18 papers, 8 papers with code

An Economic Solution to Copyright Challenges of Generative AI

no code implementations • 22 Apr 2024 • Jiachen T. Wang, Zhun Deng, Hiroaki Chiba-Okabe, Boaz Barak, Weijie J. Su

Generative artificial intelligence (AI) systems are trained on large data corpora to generate new pieces of text, images, videos, and other media.

Paper
Add Code

Distinguishing the Knowable from the Unknowable with Language Models

1 code implementation • 5 Feb 2024 • Gustaf Ahdritz, Tian Qin, Nikhil Vyas, Boaz Barak, Benjamin L. Edelman

We study the feasibility of identifying epistemic uncertainty (reflecting a lack of knowledge), as opposed to aleatoric uncertainty (reflecting entropy in the underlying distribution), in the outputs of large language models (LLMs) over free-form text.

Paper
Code

Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models

1 code implementation • 7 Nov 2023 • HANLIN ZHANG, Benjamin L. Edelman, Danilo Francati, Daniele Venturi, Giuseppe Ateniese, Boaz Barak

To prove this result, we introduce a generic efficient watermark attack; the attacker is not required to know the private key of the scheme or even which scheme is used.

Paper
Code

Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning

no code implementations • 14 Jun 2023 • Nikhil Vyas, Depen Morwani, Rosie Zhao, Gal Kaplun, Sham Kakade, Boaz Barak

The success of SGD in deep learning has been ascribed by prior works to the implicit bias induced by high learning rate or small batch size ("SGD noise").

Paper
Add Code

Scaling Data-Constrained Language Models

1 code implementation • NeurIPS 2023 • Niklas Muennighoff, Alexander M. Rush, Boaz Barak, Teven Le Scao, Aleksandra Piktus, Nouamane Tazi, Sampo Pyysalo, Thomas Wolf, Colin Raffel

We find that with constrained data for a fixed compute budget, training with up to 4 epochs of repeated data yields negligible changes to loss compared to having unique data.

298

Paper
Code

On Provable Copyright Protection for Generative Models

no code implementations • 21 Feb 2023 • Nikhil Vyas, Sham Kakade, Boaz Barak

There is a growing concern that learned conditional generative models may output samples that are substantially similar to some copyrighted data $C$ that was in their training set.

Paper
Add Code

Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit

no code implementations • 18 Jul 2022 • Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang

There is mounting evidence of emergent phenomena in the capabilities of deep learning methods as we scale up datasets, model sizes, and training times.

Paper
Add Code

Deconstructing Distributions: A Pointwise Framework of Learning

1 code implementation • 20 Feb 2022 • Gal Kaplun, Nikhil Ghosh, Saurabh Garg, Boaz Barak, Preetum Nakkiran

In this work, we propose a new approach: we measure the performance of a collection of models when evaluated on a $\textit{single input point}$.

Paper
Code

Revisiting Model Stitching to Compare Neural Representations

no code implementations • NeurIPS 2021 • Yamini Bansal, Preetum Nakkiran, Boaz Barak

We revisit and extend model stitching (Lenc & Vedaldi 2015) as a methodology to study the internal representations of neural networks.

Self-Supervised Learning

Paper
Add Code

Named Tensor Notation

1 code implementation • 25 Feb 2021 • David Chiang, Alexander M. Rush, Boaz Barak

We propose a notation for tensors with named axes, which relieves the author, reader, and future implementers of machine learning models from the burden of keeping track of the order of axes and the purpose of each.

108

Paper
Code

For self-supervised learning, Rationality implies generalization, provably

2 code implementations • ICLR 2021 • Yamini Bansal, Gal Kaplun, Boaz Barak

We prove a new upper bound on the generalization gap of classifiers that are obtained by first using self-supervision to learn a representation $r$ of the training data, and then fitting a simple (e. g., linear) classifier $g$ to the labels.

Representation Learning Self-Supervised Learning

Paper
Code

Deep Double Descent: Where Bigger Models and More Data Hurt

3 code implementations • ICLR 2020 • Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, Ilya Sutskever

We show that a variety of modern deep learning tasks exhibit a "double-descent" phenomenon where, as we increase model size, performance first gets worse and then gets better.

Paper
Code

SGD on Neural Networks Learns Functions of Increasing Complexity

1 code implementation • NeurIPS 2019 • Preetum Nakkiran, Gal Kaplun, Dimitris Kalimeris, Tristan Yang, Benjamin L. Edelman, Fred Zhang, Boaz Barak

We perform an experimental study of the dynamics of Stochastic Gradient Descent (SGD) in learning deep neural networks for several real and synthetic classification tasks.

Paper
Code

(Nearly) Efficient Algorithms for the Graph Matching Problem on Correlated Random Graphs

no code implementations • NeurIPS 2019 • Boaz Barak, Chi-Ning Chou, Zhixian Lei, Tselil Schramm, Yueqi Sheng

Specifically, for every $\gamma>0$, we give a $n^{O(\log n)}$ time algorithm that given a pair of $\gamma$-correlated $G(n, p)$ graphs $G_0, G_1$ with average degree between $n^{\varepsilon}$ and $n^{1/153}$ for $\varepsilon = o(1)$, recovers the "ground truth" permutation $\pi\in S_n$ that matches the vertices of $G_0$ to the vertices of $G_n$ in the way that minimizes the number of mismatched edges.

Graph Matching

Paper
Add Code

Noisy Tensor Completion via the Sum-of-Squares Hierarchy

no code implementations • 26 Jan 2015 • Boaz Barak, Ankur Moitra

This is also the first algorithm for tensor completion that works in the overcomplete case when $r > n$, and in fact it works all the way up to $r = n^{3/2-\epsilon}$.

Matrix Completion

Paper
Add Code

Dictionary Learning and Tensor Decomposition via the Sum-of-Squares Method

no code implementations • 6 Jul 2014 • Boaz Barak, Jonathan A. Kelner, David Steurer

We give a new approach to the dictionary learning (also known as "sparse coding") problem of recovering an unknown $n\times m$ matrix $A$ (for $m \geq n$) from examples of the form \[ y = Ax + e, \] where $x$ is a random vector in $\mathbb R^m$ with at most $\tau m$ nonzero coordinates, and $e$ is a random noise vector in $\mathbb R^n$ with bounded magnitude.

Dictionary Learning Tensor Decomposition

Paper
Add Code

Sum-of-squares proofs and the quest toward optimal algorithms

no code implementations • 21 Apr 2014 • Boaz Barak, David Steurer

Two recent developments, the Unique Games Conjecture (UGC) and the Sum-of-Squares (SOS) method, surprisingly suggest that this tailoring is not necessary and that a single efficient algorithm could achieve best possible guarantees for a wide range of different problems.

Paper
Add Code

Rounding Sum-of-Squares Relaxations

no code implementations • 23 Dec 2013 • Boaz Barak, Jonathan Kelner, David Steurer

Aside from being a natural relaxation, this is also motivated by a connection to the Small Set Expansion problem shown by Barak et al. (STOC 2012) and our results yield a certain improvement for that problem.

Open-Ended Question Answering

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.