1 code implementation • 14 Mar 2024 • Enric Boix-Adsera
Distillation is the task of replacing a complicated machine learning model with a simpler model that approximates the original [BCNM06, HVD15].
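As background on what that recipe typically looks like in practice, the classic soft-label approach of [HVD15] trains the student to match the teacher's temperature-softened output distribution. The following is a minimal sketch of that loss (an illustration, not this paper's method), assuming PyTorch logits from hypothetical teacher and student models:

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
        # KL divergence to the teacher's temperature-softened distribution,
        # rescaled by T^2 to keep gradients comparable across temperatures [HVD15]
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)
        # standard cross-entropy on the ground-truth hard labels
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1 - alpha) * hard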
1 code implementation • 13 Nov 2023 • Rimon Melamed, Lucas H. McCabe, Tanay Wakhare, Yejin Kim, H. Howie Huang, Enric Boix-Adsera
We discover that many natural-language prompts can be replaced by corresponding prompts that are unintelligible to humans but that provably elicit similar behavior in language models.
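One generic way to make such a search concrete (a sketch under assumptions, not necessarily the paper's optimizer) is greedy discrete optimization over tokens: propose random swaps and keep those that reduce a divergence between the model's behavior under the candidate prompt and under the original natural-language prompt. Here behavior_divergence is a hypothetical scoring oracle, e.g. a KL divergence between the model's next-token distributions under the two prompts:

    import random

    def optimize_prompt(vocab, prompt_len, behavior_divergence, iters=500, seed=0):
        rng = random.Random(seed)
        # start from a random (typically unintelligible) token sequence
        prompt = [rng.choice(vocab) for _ in range(prompt_len)]
        best = behavior_divergence(prompt)
        for _ in range(iters):
            cand = prompt[:]
            cand[rng.randrange(prompt_len)] = rng.choice(vocab)  # single-token swap
            score = behavior_divergence(cand)
            if score < best:  # keep only improving swaps
                prompt, best = cand, score
        return prompt, best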
1 code implementation • 15 Oct 2023 • Enric Boix-Adsera, Omid Saremi, Emmanuel Abbe, Samy Bengio, Etai Littwin, Joshua Susskind
We investigate the capabilities of transformer models on relational reasoning tasks.
no code implementations • NeurIPS 2023 • Enric Boix-Adsera, Etai Littwin, Emmanuel Abbe, Samy Bengio, Joshua Susskind
Our experiments support the theory and also show that the phenomenon can occur in practice without the simplifying assumptions.
no code implementations • 22 May 2023 • Enric Boix-Adsera, Etai Littwin
We study when the neural tangent kernel (NTK) approximation is valid for training a model with the square loss.
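For reference, the NTK approximation linearizes the network around its initialization $\theta_0$, so that training with the square loss reduces to kernel regression with the tangent kernel (standard background, not a result of this paper): $f(x;\theta) \approx f(x;\theta_0) + \nabla_\theta f(x;\theta_0)^\top(\theta - \theta_0)$, with kernel $K_{\mathrm{NTK}}(x,x') = \langle \nabla_\theta f(x;\theta_0), \nabla_\theta f(x';\theta_0)\rangle$.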
no code implementations • 21 Feb 2023 • Emmanuel Abbe, Enric Boix-Adsera, Theodor Misiakiewicz
For $d$-dimensional uniform Boolean or isotropic Gaussian data, our main conjecture states that the time complexity to learn a function $f$ with low-dimensional support is $\tilde\Theta (d^{\max(\mathrm{Leap}(f), 2)})$.
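As an illustration of the conjectured scaling (reading the leap as the largest degree jump needed to reach the function's monomials one at a time): the single monomial $f(x)=x_1x_2x_3$ has $\mathrm{Leap}(f)=3$ and time $\tilde\Theta(d^3)$, while the staircase $f(x)=x_1+x_1x_2+x_1x_2x_3$ can be learned one coordinate at a time, has $\mathrm{Leap}(f)=1$, and falls into the $\tilde\Theta(d^2)$ regime.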
1 code implementation • 12 Oct 2022 • Enric Boix-Adsera, Hannah Lawrence, George Stepaniants, Philippe Rigollet
Comparing the representations learned by different neural networks has recently emerged as a key tool for understanding various architectures and, ultimately, optimizing them.
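For a sense of what such a comparison looks like in code, the snippet below computes linear CKA, one widely used representation-similarity baseline; it is shown for orientation only and is not the metric introduced in this paper:

    import numpy as np

    def linear_cka(X, Y):
        # X: (n, p) and Y: (n, q) representations of the same n inputs
        X = X - X.mean(axis=0)  # center each feature dimension
        Y = Y - Y.mean(axis=0)
        hsic = np.linalg.norm(X.T @ Y, "fro") ** 2
        return hsic / (np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro"))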
no code implementations • 5 Aug 2022 • Emmanuel Abbe, Enric Boix-Adsera
We prove limitations on what neural networks trained by noisy gradient descent (GD) can efficiently learn.
no code implementations • 17 Feb 2022 • Emmanuel Abbe, Enric Boix-Adsera, Theodor Misiakiewicz
It is currently known how to characterize functions that neural networks can learn with SGD for two extremal parameterizations: neural networks in the linear regime, and neural networks with no structural constraints.
no code implementations • NeurIPS 2021 • Emmanuel Abbe, Enric Boix-Adsera, Matthew Brennan, Guy Bresler, Dheeraj Nagaraj
This paper identifies a structural property of data distributions that enables deep neural networks to learn hierarchically.
no code implementations • 7 Jun 2021 • Enric Boix-Adsera, Guy Bresler, Frederic Koehler
In this paper, we introduce a new algorithm that carefully combines elements of the Chow-Liu algorithm with tree metric reconstruction methods to efficiently and optimally learn tree Ising models under a prediction-centric loss.
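For context, the Chow-Liu ingredient referenced here is classical: estimate pairwise mutual information from samples, then take a maximum-weight spanning tree. The sketch below shows only that step (assuming discrete data and the networkx library), not the paper's combined prediction-centric algorithm:

    import numpy as np
    import networkx as nx

    def empirical_mi(x, y):
        # plug-in estimate of the mutual information between two discrete columns
        mi = 0.0
        for a in np.unique(x):
            for b in np.unique(y):
                pxy = np.mean((x == a) & (y == b))
                px, py = np.mean(x == a), np.mean(y == b)
                if pxy > 0:
                    mi += pxy * np.log(pxy / (px * py))
        return mi

    def chow_liu_tree(samples):
        # maximum-weight spanning tree on empirical pairwise mutual information
        _, d = samples.shape
        G = nx.Graph()
        for i in range(d):
            for j in range(i + 1, d):
                G.add_edge(i, j, weight=empirical_mi(samples[:, i], samples[:, j]))
        return nx.maximum_spanning_tree(G)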
no code implementations • 4 Jan 2021 • Jason M. Altschuler, Enric Boix-Adsera
Moreover, our hardness results for computing Wasserstein barycenters extend to approximate computation, to seemingly simple cases of the problem, and to averaging probability distributions in other Optimal Transport metrics.
no code implementations • 10 Dec 2020 • Jason M. Altschuler, Enric Boix-Adsera
We demonstrate this toolkit by using it to establish the intractability of a number of MOT problems studied in the literature that have resisted previous algorithmic efforts.
1 code implementation • 7 Aug 2020 • Jason M. Altschuler, Enric Boix-Adsera
We illustrate this ease-of-use by developing poly(n, k) time algorithms for three general classes of MOT cost structures: (1) graphical structure; (2) set-optimization structure; and (3) low-rank plus sparse structure.
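For context on why poly(n, k) is the benchmark: MOT is the linear program $\min_P \sum_{j_1, \dots, j_k} C_{j_1 \dots j_k} P_{j_1 \dots j_k}$ over joint distributions $P$ with $k$ prescribed marginals on $n$ atoms each, so the variable $P$ has $n^k$ entries and a poly(n, k) algorithm must exploit structure in the cost $C$ rather than write the LP down explicitly.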
no code implementations • 14 Jun 2020 • Jason M. Altschuler, Enric Boix-Adsera
Computing Wasserstein barycenters is a fundamental geometric problem with widespread applications in machine learning, statistics, and computer graphics.
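As a concrete point of reference (a standard method shown for illustration, not this paper's algorithm), fixed-support entropic barycenters can be computed by iterative Bregman projections in the style of Benamou et al. (2015). The NumPy sketch below assumes k input distributions supported on the same n points; smaller reg gives sharper barycenters at the cost of numerical stability:

    import numpy as np

    def entropic_barycenter(A, M, reg=0.05, weights=None, iters=200):
        # A: (n, k) columns are probability vectors on a shared n-point support
        # M: (n, n) ground-cost matrix between support points
        n, k = A.shape
        w = np.full(k, 1.0 / k) if weights is None else weights
        K = np.exp(-M / reg)       # Gibbs kernel
        U = np.ones((n, k))
        for _ in range(iters):
            V = A / (K.T @ U)      # scale plans to match each input marginal
            KV = K @ V
            # barycenter update: weighted geometric mean of current row marginals
            b = np.exp((np.log(U * KV) * w).sum(axis=1))
            U = b[:, None] / KV    # rescale plans to share the barycenter marginal
        return b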