1 code implementation • 15 Oct 2023 • Enric Boix-Adsera, Omid Saremi, Emmanuel Abbe, Samy Bengio, Etai Littwin, Joshua Susskind
We investigate the capabilities of transformer models on relational reasoning tasks.
no code implementations • NeurIPS 2023 • Enric Boix-Adsera, Etai Littwin, Emmanuel Abbe, Samy Bengio, Joshua Susskind
Our experiments support the theory and also show that the phenomenon can occur in practice without the simplifying assumptions.
no code implementations • 21 Feb 2023 • Emmanuel Abbe, Enric Boix-Adsera, Theodor Misiakiewicz
For $d$-dimensional uniform Boolean or isotropic Gaussian data, our main conjecture states that the time complexity to learn a function $f$ with low-dimensional support is $\tilde\Theta (d^{\max(\mathrm{Leap}(f), 2)})$.
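As a worked instance of the stated formula (the precise definition of $\mathrm{Leap}$ is given in the paper): for the degree-$k$ parity $f(x)=x_1x_2\cdots x_k$, the single monomial must be reached in one jump, so $\mathrm{Leap}(f)=k$ and the conjectured complexity is $\tilde\Theta(d^{\max(k,2)})$; for a staircase such as $f(x)=x_1+x_1x_2+x_1x_2x_3$, each new monomial adds a single fresh coordinate, so $\mathrm{Leap}(f)=1$ and the conjectured complexity is $\tilde\Theta(d^2)$.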
1 code implementation • 30 Jan 2023 • Emmanuel Abbe, Samy Bengio, Aryo Lotfi, Kevin Rizk
This paper considers the learning of logical (Boolean) functions with a focus on the generalization on the unseen (GOTU) setting, a strong case of out-of-distribution generalization.
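A minimal sketch of one GOTU-style setup, using a deliberately simplified linear model over monomial features rather than the networks studied in the paper (target, feature map, and hyperparameters are illustrative): the model is trained by gradient descent on Boolean inputs whose coordinate $x_0$ is frozen to $+1$, and then evaluated on the never-seen half of the cube where $x_0=-1$.

    import numpy as np
    from itertools import combinations

    rng = np.random.default_rng(0)
    d, n = 6, 2000

    def features(X):
        # monomials of degree <= 2 (illustrative feature map, not the paper's models)
        cols = [np.ones(len(X))]
        cols += [X[:, i] for i in range(d)]
        cols += [X[:, i] * X[:, j] for i, j in combinations(range(d), 2)]
        return np.stack(cols, axis=1)

    def target(X):
        return X[:, 0] * X[:, 1]                   # Boolean target f(x) = x_0 x_1

    # GOTU-style split: coordinate x_0 is frozen to +1 at training time ("seen"),
    # and the model is evaluated on the never-seen half of the cube where x_0 = -1.
    X_train = rng.choice([-1.0, 1.0], size=(n, d)); X_train[:, 0] = 1.0
    X_test  = rng.choice([-1.0, 1.0], size=(n, d)); X_test[:, 0]  = -1.0

    Phi_tr, Phi_te = features(X_train), features(X_test)
    y_tr, y_te = target(X_train), target(X_test)

    w = np.zeros(Phi_tr.shape[1])
    for _ in range(2000):                           # plain full-batch gradient descent
        grad = Phi_tr.T @ (Phi_tr @ w - y_tr) / n
        w -= 0.1 * grad

    print("seen   MSE:", np.mean((Phi_tr @ w - y_tr) ** 2))
    print("unseen MSE:", np.mean((Phi_te @ w - y_te) ** 2))

In this sketch, gradient descent from zero returns the minimum-norm interpolator, which splits its weight between the features $x_1$ and $x_0x_1$ (collinear on the seen data); the two contributions cancel on the unseen half, so the seen error is near zero while the unseen error is large, illustrating the kind of out-of-distribution behavior the GOTU setting probes.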
no code implementations • 5 Aug 2022 • Emmanuel Abbe, Enric Boix-Adsera
We prove limitations on what neural networks trained by noisy gradient descent (GD) can efficiently learn.
1 code implementation • 26 May 2022 • Emmanuel Abbe, Samy Bengio, Elisabetta Cornacchia, Jon Kleinberg, Aryo Lotfi, Maithra Raghu, Chiyuan Zhang
More generally, the paper considers the learning of logical functions with gradient descent (GD) on neural networks.
no code implementations • 25 Feb 2022 • Emmanuel Abbe, Elisabetta Cornacchia, Jan Hązła, Christopher Marquis
This paper introduces the notion of "Initial Alignment" (INAL) between a neural network at initialization and a target function.
no code implementations • 17 Feb 2022 • Emmanuel Abbe, Enric Boix-Adsera, Theodor Misiakiewicz
It is currently known how to characterize functions that neural networks can learn with SGD for two extremal parameterizations: neural networks in the linear regime, and neural networks with no structural constraints.
no code implementations • 4 Nov 2021 • Emmanuel Abbe, Shuangping Li, Allan Sly
It was recently shown that almost all solutions in the symmetric binary perceptron are isolated, even at low constraint densities, suggesting that finding typical solutions is hard.
no code implementations • NeurIPS 2021 • Emmanuel Abbe, Enric Boix-Adsera, Matthew Brennan, Guy Bresler, Dheeraj Nagaraj
This paper identifies a structural property of data distributions that enables deep neural networks to learn hierarchically.
no code implementations • NeurIPS 2021 • Emmanuel Abbe, Pritish Kamath, Eran Malach, Colin Sandon, Nathan Srebro
With fine enough precision relative to minibatch size, namely when $b \rho$ is small enough, SGD can go beyond SQ learning and simulate any sample-based learning algorithm and thus its learning power is equivalent to that of PAC learning; this extends prior work that achieved this result for $b=1$.
no code implementations • 1 Mar 2021 • Eran Malach, Pritish Kamath, Emmanuel Abbe, Nathan Srebro
Complementing this, we show that without these conditions, gradient descent can in fact learn with small error even when no kernel method, in particular using the tangent kernel, can achieve a non-trivial advantage over random guessing.
no code implementations • 25 Feb 2021 • Emmanuel Abbe, Shuangping Li, Allan Sly
We consider the symmetric binary perceptron model, a simple model of neural networks that has attracted significant attention in the statistical physics, information theory, and probability theory communities, with recent connections made to the performance of learning algorithms in Baldassi et al. '15.
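For reference, in the standard formulation of this model a solution is a vector $w\in\{-1,+1\}^n$ satisfying the two-sided constraints $|\langle g_a, w\rangle| \le \kappa\sqrt{n}$ for all $a=1,\dots,m$, where the $g_a$ are i.i.d. standard Gaussian vectors, $\kappa>0$ is a margin parameter, and $m=\alpha n$ sets the constraint density $\alpha$.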
no code implementations • 29 Jan 2021 • Emmanuel Abbe, Elisabetta Cornacchia, Yuzhou Gu, Yury Polyanskiy
The limit of the entropy in the stochastic block model (SBM) has been characterized in the sparse regime for the special case of disassortative communities [COKPZ17] and for the classical case of assortative communities but in the dense regime [DAM16].
no code implementations • NeurIPS 2020 • Emmanuel Abbe, Colin Sandon
This paper shows that deep learning, i.e., neural networks trained by SGD, can learn in polytime any function class that can be learned in polytime by some algorithm, including parities.
no code implementations • 25 Jun 2020 • Amir R. Asadi, Emmanuel Abbe
For different entropies and arbitrary scale transformations, it is shown that the distribution maximizing a multiscale entropy is characterized by a procedure analogous to the renormalization group procedure in statistical physics.
no code implementations • 24 Jun 2020 • Emmanuel Abbe, Jianqing Fan, Kaizheng Wang
Principal Component Analysis (PCA) is a powerful tool in statistics and machine learning.
no code implementations • 13 Jun 2020 • Emmanuel Abbe, Shuangping Li, Allan Sly
The problem of learning graphons has attracted considerable attention across several scientific communities, with significant progress in sparser regimes over recent years.
no code implementations • 7 Jan 2020 • Emmanuel Abbe, Colin Sandon
Therefore deep learning provides a universal learning paradigm: it was known that the approximation and estimation errors could be controlled with poly-size neural nets using ERM, which is NP-hard; this new result shows that the optimization error can also be controlled with SGD in poly-time.
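The decomposition behind this statement is the standard one: writing $R^*$ for the Bayes risk, $f^*_{\mathcal H}$ for the best predictor in the class $\mathcal H$ realized by the poly-size network, $\hat f_{\mathrm{ERM}}$ for the empirical risk minimizer, and $\hat f$ for the predictor actually returned by training, the excess risk splits as $R(\hat f)-R^* = [R(\hat f)-R(\hat f_{\mathrm{ERM}})] + [R(\hat f_{\mathrm{ERM}})-R(f^*_{\mathcal H})] + [R(f^*_{\mathcal H})-R^*]$, i.e., optimization error plus estimation error plus approximation error.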
1 code implementation • 26 Jun 2019 • Amir R. Asadi, Emmanuel Abbe
The bounds are obtained by introducing the notion of generated hierarchical coverings of neural nets and by using the technique of chaining mutual information introduced in Asadi et al. (NeurIPS'18).
no code implementations • 16 Dec 2018 • Emmanuel Abbe, Colin Sandon
As the success of deep learning extends to ever more domains, one would also like to envision its potential limits.
no code implementations • NeurIPS 2018 • Amir R. Asadi, Emmanuel Abbe, Sergio Verdú
Two important difficulties are (i) exploiting the dependencies between the hypotheses, and (ii) exploiting the dependence between the algorithm's input and output.
no code implementations • ICML 2018 • Min Ye, Emmanuel Abbe
This paper develops coding techniques to reduce the running time of distributed learning tasks.
no code implementations • NeurIPS 2017 • Emmanuel Abbe, Sanjeev Kulkarni, Eun Jee Lee
This paper develops upper and lower bounds on the influence measure in a network, more precisely, the expected number of nodes that a seed set can influence in the independent cascade model.
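For concreteness, the influence measure being bounded is the expectation of a simple random process; a generic Monte Carlo estimator of it (not the paper's bounding technique, and with a purely illustrative toy graph) looks as follows.

    import random

    def simulate_cascade(graph, seeds, rng):
        # One run of the independent cascade model: each newly activated node u
        # gets a single chance to activate each inactive out-neighbor v,
        # succeeding independently with the edge's activation probability.
        # graph: dict mapping node -> list of (neighbor, activation_probability).
        active = set(seeds)
        frontier = list(seeds)
        while frontier:
            next_frontier = []
            for u in frontier:
                for v, p in graph.get(u, []):
                    if v not in active and rng.random() < p:
                        active.add(v)
                        next_frontier.append(v)
            frontier = next_frontier
        return len(active)

    def estimate_influence(graph, seeds, runs=100000, seed=0):
        # Monte Carlo estimate of the expected number of nodes influenced by `seeds`.
        rng = random.Random(seed)
        return sum(simulate_cascade(graph, seeds, rng) for _ in range(runs)) / runs

    # Toy example: a directed path 0 -> 1 -> 2 with activation probability 0.5 per edge,
    # so the exact influence of seed set {0} is 1 + 0.5 + 0.25 = 1.75.
    graph = {0: [(1, 0.5)], 1: [(2, 0.5)]}
    print(estimate_influence(graph, {0}))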
no code implementations • 29 Mar 2017 • Emmanuel Abbe
This monograph surveys the recent developments that establish the fundamental limits for community detection in the SBM, both with respect to information-theoretic and computational tradeoffs, and for various recovery requirements such as exact, partial and weak recovery.
no code implementations • NeurIPS 2016 • Emmanuel Abbe, Colin Sandon
The stochastic block model (SBM) has long been studied in machine learning and network science as a canonical model for clustering and community detection.
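For readers outside the area, a minimal sampler for the symmetric SBM these entries refer to (parameter names and values are illustrative): $n$ nodes are split into $k$ hidden communities, and two nodes are linked with probability $a/n$ if they share a community and $b/n$ otherwise.

    import numpy as np

    def sample_sbm(n, a, b, k=2, seed=0):
        # Symmetric SBM: n nodes, k (roughly balanced) hidden communities,
        # within-community edge probability a/n, across-community probability b/n.
        rng = np.random.default_rng(seed)
        labels = rng.integers(k, size=n)               # hidden community assignments
        same = labels[:, None] == labels[None, :]
        probs = np.where(same, a / n, b / n)
        coin = rng.random((n, n)) < probs
        adj = np.triu(coin, 1)                         # keep each pair once, no self-loops
        adj = adj | adj.T                              # symmetrize
        return adj.astype(int), labels

    adj, labels = sample_sbm(n=1000, a=8.0, b=2.0)     # illustrative parameters
    print(adj.sum() // 2, "edges; community sizes:", np.bincount(labels))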
no code implementations • 30 Dec 2015 • Emmanuel Abbe, Colin Sandon
In a paper that initiated the modern study of the stochastic block model, Decelle et al., backed by Mossel et al., made the following conjecture: Denote by $k$ the number of balanced communities, $a/n$ the probability of connecting inside communities and $b/n$ across, and set $\mathrm{SNR}=(a-b)^2/(k(a+(k-1)b))$; for any $k \geq 2$, it is possible to detect communities efficiently whenever $\mathrm{SNR}>1$ (the KS threshold), whereas for $k\geq 4$, it is possible to detect communities information-theoretically for some $\mathrm{SNR}<1$.
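As a concrete instance: for $k=2$ balanced communities the quantity reads $\mathrm{SNR}=(a-b)^2/(2(a+b))$, so the KS threshold $\mathrm{SNR}>1$ becomes $(a-b)^2>2(a+b)$, the known detectability threshold in the two-community case.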
no code implementations • NeurIPS 2015 • Emmanuel Abbe, Colin Sandon
Most recent developments on the stochastic block model (SBM) rely on the knowledge of the model parameters, or at least on the number of communities.