no code implementations • 12 Nov 2024 • Samuel J. Bell, Mariano Coria Meglioli, Megan Richards, Eduardo Sánchez, Christophe Ropers, Skyler Wang, Adina Williams, Levent Sagun, Marta R. Costa-jussà
Text toxicity detection systems exhibit significant biases, producing disproportionate rates of false positives on samples mentioning demographic groups.
no code implementations • 6 Nov 2024 • Anaelia Ovalle, Krunoslav Lehman Pavasovic, Louis Martin, Luke Zettlemoyer, Eric Michael Smith, Adina Williams, Levent Sagun
Natural-language assistants are designed to provide users with helpful responses while avoiding harmful outputs, largely achieved through alignment to human preferences.
no code implementations • 7 Oct 2024 • Arjun Subramonian, Samuel J. Bell, Levent Sagun, Elvis Dohmatob
Machine learning models may capture and amplify biases present in data, leading to disparate test performance across social groups.
no code implementations • 6 Sep 2024 • Samuel J. Bell, Diane Bouchacourt, Levent Sagun
Neural networks can fail when the data contains spurious correlations.
1 code implementation • 29 Sep 2023 • Arjun Subramonian, Levent Sagun, Yizhou Sun
We further bridge GCN's preferential attachment bias with unfairness in link prediction and propose a new within-group fairness metric.
1 code implementation • 11 Jul 2023 • Arjun Subramonian, Adina Williams, Maximilian Nickel, Yizhou Sun, Levent Sagun
The expressive power of graph neural networks is usually measured by comparing how many pairs of graphs or nodes an architecture can possibly distinguish as non-isomorphic to those distinguishable by the $k$-dimensional Weisfeiler-Leman ($k$-WL) test.
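For readers unfamiliar with the baseline, here is a minimal sketch of the 1-dimensional Weisfeiler-Leman colour-refinement test that this expressivity comparison refers to; the toy graphs and the hashing of colour signatures are illustrative choices, not taken from the paper.

```python
from collections import Counter

def wl_colors(adj, num_iters=3):
    """1-dimensional Weisfeiler-Leman colour refinement.

    adj: dict mapping each node to a list of its neighbours.
    Returns the multiset of node colours after refinement; two graphs
    whose multisets differ are certainly non-isomorphic (the converse
    does not hold, which is exactly why higher-order k-WL tests exist).
    """
    colors = {v: 0 for v in adj}  # start from a uniform colouring
    for _ in range(num_iters):
        new_colors = {}
        for v in adj:
            # A node's new colour combines its old colour with the
            # sorted multiset of its neighbours' colours.
            signature = (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            new_colors[v] = hash(signature)
        colors = new_colors
    return Counter(colors.values())

# Two graphs are distinguished by 1-WL if their colour histograms differ.
g1 = {0: [1], 1: [0, 2], 2: [1]}        # path on 3 nodes
g2 = {0: [1, 2], 1: [0, 2], 2: [0, 1]}  # triangle
print(wl_colors(g1) != wl_colors(g2))    # True: 1-WL separates them
```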
no code implementations • 13 Dec 2022 • Samuel J. Bell, Levent Sagun
Finally, we present two real-world examples of difficulty amplification in action, resulting in worse-than-expected performance disparities between groups even when using a balanced dataset.
no code implementations • 20 Jul 2022 • David Lopez-Paz, Diane Bouchacourt, Levent Sagun, Nicolas Usunier
By highlighting connections to the literature in domain generalization, we propose to measure fairness as the ability of the system to generalize under multiple stress tests -- distributions of examples with social relevance.
no code implementations • 28 Mar 2022 • Berfin Simsek, Melissa Hall, Levent Sagun
Existing works show that although modern neural networks achieve remarkable generalization performance on the in-distribution (ID) dataset, the accuracy drops significantly on out-of-distribution (OOD) datasets (Recht et al., 2018; 2019).
1 code implementation • 16 Feb 2022 • Priya Goyal, Quentin Duval, Isaac Seessel, Mathilde Caron, Ishan Misra, Levent Sagun, Armand Joulin, Piotr Bojanowski
Discriminative self-supervised learning allows training models on any random group of internet images, potentially recovering salient information that helps differentiate between them.
Ranked #1 on Copy Detection on Copydays strong subset (using extra training data)
1 code implementation • 15 Feb 2022 • Priya Goyal, Adriana Romero Soriano, Caner Hazirbas, Levent Sagun, Nicolas Usunier
Systematic diagnosis of fairness, harms, and biases of computer vision systems is an important step towards building socially responsible systems.
no code implementations • 10 Jun 2021 • Stéphane d'Ascoli, Levent Sagun, Giulio Biroli, Ari Morcos
Finally, we experiment with initializing the T-CNN from a partially trained CNN, and find that it reaches better performance than the corresponding hybrid model trained from scratch, while reducing training time.
9 code implementations • 19 Mar 2021 • Stéphane d'Ascoli, Hugo Touvron, Matthew Leavitt, Ari Morcos, Giulio Biroli, Levent Sagun
We initialise the GPSA layers to mimic the locality of convolutional layers, then give each attention head the freedom to escape locality by adjusting a gating parameter regulating the attention paid to position versus content information.
Ranked #523 on Image Classification on ImageNet
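As a rough illustration of the GPSA gating described above (notation simplified and not quoted from the paper), each head $h$ mixes a content-based and a position-based attention map through a learned gate $\lambda_h$:

```latex
% Gated positional self-attention, schematic form:
% sigma is the sigmoid, r_{ij} a relative-position encoding,
% and lambda_h the per-head gating parameter.
A^{h}_{ij} \;=\; \bigl(1-\sigma(\lambda_h)\bigr)\,
    \operatorname{softmax}_j\!\bigl(Q^{h}_i K^{h\top}_j\bigr)
  \;+\; \sigma(\lambda_h)\,
    \operatorname{softmax}_j\!\bigl(v_{\mathrm{pos}}^{h\top} r_{ij}\bigr)
```

Initialising the gate so that the positional term dominates gives a convolution-like, local inductive bias; lowering $\sigma(\lambda_h)$ during training is what lets a head "escape locality" and attend to content instead.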
1 code implementation • NeurIPS 2021 • Stéphane d'Ascoli, Marylou Gabrié, Levent Sagun, Giulio Biroli
One of the central puzzles in modern machine learning is the ability of heavily overparametrized models to generalize well.
no code implementations • 25 Jun 2020 • Levent Sagun, Caglar Gulcehre, Adriana Romero, Negar Rostamzadeh, Stefano Sarao Mannelli
Science meets Engineering in Deep Learning took place in Vancouver as part of the Workshop section of NeurIPS 2019.
1 code implementation • NeurIPS 2020 • Stéphane d'Ascoli, Levent Sagun, Giulio Biroli
We show that this peak is implicitly regularized by the nonlinearity, which is why it only becomes salient at high noise and is weakly affected by explicit regularization.
no code implementations • 29 Nov 2019 • Umut Şimşekli, Mert Gürbüzbalaban, Thanh Huy Nguyen, Gaël Richard, Levent Sagun
This assumption is often made for mathematical convenience, since it enables SGD to be analyzed as a stochastic differential equation (SDE) driven by a Brownian motion.
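Schematically, and in standard notation rather than the paper's, the modelling step in question is:

```latex
% Minibatch SGD with step size eta and gradient noise xi_k:
\theta_{k+1} \;=\; \theta_k - \eta\,\widehat{\nabla f}(\theta_k)
            \;=\; \theta_k - \eta\,\nabla f(\theta_k) + \eta\,\xi_k,
\qquad \xi_k := \nabla f(\theta_k) - \widehat{\nabla f}(\theta_k).
% Assuming xi_k is Gaussian lets the iteration be modelled as a diffusion
% driven by Brownian motion B_t with some noise scale sigma:
d\theta_t \;=\; -\nabla f(\theta_t)\, dt \;+\; \sigma\, dB_t.
```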
1 code implementation • NeurIPS 2019 • Stéphane d'Ascoli, Levent Sagun, Joan Bruna, Giulio Biroli
The aim of this work is to understand this fact through the lens of dynamics in the loss landscape.
1 code implementation • 18 Jan 2019 • Umut Simsekli, Levent Sagun, Mert Gurbuzbalaban
This assumption is often made for mathematical convenience, since it enables SGD to be analyzed as a stochastic differential equation (SDE) driven by a Brownian motion.
1 code implementation • 6 Jan 2019 • Mario Geiger, Arthur Jacot, Stefano Spigler, Franck Gabriel, Levent Sagun, Stéphane d'Ascoli, Giulio Biroli, Clément Hongler, Matthieu Wyart
At this threshold, we argue that $\|f_{N}\|$ diverges.
no code implementations • 22 Oct 2018 • Stefano Spigler, Mario Geiger, Stéphane d'Ascoli, Levent Sagun, Giulio Biroli, Matthieu Wyart
We argue that in fully-connected networks a phase transition delimits the over- and under-parametrized regimes where fitting can or cannot be achieved.
2 code implementations • 25 Sep 2018 • Mario Geiger, Stefano Spigler, Stéphane d'Ascoli, Levent Sagun, Marco Baity-Jesi, Giulio Biroli, Matthieu Wyart
In the vicinity of this transition, properties of the curvature of the minima of the loss are critical.
no code implementations • ICML 2018 • Marco Baity-Jesi, Levent Sagun, Mario Geiger, Stefano Spigler, Gerard Ben Arous, Chiara Cammarota, Yann Lecun, Matthieu Wyart, Giulio Biroli
We analyze numerically the training dynamics of deep neural networks (DNNs) using methods developed in the statistical physics of glassy systems.
no code implementations • ICLR 2018 • Levent Sagun, Utku Evci, V. Ugur Guney, Yann Dauphin, Leon Bottou
In particular, we present a case that links the two observations: small- and large-batch gradient descent appear to converge to different basins of attraction, but we show that they are in fact connected through their flat region and so belong to the same basin.
3 code implementations • 18 Apr 2017 • Matthew Dunn, Levent Sagun, Mike Higgins, V. Ugur Guney, Volkan Cirik, Kyunghyun Cho
We publicly release a new large-scale dataset, called SearchQA, for machine comprehension, or question-answering.
no code implementations • 23 Mar 2017 • Andrew J. Ballard, Ritankar Das, Stefano Martiniani, Dhagash Mehta, Levent Sagun, Jacob D. Stevenson, David J. Wales
Machine learning techniques are being increasingly used as flexible non-linear fitting and prediction tools in the physical sciences.
no code implementations • 22 Nov 2016 • Levent Sagun, Leon Bottou, Yann Lecun
We look at the eigenvalues of the Hessian of a loss function before and after training.
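For illustration, here is a minimal PyTorch sketch of how the top of such a spectrum can be probed without forming the Hessian explicitly, using Hessian-vector products and power iteration; the linear model and random data are placeholders, not the paper's setup.

```python
import torch

def top_hessian_eigenvalue(loss, params, num_iters=50):
    """Estimate the largest (in magnitude) Hessian eigenvalue of `loss`
    w.r.t. `params` by power iteration on Hessian-vector products."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    # Start from a random unit-norm direction.
    v = [torch.randn_like(p) for p in params]
    vnorm = torch.sqrt(sum((vi ** 2).sum() for vi in v))
    v = [vi / vnorm for vi in v]
    eig = 0.0
    for _ in range(num_iters):
        # Hessian-vector product: differentiate <grad(loss), v> w.r.t. params.
        dot = sum((g * vi).sum() for g, vi in zip(grads, v))
        hv = torch.autograd.grad(dot, params, retain_graph=True)
        # Rayleigh quotient v^T H v (v has unit norm).
        eig = sum((h * vi).sum() for h, vi in zip(hv, v)).item()
        hnorm = torch.sqrt(sum((h ** 2).sum() for h in hv))
        v = [h / hnorm for h in hv]
    return eig

# Toy usage with a placeholder model and random data:
model = torch.nn.Linear(10, 1)
x, y = torch.randn(32, 10), torch.randn(32, 1)
loss = torch.nn.functional.mse_loss(model(x), y)
print(top_hessian_eigenvalue(loss, list(model.parameters())))
```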
2 code implementations • 6 Nov 2016 • Pratik Chaudhari, Anna Choromanska, Stefano Soatto, Yann Lecun, Carlo Baldassi, Christian Borgs, Jennifer Chayes, Levent Sagun, Riccardo Zecchina
This paper proposes a new optimization algorithm called Entropy-SGD for training deep neural networks that is motivated by the local geometry of the energy landscape.
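As a sketch of the underlying idea (written from the general description above, not quoted from the paper), Entropy-SGD optimises a smoothed "local entropy" objective rather than the raw loss $f$:

```latex
% "Local entropy" of the training loss f around parameters x,
% with a scope parameter gamma controlling the neighbourhood width:
F(x;\gamma) \;=\; \log \int \exp\!\Bigl(-f(x') \;-\; \tfrac{\gamma}{2}\,\lVert x - x'\rVert_2^{2}\Bigr)\, dx'.
% Entropy-SGD maximises F instead of minimising f directly, estimating its
% gradient with a short inner loop of stochastic gradient Langevin dynamics,
% which biases the search toward wide, flat valleys of the landscape.
```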
no code implementations • 19 Nov 2015 • Levent Sagun, Thomas Trogdon, Yann Lecun
Given an algorithm, which we take to be both the optimization routine and the form of the random landscape, the fluctuations of the halting time follow a distribution that, after centering and scaling, remains unchanged even when the distribution on the landscape is changed.
no code implementations • 20 Dec 2014 • Levent Sagun, V. Ugur Guney, Gerard Ben Arous, Yann Lecun
Finding minima of a real-valued non-convex function over a high-dimensional space is a major challenge in science.