Search Results for author: Levent Sagun

Found 26 papers, 13 papers with code

Explorations on high dimensional landscapes

no code implementations · 20 Dec 2014 · Levent Sagun, V. Ugur Guney, Gérard Ben Arous, Yann LeCun

Finding minima of a real-valued non-convex function over a high-dimensional space is a major challenge in science.
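As a toy illustration of why this is hard, a hypothetical separable double-well landscape already has exponentially many minima, and plain gradient descent lands in a different one depending on where it starts (my own NumPy sketch, not the paper's setting):

```python
import numpy as np

def f(x):
    # Separable double-well: each coordinate has two minima, at +1 and -1,
    # so the landscape has 2**d minima in d dimensions.
    return np.sum((x ** 2 - 1.0) ** 2)

def grad_f(x):
    return 4.0 * x * (x ** 2 - 1.0)

def gradient_descent(x0, lr=0.01, steps=2000):
    x = x0.copy()
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

rng = np.random.default_rng(0)
d = 10
# Different random starts typically converge to different sign patterns,
# i.e. different minima of the 2**d available ones.
xa = gradient_descent(rng.normal(size=d))
xb = gradient_descent(rng.normal(size=d))
```

In high dimension the number of such basins explodes, which is the regime the paper studies.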


Universal halting times in optimization and machine learning

no code implementations · 19 Nov 2015 · Levent Sagun, Thomas Trogdon, Yann LeCun

Given an algorithm, which we take to be both the optimization routine and the form of the random landscape, the fluctuations of the halting time follow a distribution that, after centering and scaling, remains unchanged even when the distribution on the landscape is changed.
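A minimal numerical experiment in the spirit of this claim (my own sketch, not the authors' setup): run gradient descent to a fixed gradient tolerance on random least-squares landscapes, record the halting times, and center and scale their fluctuations:

```python
import numpy as np

def halting_time(A, b, tol=1e-3, lr=None, max_iter=10000):
    # Gradient descent on f(x) = 0.5 * ||Ax - b||^2; halt when the
    # gradient norm drops below `tol` and return the iteration count.
    n = A.shape[1]
    if lr is None:
        lr = 1.0 / np.linalg.norm(A, 2) ** 2  # 1/L for this quadratic
    x = np.zeros(n)
    for t in range(1, max_iter + 1):
        g = A.T @ (A @ x - b)
        if np.linalg.norm(g) < tol:
            return t
        x -= lr * g
    return max_iter

rng = np.random.default_rng(1)
m, n, trials = 60, 40, 100
times = np.array([halting_time(rng.normal(size=(m, n)) / np.sqrt(m),
                               rng.normal(size=m))
                  for _ in range(trials)])
# Center and scale: the paper's claim is that this normalized fluctuation
# distribution is insensitive to the distribution of the landscape's entries.
fluct = (times - times.mean()) / times.std()
```

Repeating the experiment with, say, Bernoulli instead of Gaussian entries and comparing the two `fluct` histograms is the kind of universality check the paper makes precise.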


Entropy-SGD: Biasing Gradient Descent Into Wide Valleys

2 code implementations · 6 Nov 2016 · Pratik Chaudhari, Anna Choromanska, Stefano Soatto, Yann LeCun, Carlo Baldassi, Christian Borgs, Jennifer Chayes, Levent Sagun, Riccardo Zecchina

This paper proposes a new optimization algorithm called Entropy-SGD for training deep neural networks that is motivated by the local geometry of the energy landscape.
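A heavily simplified, full-batch sketch of the algorithm's structure as described in the paper (the hyperparameters and the 0.75 moving-average weight here are illustrative assumptions, not the paper's values):

```python
import numpy as np

def entropy_sgd_step(x, grad_f, lr=0.1, gamma=1.0, sgld_steps=20,
                     eta=0.1, eps=1e-3, rng=None):
    """One simplified Entropy-SGD outer step.

    The inner SGLD loop samples x' from the local Gibbs measure
    exp(-f(x') - gamma/2 ||x - x'||^2) and tracks a running mean mu;
    the outer update moves x toward mu, biasing descent toward wide,
    high-local-entropy valleys rather than sharp minima.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    xp, mu = x.copy(), x.copy()
    for _ in range(sgld_steps):
        g = grad_f(xp) + gamma * (xp - x)
        xp = xp - eta * g + np.sqrt(eta) * eps * rng.normal(size=x.shape)
        mu = 0.75 * mu + 0.25 * xp  # exponential moving average of samples
    return x - lr * gamma * (x - mu)

# Sanity check on a quadratic f(x) = 0.5 * ||x||^2, grad_f(x) = x.
x = np.array([2.0])
for _ in range(200):
    x = entropy_sgd_step(x, lambda z: z)
```

On this convex toy the method simply converges; the paper's point is that on non-convex deep-network losses the local-entropy gradient steers toward wide valleys.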

Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond

no code implementations · 22 Nov 2016 · Levent Sagun, Léon Bottou, Yann LeCun

We look at the eigenvalues of the Hessian of a loss function before and after training.
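The singular, degenerate spectra the paper reports can be previewed exactly in an over-parametrized linear model, where the Hessian of the quadratic loss is X^T X / n (a stand-in sketch, not the paper's deep-network experiments):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 50  # fewer samples than parameters (over-parametrized)

X = rng.normal(size=(n, p))
# For the quadratic loss L(w) = ||Xw - y||^2 / (2n), the Hessian is
# exactly X^T X / n, independent of w.
H = X.T @ X / n
eigs = np.linalg.eigvalsh(H)

# rank(X^T X) <= n, so at least p - n eigenvalues are exactly zero:
# the spectrum splits into a large degenerate bulk at zero plus a
# smaller set of positive eigenvalues, mirroring the bulk-plus-outliers
# picture measured for deep networks.
num_zero = int(np.sum(np.abs(eigs) < 1e-10))
```

Here the degeneracy is exact; in deep networks the paper finds a bulk concentrated near zero rather than identically zero.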

Perspective: Energy Landscapes for Machine Learning

no code implementations · 23 Mar 2017 · Andrew J. Ballard, Ritankar Das, Stefano Martiniani, Dhagash Mehta, Levent Sagun, Jacob D. Stevenson, David J. Wales

Machine learning techniques are being increasingly used as flexible non-linear fitting and prediction tools in the physical sciences.


Empirical Analysis of the Hessian of Over-Parametrized Neural Networks

no code implementations · ICLR 2018 · Levent Sagun, Utku Evci, V. Ugur Guney, Yann Dauphin, Léon Bottou

In particular, we present a case that links the two observations: small- and large-batch gradient descent appear to converge to different basins of attraction, but we show that they are in fact connected through a flat region and so belong to the same basin.
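An over-parametrized linear least-squares problem gives an exactly solvable caricature of this picture: any two zero-loss solutions are joined by a straight path of zero loss (an illustrative sketch, not the paper's experiment):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 50
X, y = rng.normal(size=(n, p)), rng.normal(size=n)

def loss(w):
    return np.mean((X @ w - y) ** 2)

# Two different zero-training-loss solutions: the min-norm solution, and
# that solution shifted along a null-space direction of X.
w1 = np.linalg.pinv(X) @ y
e0 = np.eye(p)[:, 0]
null_dir = e0 - np.linalg.pinv(X) @ (X @ e0)  # projection onto null(X)
w2 = w1 + 5.0 * null_dir

# Loss along the straight path between the two distant solutions stays
# flat at zero: they live in one connected flat basin.
path_losses = [loss((1 - a) * w1 + a * w2) for a in np.linspace(0, 1, 11)]
```

For deep networks the flat connecting region is an empirical finding rather than an exact identity, which is what the paper investigates.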

Comparing Dynamics: Deep Neural Networks versus Glassy Systems

no code implementations · ICML 2018 · Marco Baity-Jesi, Levent Sagun, Mario Geiger, Stefano Spigler, Gérard Ben Arous, Chiara Cammarota, Yann LeCun, Matthieu Wyart, Giulio Biroli

We analyze numerically the training dynamics of deep neural networks (DNN) by using methods developed in statistical physics of glassy systems.

A jamming transition from under- to over-parametrization affects loss landscape and generalization

no code implementations · 22 Oct 2018 · Stefano Spigler, Mario Geiger, Stéphane d'Ascoli, Levent Sagun, Giulio Biroli, Matthieu Wyart

We argue that in fully-connected networks a phase transition delimits the over- and under-parametrized regimes where fitting can or cannot be achieved.
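The linear-model analogue of this transition is easy to check numerically: with n random training points, a model with p random features can interpolate exactly when p >= n (an illustrative sketch, not the paper's neural-network setting):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 40  # number of training points

def best_fit_error(p):
    # Minimal achievable training MSE of a linear model with p random
    # features on n random targets.
    X, y = rng.normal(size=(n, p)), rng.normal(size=n)
    w = np.linalg.lstsq(X, y, rcond=None)[0]
    return float(np.mean((X @ w - y) ** 2))

# Below p = n the data cannot be fit (error > 0); at and above p = n
# the error drops to zero: a sharp under-/over-parametrized transition.
errors = {p: best_fit_error(p) for p in [10, 20, 39, 40, 41, 80]}
```

In fully-connected networks the paper locates an analogous jamming-like transition in the number of parameters needed to reach zero training loss.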

A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks

1 code implementation · 18 Jan 2019 · Umut Şimşekli, Levent Sagun, Mert Gürbüzbalaban

This assumption is often made for mathematical convenience, since it enables SGD to be analyzed as a stochastic differential equation (SDE) driven by a Brownian motion.
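The tail-index estimation at the heart of the paper can be illustrated with a generic Hill estimator (a standard textbook estimator, not necessarily the one the authors use), which separates heavy-tailed from Gaussian samples:

```python
import numpy as np

def hill_estimator(samples, k=200):
    # Hill estimator of the tail index alpha from the k largest |samples|:
    # alpha_hat = 1 / mean(log(x_(i) / x_(k+1))) over the top-k order stats.
    x = np.sort(np.abs(samples))[::-1]
    return 1.0 / np.mean(np.log(x[:k] / x[k]))

rng = np.random.default_rng(0)
n = 100_000
heavy = rng.pareto(1.5, size=n) + 1.0   # Pareto with exact tail index 1.5
light = rng.normal(size=n)              # Gaussian: no power-law tail

alpha_heavy = hill_estimator(heavy)     # should be close to 1.5
alpha_light = hill_estimator(light)     # blows up for light tails
```

A small estimated index for stochastic gradient noise is the kind of evidence the paper uses against the Brownian (Gaussian) SDE picture.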

On the Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks

no code implementations · 29 Nov 2019 · Umut Şimşekli, Mert Gürbüzbalaban, Thanh Huy Nguyen, Gaël Richard, Levent Sagun

This assumption is often made for mathematical convenience, since it enables SGD to be analyzed as a stochastic differential equation (SDE) driven by a Brownian motion.

Triple descent and the two kinds of overfitting: Where & why do they appear?

1 code implementation · NeurIPS 2020 · Stéphane d'Ascoli, Levent Sagun, Giulio Biroli

We show that this peak is implicitly regularized by the nonlinearity, which is why it only becomes salient at high noise and is weakly affected by explicit regularization.

Regression

Post-Workshop Report on Science meets Engineering in Deep Learning, NeurIPS 2019, Vancouver

no code implementations · 25 Jun 2020 · Levent Sagun, Caglar Gulcehre, Adriana Romero, Negar Rostamzadeh, Stefano Sarao Mannelli

Science meets Engineering in Deep Learning took place in Vancouver as part of the Workshop section of NeurIPS 2019.

On the interplay between data structure and loss function in classification problems

1 code implementation · NeurIPS 2021 · Stéphane d'Ascoli, Marylou Gabrié, Levent Sagun, Giulio Biroli

One of the central puzzles in modern machine learning is the ability of heavily overparametrized models to generalize well.


ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases

9 code implementations · 19 Mar 2021 · Stéphane d'Ascoli, Hugo Touvron, Matthew Leavitt, Ari Morcos, Giulio Biroli, Levent Sagun

We initialise the GPSA layers to mimic the locality of convolutional layers, then give each attention head the freedom to escape locality by adjusting a gating parameter regulating the attention paid to position versus content information.
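A single-head NumPy sketch of the gating idea (simplified relative to the paper's GPSA layer; the helper names and the toy positional scores are my own):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def gpsa(X, Wq, Wk, rel_pos_scores, lam):
    # Gated positional self-attention, single head: a sigmoid gate on the
    # learned scalar `lam` blends a content-based attention map with a
    # fixed positional attention map. Initialising `lam` so the gate
    # favours the positional map mimics a convolution; training can then
    # adjust `lam` to "escape locality".
    d = Wq.shape[1]
    content = softmax((X @ Wq) @ (X @ Wk).T / np.sqrt(d))
    positional = softmax(rel_pos_scores)
    gate = 1.0 / (1.0 + np.exp(-lam))  # sigmoid
    A = (1.0 - gate) * content + gate * positional
    return A @ X  # values taken as X itself for simplicity

rng = np.random.default_rng(0)
T, d = 6, 8  # 6 tokens on a line, embedding dim 8
X = rng.normal(size=(T, d))
Wq, Wk = rng.normal(size=(d, d)), rng.normal(size=(d, d))
# Positional scores favouring each token's immediate neighbourhood.
idx = np.arange(T)
rel_pos_scores = -np.abs(idx[:, None] - idx[None, :]).astype(float)

local_out = gpsa(X, Wq, Wk, rel_pos_scores, lam=10.0)    # gate ~ 1: local
content_out = gpsa(X, Wq, Wk, rel_pos_scores, lam=-10.0)  # gate ~ 0: content
```

The single gate parameter per head is what lets each head interpolate between convolution-like and transformer-like behaviour.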

Image Classification, Inductive Bias

Transformed CNNs: recasting pre-trained convolutional layers with self-attention

no code implementations · 10 Jun 2021 · Stéphane d'Ascoli, Levent Sagun, Giulio Biroli, Ari Morcos

Finally, we experiment with initializing the T-CNN from a partially trained CNN, and find that it reaches better performance than the corresponding hybrid model trained from scratch, while reducing training time.

Fairness Indicators for Systematic Assessments of Visual Feature Extractors

1 code implementation · 15 Feb 2022 · Priya Goyal, Adriana Romero Soriano, Caner Hazirbas, Levent Sagun, Nicolas Usunier

Systematic diagnosis of fairness, harms, and biases of computer vision systems is an important step towards building socially responsible systems.

Fairness

Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision

1 code implementation · 16 Feb 2022 · Priya Goyal, Quentin Duval, Isaac Seessel, Mathilde Caron, Ishan Misra, Levent Sagun, Armand Joulin, Piotr Bojanowski

Discriminative self-supervised learning allows training models on any random group of internet images, possibly recovering salient information that helps differentiate between the images.

 Ranked #1 on Copy Detection on Copydays strong subset (using extra training data)

Action Classification, Action Recognition, +12

Understanding out-of-distribution accuracies through quantifying difficulty of test samples

no code implementations · 28 Mar 2022 · Berfin Şimşek, Melissa Hall, Levent Sagun

Existing works show that although modern neural networks achieve remarkable generalization performance on the in-distribution (ID) dataset, the accuracy drops significantly on the out-of-distribution (OOD) datasets (Recht et al., 2018; 2019).

Measuring and signing fairness as performance under multiple stakeholder distributions

no code implementations · 20 Jul 2022 · David Lopez-Paz, Diane Bouchacourt, Levent Sagun, Nicolas Usunier

By highlighting connections to the literature in domain generalization, we propose to measure fairness as the ability of the system to generalize under multiple stress tests -- distributions of examples with social relevance.
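A minimal sketch of this measurement (the function name and the stakeholder weights are hypothetical): score the system under several group reweightings and report the worst case:

```python
import numpy as np

def fairness_as_worst_case(correct, group, stakeholder_weights):
    # `correct`: per-example 0/1 outcomes; `group`: per-example group id.
    # Each stakeholder distribution reweights the groups; the fairness
    # score is the system's worst accuracy across those distributions,
    # i.e. its performance under the hardest stress test.
    groups = np.unique(group)
    group_acc = np.array([correct[group == g].mean() for g in groups])
    scores = [float(np.dot(w, group_acc)) for w in stakeholder_weights]
    return min(scores), scores

correct = np.array([1, 1, 1, 0, 1, 0, 0, 1])
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
# Three hypothetical stakeholder distributions over the two groups.
weights = [np.array([0.5, 0.5]), np.array([0.9, 0.1]), np.array([0.1, 0.9])]
worst, scores = fairness_as_worst_case(correct, group, weights)
```

Here group accuracies are 0.75 and 0.5, so the worst-case score is attained under the stakeholder who weights the weaker group most heavily.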

Domain Generalization Fairness

Simplicity Bias Leads to Amplified Performance Disparities

no code implementations · 13 Dec 2022 · Samuel J. Bell, Levent Sagun

Finally, we present two real-world examples of difficulty amplification in action, resulting in worse-than-expected performance disparities between groups even when using a balanced dataset.

Fairness, Inductive Bias

Weisfeiler and Lehman Go Measurement Modeling: Probing the Validity of the WL Test

1 code implementation · 11 Jul 2023 · Arjun Subramonian, Adina Williams, Maximilian Nickel, Yizhou Sun, Levent Sagun

The expressive power of graph neural networks is usually measured by comparing how many pairs of graphs or nodes an architecture can possibly distinguish as non-isomorphic to those distinguishable by the $k$-dimensional Weisfeiler-Lehman ($k$-WL) test.
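A compact implementation of the 1-WL colour-refinement test the paper probes, together with its classic failure case (a 6-cycle vs. two disjoint triangles):

```python
from collections import Counter

def wl_colors(adj, rounds=3):
    # 1-WL colour refinement: iteratively hash each node's colour together
    # with the sorted multiset of its neighbours' colours; the final
    # colour histogram is the graph's 1-WL signature.
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        colors = {v: hash((colors[v],
                           tuple(sorted(colors[u] for u in adj[v]))))
                  for v in adj}
    return Counter(colors.values())

def wl_distinguishable(adj1, adj2, rounds=3):
    return wl_colors(adj1, rounds) != wl_colors(adj2, rounds)

# Both graphs below are 2-regular on 6 nodes, so 1-WL assigns every node
# the same colour and cannot tell them apart -- an upper bound on what
# GNNs of matching expressive power can distinguish.
c6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
path = {0: [1], 1: [0, 2], 2: [1]}  # distinguishable from c6
```

The paper's measurement-modeling lens asks how faithfully this kind of distinguishability count captures "expressive power" in the first place.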

Networked Inequality: Preferential Attachment Bias in Graph Neural Network Link Prediction

1 code implementation · 29 Sep 2023 · Arjun Subramonian, Levent Sagun, Yizhou Sun

We further bridge GCN's preferential attachment bias with unfairness in link prediction and propose a new within-group fairness metric.

Fairness, Link Prediction
