Search Results for author: Levent Sagun

Found 22 papers, 10 papers with code

Understanding out-of-distribution accuracies through quantifying difficulty of test samples

no code implementations 28 Mar 2022 Berfin Simsek, Melissa Hall, Levent Sagun

Existing works show that although modern neural networks achieve remarkable generalization performance on the in-distribution (ID) dataset, the accuracy drops significantly on the out-of-distribution (OOD) datasets \cite{recht2018cifar, recht2019imagenet}.

Vision Models Are More Robust And Fair When Pretrained On Uncurated Images Without Supervision

1 code implementation 16 Feb 2022 Priya Goyal, Quentin Duval, Isaac Seessel, Mathilde Caron, Ishan Misra, Levent Sagun, Armand Joulin, Piotr Bojanowski

Discriminative self-supervised learning allows training models on any random group of internet images, possibly recovering salient information that helps differentiate between the images.

Ranked #1 on Copy Detection on Copydays strong subset (using extra training data)

Fairness Indicators for Systematic Assessments of Visual Feature Extractors

1 code implementation 15 Feb 2022 Priya Goyal, Adriana Romero Soriano, Caner Hazirbas, Levent Sagun, Nicolas Usunier

Systematic diagnosis of fairness, harms, and biases of computer vision systems is an important step towards building socially responsible systems.

Transformed CNNs: recasting pre-trained convolutional layers with self-attention

no code implementations 10 Jun 2021 Stéphane d'Ascoli, Levent Sagun, Giulio Biroli, Ari Morcos

Finally, we experiment with initializing the T-CNN from a partially trained CNN, and find that it reaches better performance than the corresponding hybrid model trained from scratch, while reducing training time.

ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases

4 code implementations 19 Mar 2021 Stéphane d'Ascoli, Hugo Touvron, Matthew Leavitt, Ari Morcos, Giulio Biroli, Levent Sagun

We initialise the GPSA layers to mimic the locality of convolutional layers, then give each attention head the freedom to escape locality by adjusting a gating parameter regulating the attention paid to position versus content information.
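
The gating idea in this snippet can be illustrated with a minimal NumPy sketch. It assumes a single attention head whose final attention map is a sigmoid-weighted blend of a content-based softmax and a position-based softmax; the function names, argument names, and the scalar `gate` are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row-wise max for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gated_attention(content_scores, position_scores, gate):
    """Blend content and positional attention for one head (illustrative).

    content_scores, position_scores: (num_queries, num_keys) raw scores.
    gate: unbounded scalar; sigmoid(gate) is the weight on position.
    """
    g = 1.0 / (1.0 + np.exp(-gate))
    return (1.0 - g) * softmax(content_scores) + g * softmax(position_scores)

# Toy scores for 2 queries and 3 keys (made-up numbers).
content = np.array([[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]])
position = np.array([[3.0, 0.0, 0.0], [0.0, 5.0, 0.0]])
attention = gated_attention(content, position, gate=0.0)
```

Under these assumptions, a large positive `gate` makes the head attend almost purely by position (convolution-like locality), while a large negative `gate` lets it attend almost purely by content.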

On the interplay between data structure and loss function in classification problems

1 code implementation NeurIPS 2021 Stéphane d'Ascoli, Marylou Gabrié, Levent Sagun, Giulio Biroli

One of the central puzzles in modern machine learning is the ability of heavily overparametrized models to generalize well.

Post-Workshop Report on Science meets Engineering in Deep Learning, NeurIPS 2019, Vancouver

no code implementations 25 Jun 2020 Levent Sagun, Caglar Gulcehre, Adriana Romero, Negar Rostamzadeh, Stefano Sarao Mannelli

Science meets Engineering in Deep Learning took place in Vancouver as part of the Workshop section of NeurIPS 2019.

Triple descent and the two kinds of overfitting: Where & why do they appear?

1 code implementation NeurIPS 2020 Stéphane d'Ascoli, Levent Sagun, Giulio Biroli

We show that this peak is implicitly regularized by the nonlinearity, which is why it only becomes salient at high noise and is weakly affected by explicit regularization.

On the Heavy-Tailed Theory of Stochastic Gradient Descent for Deep Neural Networks

no code implementations 29 Nov 2019 Umut Şimşekli, Mert Gürbüzbalaban, Thanh Huy Nguyen, Gaël Richard, Levent Sagun

This assumption is often made for mathematical convenience, since it enables SGD to be analyzed as a stochastic differential equation (SDE) driven by a Brownian motion.

A Tail-Index Analysis of Stochastic Gradient Noise in Deep Neural Networks

1 code implementation 18 Jan 2019 Umut Simsekli, Levent Sagun, Mert Gurbuzbalaban

This assumption is often made for mathematical convenience, since it enables SGD to be analyzed as a stochastic differential equation (SDE) driven by a Brownian motion.

A jamming transition from under- to over-parametrization affects loss landscape and generalization

no code implementations 22 Oct 2018 Stefano Spigler, Mario Geiger, Stéphane d'Ascoli, Levent Sagun, Giulio Biroli, Matthieu Wyart

We argue that in fully-connected networks a phase transition delimits the over- and under-parametrized regimes where fitting can or cannot be achieved.

Comparing Dynamics: Deep Neural Networks versus Glassy Systems

no code implementations ICML 2018 Marco Baity-Jesi, Levent Sagun, Mario Geiger, Stefano Spigler, Gerard Ben Arous, Chiara Cammarota, Yann Lecun, Matthieu Wyart, Giulio Biroli

We analyze numerically the training dynamics of deep neural networks (DNN) by using methods developed in statistical physics of glassy systems.

Empirical Analysis of the Hessian of Over-Parametrized Neural Networks

no code implementations ICLR 2018 Levent Sagun, Utku Evci, V. Ugur Guney, Yann Dauphin, Leon Bottou

In particular, we present a case that links the two observations: small and large batch gradient descent appear to converge to different basins of attraction but we show that they are in fact connected through their flat region and so belong to the same basin.

Perspective: Energy Landscapes for Machine Learning

no code implementations 23 Mar 2017 Andrew J. Ballard, Ritankar Das, Stefano Martiniani, Dhagash Mehta, Levent Sagun, Jacob D. Stevenson, David J. Wales

Machine learning techniques are being increasingly used as flexible non-linear fitting and prediction tools in the physical sciences.

Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond

no code implementations 22 Nov 2016 Levent Sagun, Leon Bottou, Yann Lecun

We look at the eigenvalues of the Hessian of a loss function before and after training.
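
As a toy illustration of what inspecting Hessian eigenvalues can look like, the sketch below uses an over-parametrized linear least-squares loss, L(w) = ||Xw - y||^2 / (2n), whose Hessian is X^T X / n and does not depend on w. This setup, the helper name, and the tolerance are assumptions chosen for simplicity; it is not the paper's experimental procedure, which concerns neural networks.

```python
import numpy as np

def least_squares_hessian(X):
    # Hessian of L(w) = ||X w - y||^2 / (2 n) with respect to w.
    n = X.shape[0]
    return X.T @ X / n

rng = np.random.default_rng(0)
num_samples, num_params = 5, 20          # more parameters than samples
X = rng.normal(size=(num_samples, num_params))

# eigvalsh is appropriate because the Hessian is symmetric.
eigenvalues = np.linalg.eigvalsh(least_squares_hessian(X))

# Count eigenvalues that are zero up to an (assumed) numerical tolerance.
num_zero = int(np.sum(np.abs(eigenvalues) < 1e-10))
```

In this toy case the Hessian has rank at most `num_samples`, so at least `num_params - num_samples` eigenvalues are zero: the loss is flat along those directions.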

Entropy-SGD: Biasing Gradient Descent Into Wide Valleys

2 code implementations 6 Nov 2016 Pratik Chaudhari, Anna Choromanska, Stefano Soatto, Yann Lecun, Carlo Baldassi, Christian Borgs, Jennifer Chayes, Levent Sagun, Riccardo Zecchina

This paper proposes a new optimization algorithm called Entropy-SGD for training deep neural networks that is motivated by the local geometry of the energy landscape.

Universal halting times in optimization and machine learning

no code implementations 19 Nov 2015 Levent Sagun, Thomas Trogdon, Yann Lecun

Given an algorithm, which we take to be both the optimization routine and the form of the random landscape, the fluctuations of the halting time follow a distribution that, after centering and scaling, remains unchanged even when the distribution on the landscape is changed.

Explorations on high dimensional landscapes

no code implementations 20 Dec 2014 Levent Sagun, V. Ugur Guney, Gerard Ben Arous, Yann Lecun

Finding minima of a real valued non-convex function over a high dimensional space is a major challenge in science.