Search Results for author: Yann Lecun

Found 87 papers, 49 papers with code

The Effects of Regularization and Data Augmentation are Class Dependent

no code implementations 7 Apr 2022 Randall Balestriero, Leon Bottou, Yann Lecun

The optimal amount of DA or weight decay found from cross-validation leads to disastrous model performance on some classes, e.g., on ImageNet with a ResNet-50, the "barn spider" classification test accuracy falls from $68\%$ to $46\%$ only by introducing random crop DA during training.

Data Augmentation

projUNN: efficient method for training deep networks with unitary matrices

no code implementations 10 Mar 2022 Bobak Kiani, Randall Balestriero, Yann Lecun, Seth Lloyd

In learning with recurrent or very deep feed-forward networks, employing unitary matrices in each layer can be very effective at maintaining long-range stability.

A Data-Augmentation Is Worth A Thousand Samples: Exact Quantification From Analytical Augmented Sample Moments

no code implementations 16 Feb 2022 Randall Balestriero, Ishan Misra, Yann Lecun

We show that for a training loss to be stable under DA sampling, the model's saliency map (gradient of the loss with respect to the model's input) must align with the smallest eigenvector of the sample variance under the considered DA, hinting at a possible explanation of why models tend to shift their focus from edges to textures.

Data Augmentation

Neural Manifold Clustering and Embedding

1 code implementation 24 Jan 2022 Zengyi Li, Yubei Chen, Yann Lecun, Friedrich T. Sommer

We argue that achieving manifold clustering with neural networks requires two essential ingredients: a domain-specific constraint that ensures the identification of the manifolds, and a learning algorithm for embedding each manifold to a linear subspace in the feature space.

Data Augmentation Representation Learning +1

Understanding Dimensional Collapse in Contrastive Self-supervised Learning

1 code implementation ICLR 2022 Li Jing, Pascal Vincent, Yann Lecun, Yuandong Tian

It has been shown that non-contrastive methods suffer from a lesser collapse problem of a different nature: dimensional collapse, whereby the embedding vectors end up spanning a lower-dimensional subspace instead of the entire available embedding space.

Contrastive Learning Learning Theory +2

Learning in High Dimension Always Amounts to Extrapolation

no code implementations 18 Oct 2021 Randall Balestriero, Jerome Pesenti, Yann Lecun

The notion of interpolation and extrapolation is fundamental in various fields from deep learning to function approximation.

Decoupled Contrastive Learning

2 code implementations 13 Oct 2021 Chun-Hsiao Yeh, Cheng-Yao Hong, Yen-Chi Hsu, Tyng-Luh Liu, Yubei Chen, Yann Lecun

By properly addressing the NPC effect, we reach a decoupled contrastive learning (DCL) objective function, significantly improving SSL efficiency.

Contrastive Learning Self-Supervised Learning

Recurrent Parameter Generators

no code implementations 15 Jul 2021 Jiayun Wang, Yubei Chen, Stella X. Yu, Brian Cheung, Yann Lecun

Specifically, for a network, we create a recurrent parameter generator (RPG), from which the parameters of each convolution layer are generated.

Model Compression

Barlow Twins: Self-Supervised Learning via Redundancy Reduction

15 code implementations 4 Mar 2021 Jure Zbontar, Li Jing, Ishan Misra, Yann Lecun, Stéphane Deny

This causes the embedding vectors of distorted versions of a sample to be similar, while minimizing the redundancy between the components of these vectors.

General Classification Object Detection +3

Neural Potts Model

no code implementations 1 Jan 2021 Tom Sercu, Robert Verkuil, Joshua Meier, Brandon Amos, Zeming Lin, Caroline Chen, Jason Liu, Yann Lecun, Alexander Rives

We propose the Neural Potts Model objective as an amortized optimization problem.

Implicit Rank-Minimizing Autoencoder

3 code implementations NeurIPS 2020 Li Jing, Jure Zbontar, Yann Lecun

An important component of autoencoders is the method by which the information capacity of the latent representation is minimized or limited.

Image Generation Representation Learning +1

Inspirational Adversarial Image Generation

1 code implementation 17 Jun 2019 Baptiste Rozière, Morgane Riviere, Olivier Teytaud, Jérémy Rapin, Yann Lecun, Camille Couprie

We design a simple optimization method to find the optimal latent parameters corresponding to the closest generation to any input inspirational image.

Image Generation

The role of over-parametrization in generalization of neural networks

1 code implementation ICLR 2019 Behnam Neyshabur, Zhiyuan Li, Srinadh Bhojanapalli, Yann Lecun, Nathan Srebro

Despite existing work on ensuring generalization of neural networks in terms of scale sensitive complexity measures, such as norms, margin and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization.

Learning about an exponential amount of conditional distributions

1 code implementation NeurIPS 2019 Mohamed Ishmael Belghazi, Maxime Oquab, Yann Lecun, David Lopez-Paz

We introduce the Neural Conditioner (NC), a self-supervised machine able to learn about all the conditional distributions of a random vector $X$.

General Classification

Model-Predictive Policy Learning with Uncertainty Regularization for Driving in Dense Traffic

1 code implementation ICLR 2019 Mikael Henaff, Alfredo Canziani, Yann Lecun

Learning a policy using only observational data is challenging because the distribution of states it induces at execution time may differ from the distribution observed during training.

A Spectral Regularizer for Unsupervised Disentanglement

no code implementations 4 Dec 2018 Aditya Ramesh, Youngduck Choi, Yann Lecun

A generative model with a disentangled representation allows for independent control over different aspects of the output.


GLoMo: Unsupervised Learning of Transferable Relational Graphs

no code implementations NeurIPS 2018 Zhilin Yang, Jake Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan R. Salakhutdinov, Yann Lecun

We also show that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained (including GloVe embeddings, ELMo embeddings, and task-specific RNN hidden units), or embedding-free units such as image pixels.

Image Classification Natural Language Inference +4

Adversarially-Trained Normalized Noisy-Feature Auto-Encoder for Text Generation

no code implementations 10 Nov 2018 Xiang Zhang, Yann Lecun

An ATNNFAE consists of an auto-encoder where the internal code is normalized on the unit sphere and corrupted by additive noise.

Text Generation
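The corruption step described in the ATNNFAE abstract above (normalize the internal code onto the unit sphere, then add noise) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the noise distribution, so Gaussian noise is assumed here, and the names `normalize_to_unit_sphere`, `corrupt_code`, and `noise_std` are hypothetical, not taken from the paper.

```python
import math
import random


def normalize_to_unit_sphere(code):
    """Rescale a code vector so that its Euclidean norm is 1."""
    norm = math.sqrt(sum(c * c for c in code))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [c / norm for c in code]


def corrupt_code(code, noise_std, rng):
    """Normalize the code, then add independent noise to each component.

    Assumption: zero-mean Gaussian noise with standard deviation noise_std.
    The source only says "additive noise".
    """
    unit = normalize_to_unit_sphere(code)
    return [u + rng.gauss(0.0, noise_std) for u in unit]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same noise
    print(corrupt_code([3.0, 4.0], noise_std=0.1, rng=rng))
```

Passing the random generator explicitly keeps the sketch reproducible; with `noise_std=0.0` the function reduces to plain normalization.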

Learning with Reflective Likelihoods

no code implementations 27 Sep 2018 Adji B. Dieng, Kyunghyun Cho, David M. Blei, Yann Lecun

Furthermore, the reflective likelihood objective prevents posterior collapse when used to train stochastic auto-encoders with amortized inference.

Comparing Dynamics: Deep Neural Networks versus Glassy Systems

no code implementations ICML 2018 Marco Baity-Jesi, Levent Sagun, Mario Geiger, Stefano Spigler, Gerard Ben Arous, Chiara Cammarota, Yann Lecun, Matthieu Wyart, Giulio Biroli

We analyze numerically the training dynamics of deep neural networks (DNN) by using methods developed in statistical physics of glassy systems.

GLoMo: Unsupervisedly Learned Relational Graphs as Transferable Representations

1 code implementation 14 Jun 2018 Zhilin Yang, Jake Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan Salakhutdinov, Yann Lecun

We also show that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained (including GloVe embeddings, ELMo embeddings, and task-specific RNN hidden units), or embedding-free units such as image pixels.

Image Classification Natural Language Inference +4

Backpropagation for Implicit Spectral Densities

1 code implementation 1 Jun 2018 Aditya Ramesh, Yann Lecun

We introduce a tool that allows us to do this even when the likelihood is not explicitly set, by instead using the implicit likelihood of the model.

Towards Understanding the Role of Over-Parametrization in Generalization of Neural Networks

2 code implementations 30 May 2018 Behnam Neyshabur, Zhiyuan Li, Srinadh Bhojanapalli, Yann Lecun, Nathan Srebro

Despite existing work on ensuring generalization of neural networks in terms of scale sensitive complexity measures, such as norms, margin and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization.

DeSIGN: Design Inspiration from Generative Networks

1 code implementation 3 Apr 2018 Othman Sbai, Mohamed Elhoseiny, Antoine Bordes, Yann Lecun, Camille Couprie

Can an algorithm create original and compelling fashion designs to serve as an inspirational assistant?

Image Generation

Byte-Level Recursive Convolutional Auto-Encoder for Text

1 code implementation ICLR 2018 Xiang Zhang, Yann Lecun

The proposed model is a multi-stage deep convolutional encoder-decoder framework using residual connections, containing up to 160 parameterized layers.

Text Generation

Prediction Under Uncertainty with Error Encoding Networks

no code implementations ICLR 2018 Mikael Henaff, Junbo Zhao, Yann Lecun

In this work we introduce a new framework for performing temporal predictions in the presence of uncertainty.

Video Prediction

Prediction Under Uncertainty with Error-Encoding Networks

4 code implementations 14 Nov 2017 Mikael Henaff, Junbo Zhao, Yann Lecun

In this work we introduce a new framework for performing temporal predictions in the presence of uncertainty.

Video Prediction

A hierarchical loss and its problems when classifying non-hierarchically

no code implementations 1 Sep 2017 Cinna Wu, Mark Tygert, Yann Lecun

We define a metric that, inter alia, can penalize failure to distinguish between a sheepdog and a skyscraper more than failure to distinguish between a sheepdog and a poodle.

General Classification

Which Encoding is the Best for Text Classification in Chinese, English, Japanese and Korean?

3 code implementations 8 Aug 2017 Xiang Zhang, Yann Lecun

This article offers an empirical study on the different ways of encoding Chinese, Japanese, Korean (CJK) and English languages for text classification.

General Classification Text Classification

Adversarially Regularized Autoencoders

6 code implementations 13 Jun 2017 Jake Zhao, Yoon Kim, Kelly Zhang, Alexander M. Rush, Yann Lecun

This adversarially regularized autoencoder (ARAE) allows us to generate natural textual outputs as well as perform manipulations in the latent space to induce change in the output space.

Representation Learning Style Transfer

Model-Based Planning with Discrete and Continuous Actions

1 code implementation 19 May 2017 Mikael Henaff, William F. Whitney, Yann Lecun

Action planning using learned and differentiable forward models of the world is a general approach which has a number of desirable properties, including improved sample complexity over model-free RL methods, reuse of learned models across different tasks, and the ability to perform efficient gradient-based optimization in continuous action spaces.

Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs

6 code implementations ICML 2017 Li Jing, Yichen Shen, Tena Dubček, John Peurifoy, Scott Skirlo, Yann Lecun, Max Tegmark, Marin Soljačić

Using unitary (instead of general) matrices in artificial neural networks (ANNs) is a promising way to solve the gradient explosion/vanishing problem, as well as to enable ANNs to learn long-term correlations in the data.


Tracking the World State with Recurrent Entity Networks

4 code implementations 12 Dec 2016 Mikael Henaff, Jason Weston, Arthur Szlam, Antoine Bordes, Yann Lecun

The EntNet sets a new state-of-the-art on the bAbI tasks, and is the first method to solve all the tasks in the 10k training examples setting.

Procedural Text Understanding Question Answering

Disentangling factors of variation in deep representation using adversarial training

no code implementations NeurIPS 2016 Michael F. Mathieu, Junbo Jake Zhao, Junbo Zhao, Aditya Ramesh, Pablo Sprechmann, Yann Lecun

The only available source of supervision during the training process comes from our ability to distinguish among different observations belonging to the same category.

Geometric deep learning: going beyond Euclidean data

no code implementations 24 Nov 2016 Michael M. Bronstein, Joan Bruna, Yann Lecun, Arthur Szlam, Pierre Vandergheynst

In many applications, such geometric data are large and complex (in the case of social networks, on the scale of billions), and are natural targets for machine learning techniques.

Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond

no code implementations 22 Nov 2016 Levent Sagun, Leon Bottou, Yann Lecun

We look at the eigenvalues of the Hessian of a loss function before and after training.

Disentangling factors of variation in deep representations using adversarial training

3 code implementations 10 Nov 2016 Michael Mathieu, Junbo Zhao, Pablo Sprechmann, Aditya Ramesh, Yann Lecun

During training, the only available source of supervision comes from our ability to distinguish among different observations belonging to the same class.


Entropy-SGD: Biasing Gradient Descent Into Wide Valleys

2 code implementations 6 Nov 2016 Pratik Chaudhari, Anna Choromanska, Stefano Soatto, Yann Lecun, Carlo Baldassi, Christian Borgs, Jennifer Chayes, Levent Sagun, Riccardo Zecchina

This paper proposes a new optimization algorithm called Entropy-SGD for training deep neural networks that is motivated by the local geometry of the energy landscape.

Energy-based Generative Adversarial Network

3 code implementations 11 Sep 2016 Junbo Zhao, Michael Mathieu, Yann Lecun

We introduce the "Energy-based Generative Adversarial Network" model (EBGAN) which views the discriminator as an energy function that attributes low energies to the regions near the data manifold and higher energies to other regions.

What is the Best Feature Learning Procedure in Hierarchical Recognition Architectures?

no code implementations 5 Jun 2016 Kevin Jarrett, Koray Kavukcuoglu, Karol Gregor, Yann Lecun

We also introduce a new single phase supervised learning procedure that places an L1 penalty on the output state of each layer of the network.

Object Recognition Unsupervised Pre-training

Recurrent Orthogonal Networks and Long-Memory Tasks

1 code implementation 22 Feb 2016 Mikael Henaff, Arthur Szlam, Yann Lecun

Although RNNs have been shown to be powerful tools for processing sequential data, finding architectures or optimization strategies that allow them to model very long term dependencies is still an active area of research.

Universal halting times in optimization and machine learning

no code implementations 19 Nov 2015 Levent Sagun, Thomas Trogdon, Yann Lecun

Given an algorithm, which we take to be both the optimization routine and the form of the random landscape, the fluctuations of the halting time follow a distribution that, after centering and scaling, remains unchanged even when the distribution on the landscape is changed.

Super-Resolution with Deep Convolutional Sufficient Statistics

1 code implementation 18 Nov 2015 Joan Bruna, Pablo Sprechmann, Yann Lecun

Inverse problems in image and audio, and super-resolution in particular, can be seen as high-dimensional structured prediction problems, where the goal is to characterize the conditional distribution of a high-resolution output given its low-resolution corrupted observation.

Bandwidth Extension Image Super-Resolution +1

Deep multi-scale video prediction beyond mean square error

5 code implementations 17 Nov 2015 Michael Mathieu, Camille Couprie, Yann Lecun

Learning to predict future images from a video sequence involves the construction of an internal representation that models the image evolution accurately, and therefore, to some degree, its content and dynamics.

Frame Optical Flow Estimation +1

Binary embeddings with structured hashed projections

no code implementations 16 Nov 2015 Anna Choromanska, Krzysztof Choromanski, Mariusz Bojarski, Tony Jebara, Sanjiv Kumar, Yann Lecun

We prove several theoretical results showing that projections via various structured matrices followed by nonlinear mappings accurately preserve the angular distance between input high-dimensional vectors.

Universum Prescription: Regularization using Unlabeled Data

no code implementations 11 Nov 2015 Xiang Zhang, Yann Lecun

This paper shows that simply prescribing "none of the above" labels to unlabeled data has a beneficial regularization effect to supervised learning.

Image Classification

Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches

1 code implementation 20 Oct 2015 Jure Žbontar, Yann Lecun

We approach the problem by learning a similarity measure on small image patches using a convolutional neural network.

Stereo Matching Stereo Matching Hand

Very Deep Multilingual Convolutional Neural Networks for LVCSR

no code implementations 29 Sep 2015 Tom Sercu, Christian Puhrsch, Brian Kingsbury, Yann Lecun

However, CNNs in LVCSR have not kept pace with recent advances in other domains where deeper neural networks provide superior performance.

Speech Recognition

Deep Convolutional Networks on Graph-Structured Data

3 code implementations 16 Jun 2015 Mikael Henaff, Joan Bruna, Yann Lecun

Deep Learning's recent successes have mostly relied on Convolutional Networks, which exploit fundamental statistical properties of images, sounds and video data: the local stationarity and multi-scale compositional structure, that allows expressing long range interactions in terms of shorter, localized interactions.

General Classification

Learning to Linearize Under Uncertainty

no code implementations NeurIPS 2015 Ross Goroshin, Michael Mathieu, Yann Lecun

Training deep feature hierarchies to solve supervised learning tasks has achieved state of the art performance on many problems in computer vision.

Stacked What-Where Auto-encoders

2 code implementations 8 Jun 2015 Junbo Zhao, Michael Mathieu, Ross Goroshin, Yann Lecun

The objective function includes reconstruction terms that induce the hidden states in the Deconvnet to be similar to those of the Convnet.

Semi-Supervised Image Classification

A mathematical motivation for complex-valued convolutional networks

no code implementations 11 Mar 2015 Joan Bruna, Soumith Chintala, Yann Lecun, Serkan Piantino, Arthur Szlam, Mark Tygert

Courtesy of the exact correspondence, the remarkably rich and rigorous body of mathematical analysis for wavelets applies directly to (complex-valued) convnets.

Text Understanding from Scratch

3 code implementations 5 Feb 2015 Xiang Zhang, Yann Lecun

This article demonstrates that we can apply deep learning to text understanding from character-level inputs all the way up to abstract text concepts, using temporal convolutional networks (ConvNets).

General Classification Sentiment Analysis

Fast Convolutional Nets With fbfft: A GPU Performance Evaluation

2 code implementations 24 Dec 2014 Nicolas Vasilache, Jeff Johnson, Michael Mathieu, Soumith Chintala, Serkan Piantino, Yann Lecun

We examine the performance profile of Convolutional Neural Network training on the current generation of NVIDIA Graphics Processing Units.

Audio Source Separation with Discriminative Scattering Networks

no code implementations 22 Dec 2014 Pablo Sprechmann, Joan Bruna, Yann Lecun

In this report we describe an ongoing line of research for solving single-channel source separation problems.

Audio Source Separation Frame

Deep learning with Elastic Averaging SGD

9 code implementations NeurIPS 2015 Sixin Zhang, Anna Choromanska, Yann Lecun

We empirically demonstrate that in the deep learning setting, due to the existence of many local optima, allowing more exploration can lead to improved performance.

Image Classification Stochastic Optimization

Explorations on high dimensional landscapes

no code implementations 20 Dec 2014 Levent Sagun, V. Ugur Guney, Gerard Ben Arous, Yann Lecun

Finding minima of a real valued non-convex function over a high dimensional space is a major challenge in science.

The Loss Surfaces of Multilayer Networks

1 code implementation 30 Nov 2014 Anna Choromanska, Mikael Henaff, Michael Mathieu, Gérard Ben Arous, Yann Lecun

We show that for large-size decoupled networks the lowest critical values of the random loss function form a layered structure and they are located in a well-defined band lower-bounded by the global minimum.

Differentially- and non-differentially-private random decision trees

no code implementations 26 Oct 2014 Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Yann Lecun

We consider supervised learning with random decision trees, where the tree construction is completely random.

MoDeep: A Deep Learning Framework Using Motion Features for Human Pose Estimation

no code implementations 28 Sep 2014 Arjun Jain, Jonathan Tompson, Yann Lecun, Christoph Bregler

In this work, we propose a novel and efficient method for articulated human pose estimation in videos using a convolutional network architecture, which incorporates both color and motion features.

Pose Estimation

Fast Approximation of Rotations and Hessians matrices

no code implementations 29 Apr 2014 Michael Mathieu, Yann Lecun

A new method to represent and approximate rotation matrices is introduced.

Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation

no code implementations NeurIPS 2014 Emily Denton, Wojciech Zaremba, Joan Bruna, Yann Lecun, Rob Fergus

We present techniques for speeding up the test-time evaluation of large convolutional networks, designed for object recognition tasks.

Object Recognition

Spectral Networks and Locally Connected Networks on Graphs

4 code implementations 21 Dec 2013 Joan Bruna, Wojciech Zaremba, Arthur Szlam, Yann Lecun

Convolutional Neural Networks are extremely efficient architectures in image and audio recognition tasks, thanks to their ability to exploit the local translational invariance of signal classes over their domain.


OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

4 code implementations 21 Dec 2013 Pierre Sermanet, David Eigen, Xiang Zhang, Michael Mathieu, Rob Fergus, Yann Lecun

This integrated framework is the winner of the localization task of the ImageNet Large Scale Visual Recognition Challenge 2013 (ILSVRC2013) and obtained very competitive results for the detection and classification tasks.

General Classification Image Classification +2

Fast Training of Convolutional Networks through FFTs

no code implementations 20 Dec 2013 Michael Mathieu, Mikael Henaff, Yann Lecun

Convolutional networks are one of the most widely employed architectures in computer vision and machine learning.

Understanding Deep Architectures using a Recursive Convolutional Network

no code implementations 6 Dec 2013 David Eigen, Jason Rolfe, Rob Fergus, Yann Lecun

A key challenge in designing convolutional network models is sizing them appropriately.

Signal Recovery from Pooling Representations

no code implementations 16 Nov 2013 Joan Bruna, Arthur Szlam, Yann Lecun

In this work we compute lower Lipschitz bounds of $\ell_p$ pooling operators for $p=1, 2, \infty$ as well as $\ell_p$ pooling operators preceded by half-rectification layers.
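For readers unfamiliar with the operators named in this abstract, a minimal sketch of $\ell_p$ pooling for $p=1, 2, \infty$, optionally preceded by half-rectification, might look like the following. It assumes non-overlapping pooling windows over a 1-D signal; the function names (`half_rectify`, `lp_pool`, `pool_signal`) are illustrative, and the sketch does not compute the Lipschitz bounds the paper studies.

```python
import math


def half_rectify(values):
    """Half-rectification: keep the positive part of each entry."""
    return [max(v, 0.0) for v in values]


def lp_pool(values, p):
    """Return the l_p norm of one pooling region (p = 1, 2, or math.inf)."""
    magnitudes = [abs(v) for v in values]
    if p == math.inf:
        return max(magnitudes)
    return sum(m ** p for m in magnitudes) ** (1.0 / p)


def pool_signal(signal, width, p, rectify=False):
    """Apply l_p pooling over non-overlapping windows of a 1-D signal.

    Assumption: windows do not overlap and a trailing partial window is dropped.
    """
    if rectify:
        signal = half_rectify(signal)
    return [lp_pool(signal[i:i + width], p)
            for i in range(0, len(signal) - width + 1, width)]


if __name__ == "__main__":
    x = [3.0, -4.0, 1.0, -1.0]
    for p in (1, 2, math.inf):
        print(p, pool_signal(x, width=2, p=p))
```

The $p=\infty$ case is ordinary max pooling on magnitudes, and $p=1$ is a sum of absolute values (an unnormalized average pooling), which is why these three values of $p$ cover the common pooling choices.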

Adaptive learning rates and parallelization for stochastic, sparse, non-smooth gradients

no code implementations 16 Jan 2013 Tom Schaul, Yann Lecun

Recent work has established an empirically successful framework for adapting learning rates for stochastic gradient descent (SGD).

No More Pesky Learning Rates

no code implementations 6 Jun 2012 Tom Schaul, Sixin Zhang, Yann Lecun

The performance of stochastic gradient descent (SGD) depends critically on how learning rates are tuned and decreased over time.
