no code implementations • 7 Apr 2022 • Randall Balestriero, Leon Bottou, Yann Lecun
The optimal amount of DA or weight decay found from cross-validation leads to disastrous model performance on some classes, e.g., on ImageNet with a ResNet-50, the "barn spider" classification test accuracy falls from $68\%$ to $46\%$ merely by introducing random crop DA during training.
no code implementations • 10 Mar 2022 • Bobak Kiani, Randall Balestriero, Yann Lecun, Seth Lloyd
In learning with recurrent or very deep feed-forward networks, employing unitary matrices in each layer can be very effective at maintaining long-range stability.
no code implementations • 16 Feb 2022 • Randall Balestriero, Ishan Misra, Yann Lecun
We show that for a training loss to be stable under DA sampling, the model's saliency map (the gradient of the loss with respect to the model's input) must align with the smallest eigenvector of the sample variance under the considered DA, hinting at a possible explanation of why models tend to shift their focus from edges to textures.
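A minimal numpy sketch of that alignment test, assuming a generic `augment` routine and a precomputed `saliency` map (both are illustrative names, not part of the paper's code), and suitable only for small inputs since the covariance is D x D:

```python
import numpy as np

def alignment_score(x, saliency, augment, n_aug=256):
    """Cosine alignment between a saliency map and the smallest
    eigenvector of the sample's variance under data augmentation."""
    views = np.stack([augment(x).ravel() for _ in range(n_aug)])
    cov = np.cov(views, rowvar=False)            # (D, D) sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)       # ascending eigenvalues
    v_min = eigvecs[:, 0]                        # least-varied direction under DA
    s = saliency.ravel()
    return abs(s @ v_min) / (np.linalg.norm(s) * np.linalg.norm(v_min) + 1e-12)
```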
1 code implementation • 24 Jan 2022 • Zengyi Li, Yubei Chen, Yann Lecun, Friedrich T. Sommer
We argue that achieving manifold clustering with neural networks requires two essential ingredients: a domain-specific constraint that ensures the identification of the manifolds, and a learning algorithm for embedding each manifold to a linear subspace in the feature space.
1 code implementation • ICLR 2022 • Li Jing, Pascal Vincent, Yann Lecun, Yuandong Tian
It has been shown that non-contrastive methods suffer from a lesser collapse problem of a different nature: dimensional collapse, whereby the embedding vectors end up spanning a lower-dimensional subspace instead of the entire available embedding space.
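One common way to see dimensional collapse in practice, sketched here as a simple inspection of the singular-value spectrum of a batch of embeddings (an illustration, not necessarily the paper's exact protocol):

```python
import numpy as np

def embedding_spectrum(z):
    """Normalized singular-value spectrum of a batch of embeddings z (N x D).
    A spectrum that drops to ~0 signals dimensional collapse: the vectors
    span only a lower-dimensional subspace of the embedding space."""
    z = z - z.mean(axis=0)                         # center the batch
    s = np.linalg.svd(z, compute_uv=False)         # descending singular values
    return s / s[0]

# Toy check: embeddings confined to a 16-dim subspace of a 128-dim space.
rng = np.random.default_rng(0)
z = rng.normal(size=(1024, 16)) @ rng.normal(size=(16, 128))
print((embedding_spectrum(z) > 1e-6).sum())        # ~16 non-negligible values
```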
no code implementations • 18 Oct 2021 • Randall Balestriero, Jerome Pesenti, Yann Lecun
The notions of interpolation and extrapolation are fundamental in various fields, from deep learning to function approximation.
2 code implementations • 13 Oct 2021 • Chun-Hsiao Yeh, Cheng-Yao Hong, Yen-Chi Hsu, Tyng-Luh Liu, Yubei Chen, Yann Lecun
By properly addressing the NPC effect, we reach a decoupled contrastive learning (DCL) objective function, significantly improving SSL efficiency.
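A PyTorch sketch of a decoupled InfoNCE-style loss in the spirit of DCL, where the positive pair is removed from the denominator; this simplified version uses only cross-view negatives, which is an assumption rather than the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def decoupled_contrastive_loss(z1, z2, temperature=0.1):
    """InfoNCE with the positive pair removed from the denominator (the
    decoupling step). z1, z2: (N, D) embeddings of two views of N images."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    sim = z1 @ z2.t() / temperature                # (N, N) similarities
    pos = sim.diagonal()                           # positive-pair terms
    eye = torch.eye(len(z1), dtype=torch.bool, device=z1.device)
    neg = torch.logsumexp(sim.masked_fill(eye, float('-inf')), dim=1)
    return (neg - pos).mean()                      # positives decoupled from denominator
```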
no code implementations • 15 Jul 2021 • Jiayun Wang, Yubei Chen, Stella X. Yu, Brian Cheung, Yann Lecun
Specifically, for a network, we create a recurrent parameter generator (RPG), from which the parameters of each convolution layer are generated.
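A hedged PyTorch sketch of the idea: every layer draws its weights from one shared trainable bank through fixed per-layer random index and sign patterns, so the total parameter count is decoupled from depth. The specific access scheme below is an assumption, not the paper's exact construction:

```python
import torch
import torch.nn as nn

class RecurrentParameterGenerator(nn.Module):
    """Shared parameter bank from which each layer's weights are generated."""
    def __init__(self, bank_size):
        super().__init__()
        self.bank = nn.Parameter(torch.randn(bank_size) * 0.02)

    def make_weight(self, shape, seed):
        g = torch.Generator().manual_seed(seed)        # fixed pattern per layer
        n = int(torch.tensor(shape).prod())
        idx = torch.randint(len(self.bank), (n,), generator=g)
        sign = torch.randint(0, 2, (n,), generator=g) * 2 - 1
        return (self.bank[idx] * sign).view(*shape)

gen = RecurrentParameterGenerator(bank_size=4096)
w1 = gen.make_weight((64, 3, 3, 3), seed=1)    # conv1 weights
w2 = gen.make_weight((128, 64, 3, 3), seed=2)  # conv2 shares the same bank
```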
4 code implementations • NeurIPS 2021 • Adrien Bardes, Jean Ponce, Yann Lecun
Recent self-supervised methods for image representation learning are based on maximizing the agreement between embedding vectors from different views of the same image; a sketch of the variance-invariance-covariance objective appears below.
Representation Learning, Self-Supervised Image Classification, +2
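A minimal PyTorch sketch of a VICReg-style loss; the weights shown are commonly cited defaults and should be treated as illustrative:

```python
import torch
import torch.nn.functional as F

def vicreg_loss(z1, z2, sim_w=25.0, var_w=25.0, cov_w=1.0, eps=1e-4):
    """Variance-invariance-covariance loss over two views' embeddings (N, D)."""
    n, d = z1.shape
    inv = F.mse_loss(z1, z2)                       # invariance: match the views
    var, cov = 0.0, 0.0
    for z in (z1, z2):
        z = z - z.mean(dim=0)
        std = torch.sqrt(z.var(dim=0) + eps)
        var = var + F.relu(1.0 - std).mean()       # variance: keep each dim alive
        c = (z.t() @ z) / (n - 1)                  # (D, D) covariance
        off = c - torch.diag(torch.diag(c))
        cov = cov + off.pow(2).sum() / d           # covariance: decorrelate dims
    return sim_w * inv + var_w * var + cov_w * cov
```

The variance term prevents collapse without any negative pairs, while the covariance term spreads information across embedding dimensions.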
1 code implementation • 26 Apr 2021 • Aishwarya Kamath, Mannat Singh, Yann Lecun, Gabriel Synnaeve, Ishan Misra, Nicolas Carion
We also investigate the utility of our model as an object detector on a given label set when fine-tuned in a few-shot setting.
Ranked #1 on Visual Question Answering on CLEVR-Humans
no code implementations • NAACL (DeeLIO) 2021 • Zeyu Yun, Yubei Chen, Bruno A Olshausen, Yann Lecun
Transformer networks have revolutionized NLP representation learning since they were introduced.
15 code implementations • 4 Mar 2021 • Jure Zbontar, Li Jing, Ishan Misra, Yann Lecun, Stéphane Deny
This causes the embedding vectors of distorted versions of a sample to be similar, while minimizing the redundancy between the components of these vectors; a sketch of the objective appears below.
Ranked #8 on Image Classification on Places205
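A minimal PyTorch sketch of the Barlow Twins objective: drive the cross-correlation matrix of the two views' normalized embeddings toward the identity (the off-diagonal weight is a commonly cited default; normalization details are simplified):

```python
import torch

def barlow_twins_loss(z1, z2, lam=5e-3, eps=1e-12):
    """Diagonal -> invariance; off-diagonal -> redundancy reduction. (N, D)."""
    n, _ = z1.shape
    z1 = (z1 - z1.mean(0)) / (z1.std(0) + eps)     # per-dimension standardization
    z2 = (z2 - z2.mean(0)) / (z2.std(0) + eps)
    c = (z1.t() @ z2) / n                          # (D, D) cross-correlation
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()
    return on_diag + lam * off_diag
```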
1 code implementation • ICCV 2021 • Aishwarya Kamath, Mannat Singh, Yann Lecun, Gabriel Synnaeve, Ishan Misra, Nicolas Carion
We also investigate the utility of our model as an object detector on a given label set when fine-tuned in a few-shot setting.
Ranked #1 on Referring Expression Comprehension on Talk2Car (using extra training data)
no code implementations • 1 Jan 2021 • Tom Sercu, Robert Verkuil, Joshua Meier, Brandon Amos, Zeming Lin, Caroline Chen, Jason Liu, Yann Lecun, Alexander Rives
We propose the Neural Potts Model objective as an amortized optimization problem.
3 code implementations • NeurIPS 2020 • Li Jing, Jure Zbontar, Yann Lecun
An important component of autoencoders is the method by which the information capacity of the latent representation is minimized or limited.
1 code implementation • 17 Jun 2019 • Baptiste Rozière, Morgane Riviere, Olivier Teytaud, Jérémy Rapin, Yann Lecun, Camille Couprie
We design a simple optimization method to find the optimal latent parameters corresponding to the closest generation to any input inspirational image.
1 code implementation • ICLR 2019 • Behnam Neyshabur, Zhiyuan Li, Srinadh Bhojanapalli, Yann Lecun, Nathan Srebro
Despite existing work on ensuring generalization of neural networks in terms of scale sensitive complexity measures, such as norms, margin and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization.
1 code implementation • CVPR 2019 • Huy V. Vo, Francis Bach, Minsu Cho, Kai Han, Yann Lecun, Patrick Perez, Jean Ponce
Learning with complete or partial supervision is powerful but relies on ever-growing human annotation efforts.
Ranked #2 on Single-object colocalization on Object Discovery
1 code implementation • NeurIPS 2019 • Mohamed Ishmael Belghazi, Maxime Oquab, Yann Lecun, David Lopez-Paz
We introduce the Neural Conditioner (NC), a self-supervised machine able to learn about all the conditional distributions of a random vector $X$.
1 code implementation • ICLR 2019 • Mikael Henaff, Alfredo Canziani, Yann Lecun
Learning a policy using only observational data is challenging because the distribution of states it induces at execution time may differ from the distribution observed during training.
no code implementations • 4 Dec 2018 • Aditya Ramesh, Youngduck Choi, Yann Lecun
A generative model with a disentangled representation allows for independent control over different aspects of the output.
no code implementations • NeurIPS 2018 • Zhilin Yang, Jake Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan R. Salakhutdinov, Yann Lecun
We also show that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained (including GloVe embeddings, ELMo embeddings, and task-specific RNN hidden units), or embedding-free units such as image pixels.
no code implementations • 10 Nov 2018 • Xiang Zhang, Yann Lecun
An ATNNFAE consists of an auto-encoder where the internal code is normalized on the unit sphere and corrupted by additive noise.
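A short PyTorch sketch of the bottleneck this sentence describes; the noise scale is an assumption, not a value from the paper:

```python
import torch
import torch.nn.functional as F

def noisy_spherical_code(code, sigma=0.1, training=True):
    """Project the internal code onto the unit sphere, then corrupt it
    with additive Gaussian noise during training."""
    code = F.normalize(code, dim=-1)               # unit-sphere normalization
    if training:
        code = code + sigma * torch.randn_like(code)
    return code
```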
no code implementations • 27 Sep 2018 • Adji B. Dieng, Kyunghyun Cho, David M. Blei, Yann Lecun
Furthermore, the reflective likelihood objective prevents posterior collapse when used to train stochastic auto-encoders with amortized inference.
no code implementations • ICML 2018 • Marco Baity-Jesi, Levent Sagun, Mario Geiger, Stefano Spigler, Gerard Ben Arous, Chiara Cammarota, Yann Lecun, Matthieu Wyart, Giulio Biroli
We analyze numerically the training dynamics of deep neural networks (DNN) by using methods developed in statistical physics of glassy systems.
1 code implementation • 14 Jun 2018 • Zhilin Yang, Jake Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan Salakhutdinov, Yann Lecun
We also show that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained (including GloVe embeddings, ELMo embeddings, and task-specific RNN hidden units), or embedding-free units such as image pixels.
1 code implementation • 1 Jun 2018 • Aditya Ramesh, Yann Lecun
We introduce a tool that allows us to do this even when the likelihood is not explicitly set, by instead using the implicit likelihood of the model.
2 code implementations • 30 May 2018 • Behnam Neyshabur, Zhiyuan Li, Srinadh Bhojanapalli, Yann Lecun, Nathan Srebro
Despite existing work on ensuring generalization of neural networks in terms of scale sensitive complexity measures, such as norms, margin and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization.
1 code implementation • 3 Apr 2018 • Othman Sbai, Mohamed Elhoseiny, Antoine Bordes, Yann Lecun, Camille Couprie
Can an algorithm create original and compelling fashion designs to serve as an inspirational assistant?
1 code implementation • ECCV 2018 • Pauline Luc, Camille Couprie, Yann Lecun, Jakob Verbeek
We apply the "detection head'" of Mask R-CNN on the predicted features to produce the instance segmentation of future frames.
1 code implementation • ICLR 2018 • Xiang Zhang, Yann Lecun
The proposed model is a multi-stage deep convolutional encoder-decoder framework using residual connections, containing up to 160 parameterized layers.
no code implementations • ICLR 2018 • Mikael Henaff, Junbo Zhao, Yann Lecun
In this work we introduce a new framework for performing temporal predictions in the presence of uncertainty.
12 code implementations • CVPR 2018 • Du Tran, Heng Wang, Lorenzo Torresani, Jamie Ray, Yann Lecun, Manohar Paluri
In this paper we discuss several forms of spatiotemporal convolutions for video analysis and study their effects on action recognition.
Ranked #4 on Action Recognition on Sports-1M
4 code implementations • 14 Nov 2017 • Mikael Henaff, Junbo Zhao, Yann Lecun
In this work we introduce a new framework for performing temporal predictions in the presence of uncertainty.
no code implementations • 1 Sep 2017 • Cinna Wu, Mark Tygert, Yann Lecun
We define a metric that, inter alia, can penalize failure to distinguish between a sheepdog and a skyscraper more than failure to distinguish between a sheepdog and a poodle.
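One way to realize such a penalty, sketched here as a plain tree-distance cost over a toy label hierarchy (an illustration of the stated property, not the paper's exact metric):

```python
def tree_distance_cost(pred, label, parents):
    """Cost = number of edges between predicted and true class in a label
    hierarchy, so nearby mistakes are cheaper than distant ones."""
    def path_to_root(c):
        path = [c]
        while parents[c] is not None:
            c = parents[c]
            path.append(c)
        return path
    pa, la = path_to_root(pred), path_to_root(label)
    lca = min(set(pa) & set(la), key=pa.index)     # lowest common ancestor
    return pa.index(lca) + la.index(lca)

# Toy hierarchy: root -> {animal -> {sheepdog, poodle}, building -> {skyscraper}}
parents = {"root": None, "animal": "root", "building": "root",
           "sheepdog": "animal", "poodle": "animal", "skyscraper": "building"}
print(tree_distance_cost("sheepdog", "poodle", parents))      # 2 (mild error)
print(tree_distance_cost("sheepdog", "skyscraper", parents))  # 4 (severe error)
```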
3 code implementations • 8 Aug 2017 • Xiang Zhang, Yann Lecun
This article offers an empirical study on the different ways of encoding Chinese, Japanese, Korean (CJK) and English languages for text classification.
6 code implementations • 13 Jun 2017 • Jake Zhao, Yoon Kim, Kelly Zhang, Alexander M. Rush, Yann Lecun
This adversarially regularized autoencoder (ARAE) allows us to generate natural textual outputs as well as perform manipulations in the latent space to induce change in the output space.
1 code implementation • 19 May 2017 • Mikael Henaff, William F. Whitney, Yann Lecun
Action planning using learned and differentiable forward models of the world is a general approach which has a number of desirable properties, including improved sample complexity over model-free RL methods, reuse of learned models across different tasks, and the ability to perform efficient gradient-based optimization in continuous action spaces.
2 code implementations • ICCV 2017 • Pauline Luc, Natalia Neverova, Camille Couprie, Jakob Verbeek, Yann Lecun
The ability to predict and therefore to anticipate the future is an important attribute of intelligence.
6 code implementations • ICML 2017 • Li Jing, Yichen Shen, Tena Dubček, John Peurifoy, Scott Skirlo, Yann Lecun, Max Tegmark, Marin Soljačić
Using unitary (instead of general) matrices in artificial neural networks (ANNs) is a promising way to solve the gradient explosion/vanishing problem, as well as to enable ANNs to learn long-term correlations in the data.
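For illustration, one simple way to keep a recurrent matrix exactly orthogonal in PyTorch is the matrix exponential of a skew-symmetric matrix; note this is a generic parameterization, not the paper's efficient unitary (EUNN) construction:

```python
import torch
import torch.nn as nn

class OrthogonalRNNCell(nn.Module):
    """Recurrent matrix kept exactly orthogonal as exp(A - A^T), so the
    linear part of the recurrence neither explodes nor vanishes."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.a = nn.Parameter(torch.zeros(hidden_size, hidden_size))
        self.w_in = nn.Linear(input_size, hidden_size)

    def forward(self, x, h):
        w_rec = torch.matrix_exp(self.a - self.a.t())  # orthogonal by construction
        return torch.tanh(self.w_in(x) + h @ w_rec.t())
```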
4 code implementations • 12 Dec 2016 • Mikael Henaff, Jason Weston, Arthur Szlam, Antoine Bordes, Yann Lecun
The EntNet sets a new state-of-the-art on the bAbI tasks, and is the first method to solve all the tasks in the 10k training examples setting; a sketch of its gated memory update appears below.
Ranked #3 on Question Answering on bAbi
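A hedged PyTorch sketch of an EntNet-style memory update, as I read the model; the activation choice and normalization details are simplified:

```python
import torch
import torch.nn.functional as F

def entnet_update(h, w, s, U, V, W):
    """Gated write to M memory slots. h: (M, D) slot contents, w: (M, D)
    slot keys, s: (D,) encoded input, U/V/W: (D, D) shared weights."""
    gate = torch.sigmoid(h @ s + w @ s)                 # (M,) content + key match
    cand = F.relu(h @ U.t() + w @ V.t() + s @ W.t())    # candidate per slot
    h = h + gate.unsqueeze(1) * cand                    # gated additive write
    return F.normalize(h, dim=1)                        # normalization acts as forgetting
```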
no code implementations • NeurIPS 2016 • Michael F. Mathieu, Junbo Jake Zhao, Aditya Ramesh, Pablo Sprechmann, Yann Lecun
The only available source of supervision during the training process comes from our ability to distinguish among different observations belonging to the same category.
no code implementations • 24 Nov 2016 • Michael M. Bronstein, Joan Bruna, Yann Lecun, Arthur Szlam, Pierre Vandergheynst
In many applications, such geometric data are large and complex (in the case of social networks, on the scale of billions), and are natural targets for machine learning techniques.
no code implementations • 22 Nov 2016 • Levent Sagun, Leon Bottou, Yann Lecun
We look at the eigenvalues of the Hessian of a loss function before and after training.
3 code implementations • 10 Nov 2016 • Michael Mathieu, Junbo Zhao, Pablo Sprechmann, Aditya Ramesh, Yann Lecun
During training, the only available source of supervision comes from our ability to distinguish among different observations belonging to the same class.
2 code implementations • 6 Nov 2016 • Pratik Chaudhari, Anna Choromanska, Stefano Soatto, Yann Lecun, Carlo Baldassi, Christian Borgs, Jennifer Chayes, Levent Sagun, Riccardo Zecchina
This paper proposes a new optimization algorithm called Entropy-SGD for training deep neural networks that is motivated by the local geometry of the energy landscape.
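A numpy sketch of one Entropy-SGD outer step as I read the algorithm: an inner Langevin (SGLD) loop samples around the current weights to build a local average, and the outer update moves toward it (i.e., toward wide valleys of the energy landscape). All hyperparameters here are illustrative:

```python
import numpy as np

def entropy_sgd_step(x, grad_f, eta=0.1, eta_inner=0.01, gamma=0.03,
                     L=20, eps=1e-4, rng=np.random.default_rng(0)):
    """One outer step: inner SGLD loop builds a local average mu, then x
    moves toward mu."""
    xp, mu = x.copy(), x.copy()
    for _ in range(L):
        noise = np.sqrt(eta_inner) * eps * rng.standard_normal(x.shape)
        xp = xp - eta_inner * (grad_f(xp) - gamma * (x - xp)) + noise
        mu = 0.75 * mu + 0.25 * xp                 # running average of the samples
    return x - eta * gamma * (x - mu)
```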
3 code implementations • 11 Sep 2016 • Junbo Zhao, Michael Mathieu, Yann Lecun
We introduce the "Energy-based Generative Adversarial Network" model (EBGAN) which views the discriminator as an energy function that attributes low energies to the regions near the data manifold and higher energies to other regions.
23 code implementations • EACL 2017 • Alexis Conneau, Holger Schwenk, Loïc Barrault, Yann Lecun
The dominant approaches for many NLP tasks are recurrent neural networks, in particular LSTMs, and convolutional neural networks.
Ranked #17 on Text Classification on AG News
no code implementations • 5 Jun 2016 • Kevin Jarrett, Koray Kavukcuoglu, Karol Gregor, Yann Lecun
We also introduce a new single phase supervised learning procedure that places an L1 penalty on the output state of each layer of the network.
1 code implementation • 22 Feb 2016 • Mikael Henaff, Arthur Szlam, Yann Lecun
Although RNNs have been shown to be powerful tools for processing sequential data, finding architectures or optimization strategies that allow them to model very long term dependencies is still an active area of research.
no code implementations • 19 Nov 2015 • Levent Sagun, Thomas Trogdon, Yann Lecun
Given an algorithm, which we take to be both the optimization routine and the form of the random landscape, the fluctuations of the halting time follow a distribution that, after centering and scaling, remains unchanged even when the distribution on the landscape is changed.
1 code implementation • 18 Nov 2015 • Joan Bruna, Pablo Sprechmann, Yann Lecun
Inverse problems in image and audio, and super-resolution in particular, can be seen as high-dimensional structured prediction problems, where the goal is to characterize the conditional distribution of a high-resolution output given its low-resolution corrupted observation.
5 code implementations • 17 Nov 2015 • Michael Mathieu, Camille Couprie, Yann Lecun
Learning to predict future images from a video sequence involves the construction of an internal representation that models the image evolution accurately, and therefore, to some degree, its content and dynamics.
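This paper is best known for training frame predictors with adversarial and gradient-difference terms in place of plain MSE; here is a sketch of a gradient-difference loss in PyTorch (my reading of the GDL idea, with an illustrative exponent):

```python
import torch

def gradient_difference_loss(pred, target, alpha=1.0):
    """Penalize mismatch between the spatial gradients of predicted and
    true frames (N, C, H, W), which counteracts the blur of plain MSE."""
    dyp = pred[..., 1:, :] - pred[..., :-1, :]
    dxp = pred[..., :, 1:] - pred[..., :, :-1]
    dyt = target[..., 1:, :] - target[..., :-1, :]
    dxt = target[..., :, 1:] - target[..., :, :-1]
    return ((dyp.abs() - dyt.abs()).abs().pow(alpha).mean()
            + (dxp.abs() - dxt.abs()).abs().pow(alpha).mean())
```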
no code implementations • 16 Nov 2015 • Anna Choromanska, Krzysztof Choromanski, Mariusz Bojarski, Tony Jebara, Sanjiv Kumar, Yann Lecun
We prove several theoretical results showing that projections via various structured matrices followed by nonlinear mappings accurately preserve the angular distance between input high-dimensional vectors.
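The property is easy to check in the unstructured baseline case: with a Gaussian projection followed by a sign nonlinearity, the fraction of disagreeing signs estimates the angle (the paper's structured matrices replace the Gaussian one to save time and memory):

```python
import numpy as np

def estimated_angle(x, y, k=4096, rng=np.random.default_rng(0)):
    """Angle between x and y recovered from sign-projected codes: the
    fraction of disagreeing signs times pi estimates the angular distance."""
    G = rng.standard_normal((k, len(x)))           # unstructured Gaussian projection
    disagree = np.mean(np.sign(G @ x) != np.sign(G @ y))
    return np.pi * disagree

x, y = np.array([1.0, 0.0, 0.0]), np.array([1.0, 1.0, 0.0])
print(estimated_angle(x, y), np.pi / 4)            # both ~0.785
```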
no code implementations • 11 Nov 2015 • Xiang Zhang, Yann Lecun
This paper shows that simply prescribing "none of the above" labels to unlabeled data has a beneficial regularization effect on supervised learning.
Ranked #135 on Image Classification on CIFAR-10
1 code implementation • 20 Oct 2015 • Jure Žbontar, Yann Lecun
We approach the problem by learning a similarity measure on small image patches using a convolutional neural network.
no code implementations • 29 Sep 2015 • Tom Sercu, Christian Puhrsch, Brian Kingsbury, Yann Lecun
However, CNNs in LVCSR have not kept pace with recent advances in other domains where deeper neural networks provide superior performance.
Ranked #17 on Speech Recognition on Switchboard + Hub500
28 code implementations • NeurIPS 2015 • Xiang Zhang, Junbo Zhao, Yann Lecun
This article offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification; a sketch of the character quantization appears below.
Ranked #16 on Sentiment Analysis on Yelp Fine-grained classification
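A small numpy sketch of the character quantization these models consume; the alphabet below is a close variant of the paper's 70-symbol one (which also includes a newline), while the 1014-character frame length is the paper's:

```python
import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-,;.!?:'\"/\\|_@#$%^&*~`+=<>()[]{}"

def quantize(text, length=1014):
    """One-hot column per character over a fixed alphabet; characters
    outside the alphabet and padding positions are all-zero columns."""
    index = {c: i for i, c in enumerate(ALPHABET)}
    x = np.zeros((len(ALPHABET), length), dtype=np.float32)
    for pos, ch in enumerate(text.lower()[:length]):
        if ch in index:
            x[index[ch], pos] = 1.0
    return x

print(quantize("Character-level ConvNets!").shape)  # (len(ALPHABET), 1014)
```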
3 code implementations • 16 Jun 2015 • Mikael Henaff, Joan Bruna, Yann Lecun
Deep Learning's recent successes have mostly relied on Convolutional Networks, which exploit fundamental statistical properties of images, sounds and video data: local stationarity and multi-scale compositional structure, which allow expressing long-range interactions in terms of shorter, localized interactions.
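For reference, a numpy sketch of the basic spectral construction this line of graph papers builds on: filter a node signal in the eigenbasis of the graph Laplacian (the dense eigendecomposition is shown only for clarity and does not scale to large graphs):

```python
import numpy as np

def spectral_graph_conv(x, A, theta):
    """x: (N,) signal on N nodes, A: (N, N) adjacency matrix,
    theta: (N,) learned spectral multipliers."""
    L = np.diag(A.sum(axis=1)) - A                 # combinatorial Laplacian
    _, U = np.linalg.eigh(L)                       # graph Fourier basis
    return U @ (theta * (U.T @ x))                 # transform, filter, invert
```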
no code implementations • NeurIPS 2015 • Ross Goroshin, Michael Mathieu, Yann Lecun
Training deep feature hierarchies to solve supervised learning tasks has achieved state of the art performance on many problems in computer vision.
2 code implementations • 8 Jun 2015 • Junbo Zhao, Michael Mathieu, Ross Goroshin, Yann Lecun
The objective function includes reconstruction terms that induce the hidden states in the Deconvnet to be similar to those of the Convnet.
no code implementations • 9 Apr 2015 • Ross Goroshin, Joan Bruna, Jonathan Tompson, David Eigen, Yann Lecun
Current state-of-the-art classification and detection algorithms rely on supervised training.
no code implementations • 11 Mar 2015 • Joan Bruna, Soumith Chintala, Yann Lecun, Serkan Piantino, Arthur Szlam, Mark Tygert
Courtesy of the exact correspondence, the remarkably rich and rigorous body of mathematical analysis for wavelets applies directly to (complex-valued) convnets.
3 code implementations • 5 Feb 2015 • Xiang Zhang, Yann Lecun
This article demonstrates that we can apply deep learning to text understanding from character-level inputs all the way up to abstract text concepts, using temporal convolutional networks (ConvNets).
2 code implementations • 24 Dec 2014 • Nicolas Vasilache, Jeff Johnson, Michael Mathieu, Soumith Chintala, Serkan Piantino, Yann Lecun
We examine the performance profile of Convolutional Neural Network training on the current generation of NVIDIA Graphics Processing Units.
no code implementations • 22 Dec 2014 • Pablo Sprechmann, Joan Bruna, Yann Lecun
In this report we describe an ongoing line of research for solving single-channel source separation problems.
9 code implementations • NeurIPS 2015 • Sixin Zhang, Anna Choromanska, Yann Lecun
We empirically demonstrate that in the deep learning setting, due to the existence of many local optima, allowing more exploration can lead to improved performance.
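A numpy sketch of one elastic-averaging (EASGD-style) round, to my reading of the update rule: workers follow their own gradients while being elastically pulled toward a center variable, which in turn drifts toward the workers:

```python
import numpy as np

def easgd_round(workers, center, grads, eta=0.01, rho=0.1):
    """One synchronous round of elastic averaging over a list of workers."""
    alpha = eta * rho
    new_workers = [x - eta * g - alpha * (x - center)
                   for x, g in zip(workers, grads)]
    center = center + alpha * sum(x - center for x in workers)
    return new_workers, center
```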
no code implementations • 20 Dec 2014 • Levent Sagun, V. Ugur Guney, Gerard Ben Arous, Yann Lecun
Finding minima of a real valued non-convex function over a high dimensional space is a major challenge in science.
no code implementations • ICCV 2015 • Ross Goroshin, Joan Bruna, Jonathan Tompson, David Eigen, Yann Lecun
Current state-of-the-art classification and detection algorithms rely on supervised training.
1 code implementation • 30 Nov 2014 • Anna Choromanska, Mikael Henaff, Michael Mathieu, Gérard Ben Arous, Yann Lecun
We show that for large-size decoupled networks the lowest critical values of the random loss function form a layered structure and they are located in a well-defined band lower-bounded by the global minimum.
2 code implementations • CVPR 2015 • Jonathan Tompson, Ross Goroshin, Arjun Jain, Yann Lecun, Christopher Bregler
Recent state-of-the-art performance on human-body pose estimation has been achieved with Deep Convolutional Networks (ConvNets).
Ranked #33 on Pose Estimation on MPII Human Pose
no code implementations • 26 Oct 2014 • Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Yann Lecun
We consider supervised learning with random decision trees, where the tree construction is completely random.
no code implementations • 28 Sep 2014 • Arjun Jain, Jonathan Tompson, Yann Lecun, Christoph Bregler
In this work, we propose a novel and efficient method for articulated human pose estimation in videos using a convolutional network architecture, which incorporates both color and motion features.
1 code implementation • CVPR 2015 • Jure Žbontar, Yann Lecun
We present a method for extracting depth information from a rectified image pair.
1 code implementation • NeurIPS 2014 • Jonathan Tompson, Arjun Jain, Yann Lecun, Christoph Bregler
This paper proposes a new hybrid architecture that consists of a deep Convolutional Network and a Markov Random Field.
no code implementations • 29 Apr 2014 • Michael Mathieu, Yann Lecun
A new method to represent and approximate rotation matrices is introduced.
no code implementations • NeurIPS 2014 • Emily Denton, Wojciech Zaremba, Joan Bruna, Yann Lecun, Rob Fergus
We present techniques for speeding up the test-time evaluation of large convolutional networks, designed for object recognition tasks.
4 code implementations • 21 Dec 2013 • Joan Bruna, Wojciech Zaremba, Arthur Szlam, Yann Lecun
Convolutional Neural Networks are extremely efficient architectures in image and audio recognition tasks, thanks to their ability to exploit the local translational invariance of signal classes over their domain.
4 code implementations • 21 Dec 2013 • Pierre Sermanet, David Eigen, Xiang Zhang, Michael Mathieu, Rob Fergus, Yann Lecun
This integrated framework is the winner of the localization task of the ImageNet Large Scale Visual Recognition Challenge 2013 (ILSVRC2013) and obtained very competitive results for the detection and classification tasks.
Ranked #580 on Image Classification on ImageNet
no code implementations • 20 Dec 2013 • Michael Mathieu, Mikael Henaff, Yann Lecun
Convolutional networks are one of the most widely employed architectures in computer vision and machine learning.
no code implementations • 6 Dec 2013 • David Eigen, Jason Rolfe, Rob Fergus, Yann Lecun
A key challenge in designing convolutional network models is sizing them appropriately.
no code implementations • 16 Nov 2013 • Joan Bruna, Arthur Szlam, Yann Lecun
In this work we compute lower Lipschitz bounds of $\ell_p$ pooling operators for $p=1, 2, \infty$ as well as $\ell_p$ pooling operators preceded by half-rectification layers.
1 code implementation • ICML 2013 • Li Wan, Matthew Zeiler, Sixin Zhang, Yann Lecun, Rob Fergus
When training with Dropout, a randomly selected subset of activations is set to zero within each layer; DropConnect instead sets a randomly selected subset of weights to zero (a sketch follows below).
Ranked #7 on Image Classification on MNIST
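A minimal PyTorch sketch of a DropConnect-style linear layer; the rescaling by 1/(1-p) is a common convention assumed here, not necessarily the paper's inference scheme:

```python
import torch

def dropconnect_linear(x, weight, bias, p=0.5, training=True):
    """Zero a random subset of *weights* (rather than activations) on each
    forward pass, rescaling so the expected pre-activation is unchanged."""
    if training:
        mask = (torch.rand_like(weight) > p).float()
        weight = weight * mask / (1.0 - p)
    return x @ weight.t() + bias
```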
no code implementations • 16 Jan 2013 • Tom Schaul, Yann Lecun
Recent work has established an empirically successful framework for adapting learning rates for stochastic gradient descent (SGD).
no code implementations • CVPR 2013 • Pierre Sermanet, Koray Kavukcuoglu, Soumith Chintala, Yann Lecun
Pedestrian detection is a problem of considerable practical interest.
no code implementations • 6 Jun 2012 • Tom Schaul, Sixin Zhang, Yann Lecun
The performance of stochastic gradient descent (SGD) depends critically on how learning rates are tuned and decreased over time.
2 code implementations • 18 Apr 2012 • Pierre Sermanet, Soumith Chintala, Yann Lecun
We classify digits of real-world house numbers using convolutional neural networks (ConvNets).