Search Results for author: Yann LeCun

Found 147 papers, 79 papers with code

LiveBench: A Challenging, Contamination-Free LLM Benchmark

1 code implementation • 27 Jun 2024 • Colin White, Samuel Dooley, Manley Roberts, Arka Pal, Ben Feuer, Siddhartha Jain, Ravid Shwartz-Ziv, Neel Jain, Khalid Saifullah, Siddartha Naidu, Chinmay Hegde, Yann LeCun, Tom Goldstein, Willie Neiswanger, Micah Goldblum

In this work, we introduce a new benchmark for LLMs designed to be immune to both test set contamination and the pitfalls of LLM judging and human crowdsourcing.

Instruction Following Math

Just How Flexible are Neural Networks in Practice?

no code implementations • 17 Jun 2024 • Ravid Shwartz-Ziv, Micah Goldblum, Arpit Bansal, C. Bayan Bruss, Yann LeCun, Andrew Gordon Wilson

Our findings indicate that: (1) standard optimizers find minima where the model can only fit training sets with significantly fewer samples than it has parameters; (2) convolutional networks are more parameter-efficient than MLPs and ViTs, even on randomly labeled data; (3) while stochastic training is thought to have a regularizing effect, SGD actually finds minima that fit more training data than full-batch gradient descent; (4) the difference in capacity to fit correctly labeled and incorrectly labeled samples can be predictive of generalization; (5) ReLU activation functions result in finding minima that fit more data despite being designed to avoid vanishing and exploding gradients in deep architectures.

Towards an Improved Understanding and Utilization of Maximum Manifold Capacity Representations

no code implementations • 13 Jun 2024 • Rylan Schaeffer, Victor Lecomte, Dhruv Bhandarkar Pai, Andres Carranza, Berivan Isik, Alyssa Unell, Mikail Khona, Thomas Yerxa, Yann LeCun, SueYeon Chung, Andrey Gromov, Ravid Shwartz-Ziv, Sanmi Koyejo

We then leverage tools from information theory to show that such embeddings maximize a well-known lower bound on mutual information between views, thereby connecting the geometric perspective of MMCR to the information-theoretic perspective commonly discussed in MVSSL.

Self-Supervised Learning

Hierarchical World Models as Visual Whole-Body Humanoid Controllers

no code implementations • 28 May 2024 • Nicklas Hansen, Jyothir S V, Vlad Sobal, Yann LeCun, Xiaolong Wang, Hao Su

Whole-body control for humanoids is challenging due to the high-dimensional nature of the problem, coupled with the inherent instability of a bipedal morphology.

Humanoid Control

The Entropy Enigma: Success and Failure of Entropy Minimization

1 code implementation • 8 May 2024 • Ori Press, Ravid Shwartz-Ziv, Yann LeCun, Matthias Bethge

After many steps of optimization, EM makes the model embed test images far away from the embeddings of training images, which results in a degradation of accuracy.

Self-Supervised Learning

EgoPet: Egomotion and Interaction Data from an Animal's Perspective

no code implementations • 15 Apr 2024 • Amir Bar, Arya Bakhtiar, Danny Tran, Antonio Loquercio, Jathushan Rajasegaran, Yann LeCun, Amir Globerson, Trevor Darrell

Animals perceive the world to plan their actions and interact with other agents to accomplish complex tasks, demonstrating capabilities that are still unmatched by AI systems.

Learning and Leveraging World Models in Visual Representation Learning

no code implementations • 1 Mar 2024 • Quentin Garrido, Mahmoud Assran, Nicolas Ballas, Adrien Bardes, Laurent Najman, Yann LeCun

Joint-Embedding Predictive Architecture (JEPA) has emerged as a promising self-supervised approach that learns by leveraging a world model.

Representation Learning

Learning by Reconstruction Produces Uninformative Features For Perception

no code implementations • 17 Feb 2024 • Randall Balestriero, Yann LeCun

Despite the interpretability of reconstruction and generation, we identify a misalignment between learning by reconstruction and learning for perception.

Denoising Representation Learning

Revisiting Feature Prediction for Learning Visual Representations from Video

1 code implementation • arXiv preprint 2024 • Adrien Bardes, Quentin Garrido, Jean Ponce, Xinlei Chen, Michael Rabbat, Yann LeCun, Mahmoud Assran, Nicolas Ballas

This paper explores feature prediction as a stand-alone objective for unsupervised learning from video and introduces V-JEPA, a collection of vision models trained solely using a feature prediction objective, without the use of pretrained image encoders, text, negative examples, reconstruction, or other sources of supervision.

G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering

1 code implementation • 12 Feb 2024 • Xiaoxin He, Yijun Tian, Yifei Sun, Nitesh V. Chawla, Thomas Laurent, Yann LeCun, Xavier Bresson, Bryan Hooi

Given a graph with textual attributes, we enable users to `chat with their graph': that is, to ask questions about the graph using a conversational interface.

Common Sense Reasoning Graph Classification +5

Fast and Exact Enumeration of Deep Networks Partitions Regions

no code implementations • 20 Jan 2024 • Randall Balestriero, Yann LeCun

One fruitful formulation of Deep Networks (DNs) enabling their theoretical study and providing practical guidelines to practitioners relies on Piecewise Affine Splines.

Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs

1 code implementation • CVPR 2024 • Shengbang Tong, Zhuang Liu, Yuexiang Zhai, Yi Ma, Yann LeCun, Saining Xie

To understand the roots of these errors, we explore the gap between the visual embedding space of CLIP and vision-only self-supervised learning.

Representation Learning Self-Supervised Learning +1

Gradient-based Planning with World Models

no code implementations • 28 Dec 2023 • Jyothir S V, Siddhartha Jalagam, Yann LeCun, Vlad Sobal

The enduring challenge in the field of artificial intelligence has been the control of systems to achieve desired behaviours.

Model Predictive Control

GAIA: a benchmark for General AI Assistants

1 code implementation • 21 Nov 2023 • Grégoire Mialon, Clémentine Fourrier, Craig Swift, Thomas Wolf, Yann LeCun, Thomas Scialom

GAIA's philosophy departs from the current trend in AI benchmarks, which targets tasks that are ever more difficult for humans.

Philosophy

URLOST: Unsupervised Representation Learning without Stationarity or Topology

no code implementations • 6 Oct 2023 • Zeyu Yun, Juexiao Zhang, Bruno Olshausen, Yann LeCun, Yubei Chen

Unsupervised representation learning has seen tremendous progress but is constrained by its reliance on data modality-specific stationarity and topology, a limitation not found in biological intelligence systems.

Representation Learning

MC-JEPA: A Joint-Embedding Predictive Architecture for Self-Supervised Learning of Motion and Content Features

no code implementations • 24 Jul 2023 • Adrien Bardes, Jean Ponce, Yann LeCun

Self-supervised learning of visual representations has focused on learning content features, which do not capture object motion or location but instead identify and differentiate objects in images and videos.

Optical Flow Estimation Self-Supervised Learning +1

Self-Supervised Learning with Lie Symmetries for Partial Differential Equations

1 code implementation • NeurIPS 2023 • Grégoire Mialon, Quentin Garrido, Hannah Lawrence, Danyal Rehman, Yann LeCun, Bobak T. Kiani

Machine learning for differential equations paves the way for computationally efficient alternatives to numerical solvers, with potentially broad impacts in science and engineering.

Representation Learning Self-Supervised Learning

Variance-Covariance Regularization Improves Representation Learning

no code implementations • 23 Jun 2023 • Jiachen Zhu, Katrina Evtimova, Yubei Chen, Ravid Shwartz-Ziv, Yann LeCun

In summary, VCReg offers a universally applicable regularization framework that significantly advances transfer learning and highlights the connection between gradient starvation, neural collapse, and feature transferability.

Long-tail Learning Representation Learning +2

Introduction to Latent Variable Energy-Based Models: A Path Towards Autonomous Machine Intelligence

no code implementations • 5 Jun 2023 • Anna Dawid, Yann LeCun

Current automated systems have crucial limitations that need to be addressed before artificial intelligence can reach human-like levels and bring new technological revolutions.

Self-Driving Cars

Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning

3 code implementations • 31 May 2023 • Xiaoxin He, Xavier Bresson, Thomas Laurent, Adam Perold, Yann LeCun, Bryan Hooi

With the advent of powerful large language models (LLMs) such as GPT or Llama2, which demonstrate an ability to reason and to utilize general knowledge, there is a growing need for techniques which combine the textual modelling abilities of LLMs with the structural learning capabilities of GNNs.

Ranked #2 on Node Property Prediction on ogbn-arxiv (using extra training data)

Decision Making General Knowledge +5

Reverse Engineering Self-Supervised Learning

1 code implementation • NeurIPS 2023 • Ido Ben-Shaul, Ravid Shwartz-Ziv, Tomer Galanti, Shai Dekel, Yann LeCun

Self-supervised learning (SSL) is a powerful tool in machine learning, but understanding the learned representations and their underlying mechanisms remains a challenge.

Clustering Representation Learning +1

To Compress or Not to Compress- Self-Supervised Learning and Information Theory: A Review

no code implementations • 19 Apr 2023 • Ravid Shwartz-Ziv, Yann LeCun

Information theory, and notably the information bottleneck principle, has been pivotal in shaping deep neural networks.

Self-Supervised Learning

EMP-SSL: Towards Self-Supervised Learning in One Training Epoch

3 code implementations • 8 Apr 2023 • Shengbang Tong, Yubei Chen, Yi Ma, Yann LeCun

Recently, self-supervised learning (SSL) has achieved tremendous success in learning image representation.

Quantization Self-Supervised Learning

Active Self-Supervised Learning: A Few Low-Cost Relationships Are All You Need

no code implementations • ICCV 2023 • Vivien Cabannes, Leon Bottou, Yann LeCun, Randall Balestriero

Third, it provides a proper active learning framework yielding low-cost solutions for annotating datasets, arguably bridging the gap between the theory and practice of active learning based on queries about semantic relationships between inputs that are simple for non-experts to answer.

Active Learning Self-Supervised Learning

An Information-Theoretic Perspective on Variance-Invariance-Covariance Regularization

no code implementations • 1 Mar 2023 • Ravid Shwartz-Ziv, Randall Balestriero, Kenji Kawaguchi, Tim G. J. Rudner, Yann LeCun

Variance-Invariance-Covariance Regularization (VICReg) is a self-supervised learning (SSL) method that has shown promising results on a variety of tasks.

Self-Supervised Learning Transfer Learning

Self-supervised learning of Split Invariant Equivariant representations

1 code implementation • 14 Feb 2023 • Quentin Garrido, Laurent Najman, Yann LeCun

We hope that both our introduced dataset and approach will enable learning richer representations without supervision in more complex scenarios.

Self-Supervised Learning

The SSL Interplay: Augmentations, Inductive Bias, and Generalization

no code implementations • 6 Feb 2023 • Vivien Cabannes, Bobak T. Kiani, Randall Balestriero, Yann LeCun, Alberto Bietti

Self-supervised learning (SSL) has emerged as a powerful framework to learn representations from raw data without supervision.

Data Augmentation Inductive Bias +1

A Generalization of ViT/MLP-Mixer to Graphs

3 code implementations • 27 Dec 2022 • Xiaoxin He, Bryan Hooi, Thomas Laurent, Adam Perold, Yann LeCun, Xavier Bresson

First, they capture long-range dependencies and mitigate the issue of over-squashing, as demonstrated on the Long Range Graph Benchmark and TreeNeighbourMatch datasets.

Graph Classification Graph Regression +1

Joint Embedding Predictive Architectures Focus on Slow Features

1 code implementation • 20 Nov 2022 • Vlad Sobal, Jyothir S V, Siddhartha Jalagam, Nicolas Carion, Kyunghyun Cho, Yann LeCun

Many common methods for learning a world model for pixel-based environments use generative architectures trained with pixel-level reconstruction objectives.

POLICE: Provably Optimal Linear Constraint Enforcement for Deep Neural Networks

1 code implementation • 2 Nov 2022 • Randall Balestriero, Yann LeCun

In this paper we propose the first provable affine constraint enforcement method for DNNs that requires only minimal changes to a given DNN's forward pass, is computationally friendly, and leaves the optimization of the DNN's parameters unconstrained, i.e., standard gradient-based methods can be employed.

Unsupervised Learning of Structured Representations via Closed-Loop Transcription

1 code implementation • 30 Oct 2022 • Shengbang Tong, Xili Dai, Yubei Chen, Mingyang Li, Zengyi Li, Brent Yi, Yann LeCun, Yi Ma

This paper proposes an unsupervised method for learning a unified representation that serves both discriminative and generative purposes.

VoLTA: Vision-Language Transformer with Weakly-Supervised Local-Feature Alignment

1 code implementation • 9 Oct 2022 • Shraman Pramanick, Li Jing, Sayan Nag, Jiachen Zhu, Hardik Shah, Yann LeCun, Rama Chellappa

Extensive experiments on a wide range of vision- and vision-language downstream tasks demonstrate the effectiveness of VoLTA on fine-grained applications without compromising the coarse-grained downstream performance, often outperforming methods using significantly more caption and box annotations.

object-detection Object Detection +2

RankMe: Assessing the downstream performance of pretrained self-supervised representations by their rank

no code implementations • 5 Oct 2022 • Quentin Garrido, Randall Balestriero, Laurent Najman, Yann LeCun

Joint-Embedding Self Supervised Learning (JE-SSL) has seen a rapid development, with the emergence of many method variations but only few principled guidelines that would help practitioners to successfully deploy them.

Self-Supervised Learning

VICRegL: Self-Supervised Learning of Local Visual Features

3 code implementations • 4 Oct 2022 • Adrien Bardes, Jean Ponce, Yann LeCun

Most recent self-supervised methods for learning image representations focus on either producing a global feature with invariance properties, or producing a set of local features.

Segmentation Self-Supervised Learning

Minimalistic Unsupervised Learning with the Sparse Manifold Transform

no code implementations • 30 Sep 2022 • Yubei Chen, Zeyu Yun, Yi Ma, Bruno Olshausen, Yann LeCun

Though there remains a small performance gap between our simple constructive model and SOTA methods, the evidence points to this as a promising direction for achieving a principled and white-box approach to unsupervised learning.

Self-Supervised Learning Sparse Representation-based Classification +3

Variance Covariance Regularization Enforces Pairwise Independence in Self-Supervised Representations

no code implementations • 29 Sep 2022 • Grégoire Mialon, Randall Balestriero, Yann LeCun

Self-Supervised Learning (SSL) methods such as VICReg, Barlow Twins or W-MSE avoid collapse of their joint embedding architectures by constraining or regularizing the covariance matrix of their projector's output.
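For readers unfamiliar with this family of regularizers, below is a minimal sketch of a covariance penalty on a batch of projector outputs, in the spirit of VICReg's covariance term; the exact scaling and the accompanying variance and invariance terms differ between VICReg, Barlow Twins, and W-MSE, so the weighting shown is illustrative only.

```python
import torch

def covariance_penalty(z: torch.Tensor) -> torch.Tensor:
    """Penalize off-diagonal entries of the embedding covariance matrix.

    z: (batch_size, dim) projector outputs of one branch.
    """
    n, d = z.shape
    z = z - z.mean(dim=0)                              # center each dimension over the batch
    cov = (z.T @ z) / (n - 1)                          # (d, d) sample covariance
    off_diag = cov - torch.diag(torch.diagonal(cov))   # zero out the diagonal
    return off_diag.pow(2).sum() / d

# Usage (schematic): loss = invariance_mse + mu * (covariance_penalty(z_a) + covariance_penalty(z_b))
```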

Domain Generalization Self-Supervised Learning

Joint Embedding Self-Supervised Learning in the Kernel Regime

no code implementations • 29 Sep 2022 • Bobak T. Kiani, Randall Balestriero, Yubei Chen, Seth Lloyd, Yann LeCun

The fundamental goal of self-supervised learning (SSL) is to produce useful representations of data without access to any labels for classifying the data.

Self-Supervised Learning

Light-weight probing of unsupervised representations for Reinforcement Learning

1 code implementation • 25 Aug 2022 • Wancong Zhang, Anthony GX-Chen, Vlad Sobal, Yann LeCun, Nicolas Carion

Unsupervised visual representation learning offers the opportunity to leverage large corpora of unlabeled trajectories to form useful visual representations, which can benefit the training of reinforcement learning (RL) algorithms.

reinforcement-learning Reinforcement Learning +3

What Do We Maximize in Self-Supervised Learning?

no code implementations • 20 Jul 2022 • Ravid Shwartz-Ziv, Randall Balestriero, Yann LeCun

In this paper, we examine self-supervised learning methods, particularly VICReg, to provide an information-theoretical understanding of their construction.

Self-Supervised Learning Transfer Learning

TiCo: Transformation Invariance and Covariance Contrast for Self-Supervised Visual Representation Learning

2 code implementations • 21 Jun 2022 • Jiachen Zhu, Rafael M. Moraes, Serkan Karakulak, Vlad Sobol, Alfredo Canziani, Yann LeCun

Similar to other recent self-supervised learning methods, our method is based on maximizing the agreement among embeddings of different distorted versions of the same image, which pushes the encoder to produce transformation invariant representations.

Representation Learning Self-Supervised Learning

Masked Siamese ConvNets

no code implementations • 15 Jun 2022 • Li Jing, Jiachen Zhu, Yann LeCun

Self-supervised learning has shown superior performances over supervised methods on various vision benchmarks.

Image Classification Inductive Bias +4

On the duality between contrastive and non-contrastive self-supervised learning

no code implementations • 3 Jun 2022 • Quentin Garrido, Yubei Chen, Adrien Bardes, Laurent Najman, Yann LeCun

Recent approaches in self-supervised learning of image representations can be categorized into different families of methods and, in particular, can be divided into contrastive and non-contrastive approaches.

Self-Supervised Learning

Contrastive and Non-Contrastive Self-Supervised Learning Recover Global and Local Spectral Embedding Methods

no code implementations • 23 May 2022 • Randall Balestriero, Yann LeCun

Self-Supervised Learning (SSL) surmises that inputs and pairwise positive relationships are enough to learn meaningful representations.

Self-Supervised Learning

Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors

1 code implementation • 20 May 2022 • Ravid Shwartz-Ziv, Micah Goldblum, Hossein Souri, Sanyam Kapoor, Chen Zhu, Yann LeCun, Andrew Gordon Wilson

Deep learning is increasingly moving towards a transfer learning paradigm whereby large foundation models are fine-tuned on downstream tasks, starting from an initialization learned on the source task.

Deep Learning Transfer Learning

The Effects of Regularization and Data Augmentation are Class Dependent

no code implementations • 7 Apr 2022 • Randall Balestriero, Leon Bottou, Yann LeCun

The optimal amount of DA or weight decay found from cross-validation leads to disastrous model performance on some classes, e.g., on ImageNet with a ResNet-50, the "barn spider" classification test accuracy falls from $68\%$ to $46\%$ merely by introducing random-crop DA during training.
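Class-dependent effects like the one above only show up when accuracy is broken down per class rather than averaged; a small helper sketch (PyTorch tensors assumed, otherwise framework-agnostic) for doing that:

```python
import torch

def per_class_accuracy(logits: torch.Tensor, labels: torch.Tensor, num_classes: int):
    """Return a list with the test accuracy of each class (nan if the class is absent)."""
    preds = logits.argmax(dim=1)
    accs = []
    for c in range(num_classes):
        mask = labels == c
        accs.append((preds[mask] == c).float().mean().item() if mask.any() else float("nan"))
    return accs
```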

Data Augmentation

projUNN: efficient method for training deep networks with unitary matrices

1 code implementation • 10 Mar 2022 • Bobak Kiani, Randall Balestriero, Yann LeCun, Seth Lloyd

In learning with recurrent or very deep feed-forward networks, employing unitary matrices in each layer can be very effective at maintaining long-range stability.

A Data-Augmentation Is Worth A Thousand Samples: Exact Quantification From Analytical Augmented Sample Moments

no code implementations • 16 Feb 2022 • Randall Balestriero, Ishan Misra, Yann LeCun

We show that for a training loss to be stable under DA sampling, the model's saliency map (the gradient of the loss with respect to the model's input) must align with the smallest eigenvector of the sample variance under the considered DA augmentation, hinting at a possible explanation of why models tend to shift their focus from edges to textures.
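The saliency map mentioned here is simply the gradient of the training loss with respect to the model's input; a minimal sketch (the model, loss, and data are placeholders, not the paper's setup):

```python
import torch
import torch.nn.functional as F

def saliency_map(model: torch.nn.Module, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Gradient of the loss with respect to the input batch x."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return x.grad.detach()   # same shape as x: one gradient value per input coordinate
```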

Data Augmentation

Neural Manifold Clustering and Embedding

1 code implementation • 24 Jan 2022 • Zengyi Li, Yubei Chen, Yann LeCun, Friedrich T. Sommer

We argue that achieving manifold clustering with neural networks requires two essential ingredients: a domain-specific constraint that ensures the identification of the manifolds, and a learning algorithm for embedding each manifold to a linear subspace in the feature space.

Clustering Data Augmentation +2

Sparse Coding with Multi-Layer Decoders using Variance Regularization

1 code implementation • 16 Dec 2021 • Katrina Evtimova, Yann LeCun

Sparse coding with an $l_1$ penalty and a learned linear dictionary requires regularization of the dictionary to prevent a collapse in the $l_1$ norms of the codes.
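For context, the classical objective referred to here is $\min_{D, z} \frac{1}{2}\|x - Dz\|_2^2 + \lambda\|z\|_1$, where collapse is usually prevented by constraining the dictionary columns to unit norm; the paper instead regularizes the variance of the codes and uses multi-layer decoders. A rough sketch of code inference with the standard ISTA iteration, under those classical assumptions:

```python
import numpy as np

def soft_threshold(v: np.ndarray, t: float) -> np.ndarray:
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(x: np.ndarray, D: np.ndarray, lam: float = 0.1, n_steps: int = 100) -> np.ndarray:
    """Infer a sparse code z minimizing 0.5*||x - D z||^2 + lam*||z||_1."""
    lr = 1.0 / np.linalg.norm(D, 2) ** 2      # step size from the Lipschitz constant of D^T D
    z = np.zeros(D.shape[1])
    for _ in range(n_steps):
        grad = D.T @ (D @ z - x)              # gradient of the reconstruction term
        z = soft_threshold(z - lr * grad, lr * lam)
    return z
```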

Decoder Denoising

Learning in High Dimension Always Amounts to Extrapolation

no code implementations • 18 Oct 2021 • Randall Balestriero, Jerome Pesenti, Yann LeCun

The notion of interpolation and extrapolation is fundamental in various fields from deep learning to function approximation.

Vocal Bursts Intensity Prediction

Understanding Dimensional Collapse in Contrastive Self-supervised Learning

1 code implementation • ICLR 2022 • Li Jing, Pascal Vincent, Yann LeCun, Yuandong Tian

It has been shown that non-contrastive methods suffer from a lesser collapse problem of a different nature: dimensional collapse, whereby the embedding vectors end up spanning a lower-dimensional subspace instead of the entire available embedding space.

Contrastive Learning Learning Theory +2

Decoupled Contrastive Learning

4 code implementations • 13 Oct 2021 • Chun-Hsiao Yeh, Cheng-Yao Hong, Yen-Chi Hsu, Tyng-Luh Liu, Yubei Chen, Yann LeCun

Further, DCL can be combined with the SOTA contrastive learning method NNCLR to achieve 72.3% ImageNet-1K top-1 accuracy with a batch size of 512 in 400 epochs, which represents a new SOTA in contrastive learning.

Contrastive Learning Self-Supervised Learning

Compact and Optimal Deep Learning with Recurrent Parameter Generators

1 code implementation • 15 Jul 2021 • Jiayun Wang, Yubei Chen, Stella X. Yu, Brian Cheung, Yann LeCun

We propose a drastically different approach to compact and optimal deep learning: we decouple the degrees of freedom (DoF) from the actual number of parameters of a model, and optimize a small DoF with predefined random linear constraints for a large model of arbitrary architecture, in one-stage end-to-end learning.

Ranked #97 on Image Classification on ObjectNet (using extra training data)

Deep Learning Image Classification +1

Barlow Twins: Self-Supervised Learning via Redundancy Reduction

24 code implementations • 4 Mar 2021 • Jure Zbontar, Li Jing, Ishan Misra, Yann LeCun, Stéphane Deny

This causes the embedding vectors of distorted versions of a sample to be similar, while minimizing the redundancy between the components of these vectors.
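A compact sketch of the redundancy-reduction objective described above: the cross-correlation matrix between the batch-normalized embeddings of two distorted views is pushed toward the identity, so matching dimensions agree while distinct dimensions decorrelate (the off-diagonal weight below is illustrative):

```python
import torch

def barlow_twins_loss(z_a: torch.Tensor, z_b: torch.Tensor, lambd: float = 5e-3) -> torch.Tensor:
    """z_a, z_b: (batch, dim) embeddings of two distorted views of the same batch."""
    n, _ = z_a.shape
    z_a = (z_a - z_a.mean(0)) / z_a.std(0)    # normalize each dimension over the batch
    z_b = (z_b - z_b.mean(0)) / z_b.std(0)
    c = (z_a.T @ z_b) / n                     # (dim, dim) cross-correlation matrix
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()
    return on_diag + lambd * off_diag
```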

General Classification Object Detection +3

Neural Potts Model

no code implementations • 1 Jan 2021 • Tom Sercu, Robert Verkuil, Joshua Meier, Brandon Amos, Zeming Lin, Caroline Chen, Jason Liu, Yann LeCun, Alexander Rives

We propose the Neural Potts Model objective as an amortized optimization problem.

Implicit Rank-Minimizing Autoencoder

3 code implementations • NeurIPS 2020 • Li Jing, Jure Zbontar, Yann LeCun

An important component of autoencoders is the method by which the information capacity of the latent representation is minimized or limited.

Decoder Image Generation +2

Inspirational Adversarial Image Generation

1 code implementation • 17 Jun 2019 • Baptiste Rozière, Morgane Riviere, Olivier Teytaud, Jérémy Rapin, Yann LeCun, Camille Couprie

We design a simple optimization method to find the optimal latent parameters corresponding to the closest generation to any input inspirational image.

Image Generation

The role of over-parametrization in generalization of neural networks

1 code implementation • ICLR 2019 • Behnam Neyshabur, Zhiyuan Li, Srinadh Bhojanapalli, Yann LeCun, Nathan Srebro

Despite existing work on ensuring generalization of neural networks in terms of scale sensitive complexity measures, such as norms, margin and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization.

Learning about an exponential amount of conditional distributions

1 code implementation • NeurIPS 2019 • Mohamed Ishmael Belghazi, Maxime Oquab, Yann LeCun, David Lopez-Paz

We introduce the Neural Conditioner (NC), a self-supervised machine able to learn about all the conditional distributions of a random vector $X$.

General Classification

Model-Predictive Policy Learning with Uncertainty Regularization for Driving in Dense Traffic

1 code implementation • ICLR 2019 • Mikael Henaff, Alfredo Canziani, Yann LeCun

Learning a policy using only observational data is challenging because the distribution of states it induces at execution time may differ from the distribution observed during training.

Rolling Shutter Correction

A Spectral Regularizer for Unsupervised Disentanglement

no code implementations • 4 Dec 2018 • Aditya Ramesh, Youngduck Choi, Yann LeCun

A generative model with a disentangled representation allows for independent control over different aspects of the output.

Disentanglement

GLoMo: Unsupervised Learning of Transferable Relational Graphs

no code implementations • NeurIPS 2018 • Zhilin Yang, Jake Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan R. Salakhutdinov, Yann LeCun

We also show that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained (including GloVe embeddings, ELMo embeddings, and task-specific RNN hidden units), or embedding-free units such as image pixels.

Image Classification Natural Language Inference +4

Adversarially-Trained Normalized Noisy-Feature Auto-Encoder for Text Generation

no code implementations • 10 Nov 2018 • Xiang Zhang, Yann LeCun

An ATNNFAE consists of an auto-encoder where the internal code is normalized on the unit sphere and corrupted by additive noise.
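A minimal sketch of just the code constraint described in this sentence (unit-sphere normalization followed by additive noise); the adversarial training and the rest of the architecture are omitted, and the noise scale is an illustrative placeholder:

```python
import torch
import torch.nn.functional as F

def normalize_and_corrupt(code: torch.Tensor, sigma: float = 0.1) -> torch.Tensor:
    """Project the latent code onto the unit sphere, then corrupt it with Gaussian noise."""
    code = F.normalize(code, p=2, dim=-1)            # unit norm along the feature axis
    return code + sigma * torch.randn_like(code)
```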

Decoder Text Generation

Learning with Reflective Likelihoods

no code implementations • 27 Sep 2018 • Adji B. Dieng, Kyunghyun Cho, David M. Blei, Yann LeCun

Furthermore, the reflective likelihood objective prevents posterior collapse when used to train stochastic auto-encoders with amortized inference.

Attribute

Comparing Dynamics: Deep Neural Networks versus Glassy Systems

no code implementations • ICML 2018 • Marco Baity-Jesi, Levent Sagun, Mario Geiger, Stefano Spigler, Gerard Ben Arous, Chiara Cammarota, Yann LeCun, Matthieu Wyart, Giulio Biroli

We analyze numerically the training dynamics of deep neural networks (DNN) by using methods developed in statistical physics of glassy systems.

GLoMo: Unsupervisedly Learned Relational Graphs as Transferable Representations

1 code implementation • 14 Jun 2018 • Zhilin Yang, Jake Zhao, Bhuwan Dhingra, Kaiming He, William W. Cohen, Ruslan Salakhutdinov, Yann LeCun

We also show that the learned graphs are generic enough to be transferred to different embeddings on which the graphs have not been trained (including GloVe embeddings, ELMo embeddings, and task-specific RNN hidden unit), or embedding-free units such as image pixels.

Image Classification Natural Language Inference +4

Backpropagation for Implicit Spectral Densities

1 code implementation • 1 Jun 2018 • Aditya Ramesh, Yann LeCun

We introduce a tool that allows us to do this even when the likelihood is not explicitly set, by instead using the implicit likelihood of the model.

Towards Understanding the Role of Over-Parametrization in Generalization of Neural Networks

2 code implementations • 30 May 2018 • Behnam Neyshabur, Zhiyuan Li, Srinadh Bhojanapalli, Yann LeCun, Nathan Srebro

Despite existing work on ensuring generalization of neural networks in terms of scale sensitive complexity measures, such as norms, margin and sharpness, these complexity measures do not offer an explanation of why neural networks generalize better with over-parametrization.

DeSIGN: Design Inspiration from Generative Networks

1 code implementation • 3 Apr 2018 • Othman Sbai, Mohamed Elhoseiny, Antoine Bordes, Yann LeCun, Camille Couprie

Can an algorithm create original and compelling fashion designs to serve as an inspirational assistant?

Image Generation Retrieval

Byte-Level Recursive Convolutional Auto-Encoder for Text

1 code implementation • ICLR 2018 • Xiang Zhang, Yann LeCun

The proposed model is a multi-stage deep convolutional encoder-decoder framework using residual connections, containing up to 160 parameterized layers.

Decoder Text Generation

Prediction Under Uncertainty with Error Encoding Networks

no code implementations • ICLR 2018 • Mikael Henaff, Junbo Zhao, Yann LeCun

In this work we introduce a new framework for performing temporal predictions in the presence of uncertainty.

Video Prediction

Prediction Under Uncertainty with Error-Encoding Networks

2 code implementations • 14 Nov 2017 • Mikael Henaff, Junbo Zhao, Yann LeCun

In this work we introduce a new framework for performing temporal predictions in the presence of uncertainty.

Video Prediction

A hierarchical loss and its problems when classifying non-hierarchically

no code implementations • 1 Sep 2017 • Cinna Wu, Mark Tygert, Yann LeCun

We define a metric that, inter alia, can penalize failure to distinguish between a sheepdog and a skyscraper more than failure to distinguish between a sheepdog and a poodle.

General Classification

Which Encoding is the Best for Text Classification in Chinese, English, Japanese and Korean?

3 code implementations • 8 Aug 2017 • Xiang Zhang, Yann LeCun

This article offers an empirical study on the different ways of encoding Chinese, Japanese, Korean (CJK) and English languages for text classification.

General Classification Text Classification

Adversarially Regularized Autoencoders

6 code implementations • 13 Jun 2017 • Jake Zhao, Yoon Kim, Kelly Zhang, Alexander M. Rush, Yann LeCun

This adversarially regularized autoencoder (ARAE) allows us to generate natural textual outputs as well as perform manipulations in the latent space to induce change in the output space.

Representation Learning Style Transfer

Model-Based Planning with Discrete and Continuous Actions

1 code implementation • 19 May 2017 • Mikael Henaff, William F. Whitney, Yann LeCun

Action planning using learned and differentiable forward models of the world is a general approach which has a number of desirable properties, including improved sample complexity over model-free RL methods, reuse of learned models across different tasks, and the ability to perform efficient gradient-based optimization in continuous action spaces.

Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs

4 code implementations • ICML 2017 • Li Jing, Yichen Shen, Tena Dubček, John Peurifoy, Scott Skirlo, Yann LeCun, Max Tegmark, Marin Soljačić

Using unitary (instead of general) matrices in artificial neural networks (ANNs) is a promising way to solve the gradient explosion/vanishing problem, as well as to enable ANNs to learn long-term correlations in the data.
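EUNN builds its unitary matrices from products of efficiently parametrized rotations; as a simpler illustration of the general idea (not the paper's construction), a real-valued recurrence matrix can be kept orthogonal, the real counterpart of unitary, by parametrizing it as the matrix exponential of a skew-symmetric matrix:

```python
import torch
import torch.nn as nn

class OrthogonalRNNCell(nn.Module):
    """Recurrent cell whose transition matrix is orthogonal by construction."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.a = nn.Parameter(0.01 * torch.randn(hidden_size, hidden_size))
        self.w_in = nn.Linear(input_size, hidden_size)

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        skew = self.a - self.a.T                   # skew-symmetric matrix
        w_rec = torch.linalg.matrix_exp(skew)      # exp of a skew-symmetric matrix is orthogonal
        return torch.relu(h @ w_rec.T + self.w_in(x))
```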

Permuted-MNIST

Tracking the World State with Recurrent Entity Networks

5 code implementations • 12 Dec 2016 • Mikael Henaff, Jason Weston, Arthur Szlam, Antoine Bordes, Yann LeCun

The EntNet sets a new state-of-the-art on the bAbI tasks, and is the first method to solve all the tasks in the 10k training examples setting.

Procedural Text Understanding Question Answering

Disentangling factors of variation in deep representation using adversarial training

no code implementations • NeurIPS 2016 • Michael F. Mathieu, Junbo Jake Zhao, Junbo Zhao, Aditya Ramesh, Pablo Sprechmann, Yann LeCun

The only available source of supervision during the training process comes from our ability to distinguish among different observations belonging to the same category.

Geometric deep learning: going beyond Euclidean data

no code implementations • 24 Nov 2016 • Michael M. Bronstein, Joan Bruna, Yann LeCun, Arthur Szlam, Pierre Vandergheynst

In many applications, such geometric data are large and complex (in the case of social networks, on the scale of billions), and are natural targets for machine learning techniques.

Deep Learning

Eigenvalues of the Hessian in Deep Learning: Singularity and Beyond

no code implementations • 22 Nov 2016 • Levent Sagun, Leon Bottou, Yann LeCun

We look at the eigenvalues of the Hessian of a loss function before and after training.

Deep Learning

Disentangling factors of variation in deep representations using adversarial training

3 code implementations • 10 Nov 2016 • Michael Mathieu, Junbo Zhao, Pablo Sprechmann, Aditya Ramesh, Yann LeCun

During training, the only available source of supervision comes from our ability to distinguish among different observations belonging to the same class.

Disentanglement

Entropy-SGD: Biasing Gradient Descent Into Wide Valleys

2 code implementations • 6 Nov 2016 • Pratik Chaudhari, Anna Choromanska, Stefano Soatto, Yann LeCun, Carlo Baldassi, Christian Borgs, Jennifer Chayes, Levent Sagun, Riccardo Zecchina

This paper proposes a new optimization algorithm called Entropy-SGD for training deep neural networks that is motivated by the local geometry of the energy landscape.

Energy-based Generative Adversarial Network

3 code implementations • 11 Sep 2016 • Junbo Zhao, Michael Mathieu, Yann LeCun

We introduce the "Energy-based Generative Adversarial Network" model (EBGAN) which views the discriminator as an energy function that attributes low energies to the regions near the data manifold and higher energies to other regions.
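Concretely, with this energy view the training objectives take a hinge form: the discriminator minimizes D(x) + max(0, m − D(G(z))) for a margin m, and the generator minimizes D(G(z)); in EBGAN the energy D is implemented as an autoencoder's reconstruction error. A minimal sketch under those assumptions:

```python
import torch
import torch.nn.functional as F

def ebgan_losses(energy_fn, generator, x_real, z, margin: float = 10.0):
    """energy_fn maps a batch of images to per-sample energies (in EBGAN, an
    autoencoder's reconstruction error). Returns (discriminator loss, generator loss)."""
    x_fake = generator(z)
    d_loss = energy_fn(x_real).mean() + F.relu(margin - energy_fn(x_fake.detach())).mean()
    g_loss = energy_fn(x_fake).mean()   # the generator lowers the energy of its samples
    return d_loss, g_loss
```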

Generative Adversarial Network

Very Deep Convolutional Networks for Text Classification

23 code implementations • EACL 2017 • Alexis Conneau, Holger Schwenk, Loïc Barrault, Yann LeCun

The dominant approaches for many NLP tasks are recurrent neural networks, in particular LSTMs, and convolutional neural networks.

General Classification Text Classification

What is the Best Feature Learning Procedure in Hierarchical Recognition Architectures?

no code implementations • 5 Jun 2016 • Kevin Jarrett, Koray Kavukcuoglu, Karol Gregor, Yann LeCun

We also introduce a new single phase supervised learning procedure that places an L1 penalty on the output state of each layer of the network.

Object Recognition Unsupervised Pre-training

Recurrent Orthogonal Networks and Long-Memory Tasks

1 code implementation • 22 Feb 2016 • Mikael Henaff, Arthur Szlam, Yann LeCun

Although RNNs have been shown to be powerful tools for processing sequential data, finding architectures or optimization strategies that allow them to model very long term dependencies is still an active area of research.

Universal halting times in optimization and machine learning

no code implementations • 19 Nov 2015 • Levent Sagun, Thomas Trogdon, Yann LeCun

Given an algorithm, which we take to be both the optimization routine and the form of the random landscape, the fluctuations of the halting time follow a distribution that, after centering and scaling, remains unchanged even when the distribution on the landscape is changed.

BIG-bench Machine Learning

Super-Resolution with Deep Convolutional Sufficient Statistics

1 code implementation • 18 Nov 2015 • Joan Bruna, Pablo Sprechmann, Yann LeCun

Inverse problems in image and audio, and super-resolution in particular, can be seen as high-dimensional structured prediction problems, where the goal is to characterize the conditional distribution of a high-resolution output given its low-resolution corrupted observation.

Bandwidth Extension Image Super-Resolution +1

Deep multi-scale video prediction beyond mean square error

5 code implementations • 17 Nov 2015 • Michael Mathieu, Camille Couprie, Yann LeCun

Learning to predict future images from a video sequence involves the construction of an internal representation that models the image evolution accurately, and therefore, to some degree, its content and dynamics.

Optical Flow Estimation Video Prediction

Binary embeddings with structured hashed projections

no code implementations • 16 Nov 2015 • Anna Choromanska, Krzysztof Choromanski, Mariusz Bojarski, Tony Jebara, Sanjiv Kumar, Yann LeCun

We prove several theoretical results showing that projections via various structured matrices followed by nonlinear mappings accurately preserve the angular distance between input high-dimensional vectors.

LEMMA

Universum Prescription: Regularization using Unlabeled Data

no code implementations • 11 Nov 2015 • Xiang Zhang, Yann LeCun

This paper shows that simply prescribing "none of the above" labels to unlabeled data has a beneficial regularization effect on supervised learning.

Image Classification

Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches

2 code implementations • 20 Oct 2015 • Jure Žbontar, Yann LeCun

We approach the problem by learning a similarity measure on small image patches using a convolutional neural network.
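As a toy illustration of learning a patch similarity with a convolutional network (the architecture and patch size below are placeholders, not the paper's), one can embed both patches with a small shared ConvNet and score them with a cosine similarity:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchSimilarity(nn.Module):
    """Toy siamese ConvNet scoring the similarity of two small grayscale patches."""

    def __init__(self, feat: int = 64):
        super().__init__()
        self.embed = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3), nn.ReLU(),
            nn.Conv2d(32, feat, kernel_size=3), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(feat),
        )

    def forward(self, left: torch.Tensor, right: torch.Tensor) -> torch.Tensor:
        f_l = F.normalize(self.embed(left), dim=1)
        f_r = F.normalize(self.embed(right), dim=1)
        return (f_l * f_r).sum(dim=1)    # cosine similarity for each patch pair
```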

Binary Classification Stereo Matching +1

Very Deep Multilingual Convolutional Neural Networks for LVCSR

no code implementations • 29 Sep 2015 • Tom Sercu, Christian Puhrsch, Brian Kingsbury, Yann LeCun

However, CNNs in LVCSR have not kept pace with recent advances in other domains where deeper neural networks provide superior performance.

speech-recognition Speech Recognition

Deep Convolutional Networks on Graph-Structured Data

3 code implementations • 16 Jun 2015 • Mikael Henaff, Joan Bruna, Yann LeCun

Deep Learning's recent successes have mostly relied on Convolutional Networks, which exploit fundamental statistical properties of images, sounds and video data: local stationarity and multi-scale compositional structure, which allow expressing long-range interactions in terms of shorter, localized interactions.

General Classification

Learning to Linearize Under Uncertainty

no code implementations • NeurIPS 2015 • Ross Goroshin, Michael Mathieu, Yann LeCun

Training deep feature hierarchies to solve supervised learning tasks has achieved state of the art performance on many problems in computer vision.

Stacked What-Where Auto-encoders

2 code implementations • 8 Jun 2015 • Junbo Zhao, Michael Mathieu, Ross Goroshin, Yann LeCun

The objective function includes reconstruction terms that induce the hidden states in the Deconvnet to be similar to those of the Convnet.

Decoder Semi-Supervised Image Classification

A mathematical motivation for complex-valued convolutional networks

no code implementations • 11 Mar 2015 • Joan Bruna, Soumith Chintala, Yann LeCun, Serkan Piantino, Arthur Szlam, Mark Tygert

Courtesy of the exact correspondence, the remarkably rich and rigorous body of mathematical analysis for wavelets applies directly to (complex-valued) convnets.

Text Understanding from Scratch

3 code implementations • 5 Feb 2015 • Xiang Zhang, Yann LeCun

This article demonstrates that we can apply deep learning to text understanding from character-level inputs all the way up to abstract text concepts, using temporal convolutional networks (ConvNets).

General Classification Sentiment Analysis

Fast Convolutional Nets With fbfft: A GPU Performance Evaluation

2 code implementations • 24 Dec 2014 • Nicolas Vasilache, Jeff Johnson, Michael Mathieu, Soumith Chintala, Serkan Piantino, Yann LeCun

We examine the performance profile of Convolutional Neural Network training on the current generation of NVIDIA Graphics Processing Units.

Audio Source Separation with Discriminative Scattering Networks

no code implementations • 22 Dec 2014 • Pablo Sprechmann, Joan Bruna, Yann LeCun

In this report we describe an ongoing line of research for solving single-channel source separation problems.

Audio Source Separation

Explorations on high dimensional landscapes

no code implementations • 20 Dec 2014 • Levent Sagun, V. Ugur Guney, Gerard Ben Arous, Yann LeCun

Finding minima of a real valued non-convex function over a high dimensional space is a major challenge in science.

Vocal Bursts Intensity Prediction

Deep learning with Elastic Averaging SGD

10 code implementations • NeurIPS 2015 • Sixin Zhang, Anna Choromanska, Yann LeCun

We empirically demonstrate that in the deep learning setting, due to the existence of many local optima, allowing more exploration can lead to the improved performance.

Deep Learning Image Classification +1

The Loss Surfaces of Multilayer Networks

1 code implementation • 30 Nov 2014 • Anna Choromanska, Mikael Henaff, Michael Mathieu, Gérard Ben Arous, Yann LeCun

We show that for large-size decoupled networks the lowest critical values of the random loss function form a layered structure and they are located in a well-defined band lower-bounded by the global minimum.

Differentially- and non-differentially-private random decision trees

no code implementations • 26 Oct 2014 • Mariusz Bojarski, Anna Choromanska, Krzysztof Choromanski, Yann LeCun

We consider supervised learning with random decision trees, where the tree construction is completely random.

MoDeep: A Deep Learning Framework Using Motion Features for Human Pose Estimation

no code implementations • 28 Sep 2014 • Arjun Jain, Jonathan Tompson, Yann LeCun, Christoph Bregler

In this work, we propose a novel and efficient method for articulated human pose estimation in videos using a convolutional network architecture, which incorporates both color and motion features.

2D Human Pose Estimation Pose Estimation

Fast Approximation of Rotations and Hessians matrices

no code implementations • 29 Apr 2014 • Michael Mathieu, Yann LeCun

A new method to represent and approximate rotation matrices is introduced.

Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation

no code implementations • NeurIPS 2014 • Emily Denton, Wojciech Zaremba, Joan Bruna, Yann LeCun, Rob Fergus

We present techniques for speeding up the test-time evaluation of large convolutional networks, designed for object recognition tasks.

Object Recognition

OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

4 code implementations • 21 Dec 2013 • Pierre Sermanet, David Eigen, Xiang Zhang, Michael Mathieu, Rob Fergus, Yann LeCun

This integrated framework is the winner of the localization task of the ImageNet Large Scale Visual Recognition Challenge 2013 (ILSVRC2013) and obtained very competitive results for the detection and classification tasks.

General Classification Image Classification +2

Spectral Networks and Locally Connected Networks on Graphs

4 code implementations • 21 Dec 2013 • Joan Bruna, Wojciech Zaremba, Arthur Szlam, Yann LeCun

Convolutional Neural Networks are extremely efficient architectures in image and audio recognition tasks, thanks to their ability to exploit the local translational invariance of signal classes over their domain.

Clustering Translation

Fast Training of Convolutional Networks through FFTs

no code implementations • 20 Dec 2013 • Michael Mathieu, Mikael Henaff, Yann LeCun

Convolutional networks are one of the most widely employed architectures in computer vision and machine learning.

Understanding Deep Architectures using a Recursive Convolutional Network

no code implementations • 6 Dec 2013 • David Eigen, Jason Rolfe, Rob Fergus, Yann LeCun

A key challenge in designing convolutional network models is sizing them appropriately.

Signal Recovery from Pooling Representations

no code implementations • 16 Nov 2013 • Joan Bruna, Arthur Szlam, Yann LeCun

In this work we compute lower Lipschitz bounds of $\ell_p$ pooling operators for $p=1, 2, \infty$ as well as $\ell_p$ pooling operators preceded by half-rectification layers.
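For reference, assuming the standard definition, the $\ell_p$ pooling of the values $x_i$ in a pooling region $\mathcal{N}_n$ is

$$y_n = \Big( \sum_{i \in \mathcal{N}_n} |x_i|^p \Big)^{1/p}, \qquad p \in \{1, 2, \infty\},$$

with $p=\infty$ corresponding to max pooling; the paper bounds how much such operators, possibly preceded by half-rectification, can contract distances.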

regression

Adaptive learning rates and parallelization for stochastic, sparse, non-smooth gradients

no code implementations • 16 Jan 2013 • Tom Schaul, Yann LeCun

Recent work has established an empirically successful framework for adapting learning rates for stochastic gradient descent (SGD).

No More Pesky Learning Rates

no code implementations • 6 Jun 2012 • Tom Schaul, Sixin Zhang, Yann LeCun

The performance of stochastic gradient descent (SGD) depends critically on how learning rates are tuned and decreased over time.
