Search Results for author: Pascal Vincent

Found 64 papers, 36 papers with code

Compositional Risk Minimization

no code implementations • 8 Oct 2024 • Divyat Mahajan, Mohammad Pezeshki, Ioannis Mitliagkas, Kartik Ahuja, Pascal Vincent

In this work, we tackle a challenging and extreme form of subpopulation shift, which is termed compositional shift.

Attribute

WorldSense: A Synthetic Benchmark for Grounded Reasoning in Large Language Models

1 code implementation • 27 Nov 2023 • Youssef Benchekroun, Megi Dervishi, Mark Ibrahim, Jean-Baptiste Gaya, Xavier Martinet, Grégoire Mialon, Thomas Scialom, Emmanuel Dupoux, Dieuwke Hupkes, Pascal Vincent

We propose WorldSense, a benchmark designed to assess the extent to which LLMs are consistently able to sustain tacit world models, by testing how they draw simple inferences from descriptions of simple arrangements of entities.

In-Context Learning

Discovering environments with XRM

1 code implementation • 28 Sep 2023 • Mohammad Pezeshki, Diane Bouchacourt, Mark Ibrahim, Nicolas Ballas, Pascal Vincent, David Lopez-Paz

Environment annotations are essential for the success of many out-of-distribution (OOD) generalization methods.

Domain Generalization, Out-of-Distribution Generalization

PUG: Photorealistic and Semantically Controllable Synthetic Data for Representation Learning

1 code implementation • NeurIPS 2023 • Florian Bordes, Shashank Shekhar, Mark Ibrahim, Diane Bouchacourt, Pascal Vincent, Ari S. Morcos

Synthetic image datasets offer unmatched advantages for designing and evaluating deep neural networks: they make it possible to (i) render as many data samples as needed, (ii) precisely control each scene and yield granular ground-truth labels (and captions), and (iii) precisely control distribution shifts between training and testing to isolate variables of interest for sound experimentation.

Representation Learning

On the Identifiability of Quantized Factors

1 code implementation • 28 Jun 2023 • Vitória Barin-Pacela, Kartik Ahuja, Simon Lacoste-Julien, Pascal Vincent

We introduce this novel form of identifiability, termed quantized factor identifiability, and provide a comprehensive proof of the recovery of the quantized factors.

Disentanglement, Inductive Bias

Do SSL Models Have Déjà Vu? A Case of Unintended Memorization in Self-supervised Learning

1 code implementation • NeurIPS 2023 • Casey Meehan, Florian Bordes, Pascal Vincent, Kamalika Chaudhuri, Chuan Guo

Self-supervised learning (SSL) algorithms can produce useful image representations by learning to associate different parts of natural images with one another.

Memorization, Self-Supervised Learning

Objectives Matter: Understanding the Impact of Self-Supervised Objectives on Vision Transformer Representations

no code implementations • 25 Apr 2023 • Shashank Shekhar, Florian Bordes, Pascal Vincent, Ari Morcos

Here, we aim to explain these differences by analyzing the impact of these objectives on the structure and transferability of the learned representations.

Self-Supervised Learning, Specificity

Instance-Conditioned GAN Data Augmentation for Representation Learning

no code implementations • 16 Mar 2023 • Pietro Astolfi, Arantxa Casanova, Jakob Verbeek, Pascal Vincent, Adriana Romero-Soriano, Michal Drozdzal

We showcase the benefits of DA_IC-GAN by plugging it out-of-the-box into the supervised training of ResNet and DeiT models on the ImageNet dataset, achieving accuracy boosts of one to two percentage points with the highest-capacity models.

Data Augmentation, Few-Shot Learning, +1

Towards Democratizing Joint-Embedding Self-Supervised Learning

1 code implementation • 3 Mar 2023 • Florian Bordes, Randall Balestriero, Pascal Vincent

Joint Embedding Self-Supervised Learning (JE-SSL) has seen rapid developments in recent years, due to its promise to effectively leverage large amounts of unlabeled data.

Data Augmentation, Misconceptions, +1

ImageNet-X: Understanding Model Mistakes with Factor of Variation Annotations

no code implementations • 3 Nov 2022 • Badr Youbi Idrissi, Diane Bouchacourt, Randall Balestriero, Ivan Evtimov, Caner Hazirbas, Nicolas Ballas, Pascal Vincent, Michal Drozdzal, David Lopez-Paz, Mark Ibrahim

Equipped with ImageNet-X, we investigate 2,200 current recognition models and study the types of mistakes as a function of the model's (1) architecture, e.g. transformer vs. convolutional, (2) learning paradigm, e.g. supervised vs. self-supervised, and (3) training procedures, e.g. data augmentation.

Data Augmentation

Disentanglement of Correlated Factors via Hausdorff Factorized Support

1 code implementation • 13 Oct 2022 • Karsten Roth, Mark Ibrahim, Zeynep Akata, Pascal Vincent, Diane Bouchacourt

We show that the use of HFS consistently facilitates disentanglement and recovery of ground-truth factors across a variety of correlation settings and benchmarks, even under severe training correlations and correlation shifts, with relative improvements of over 60% against existing disentanglement methods in some settings.

Disentanglement

The Hidden Uniform Cluster Prior in Self-Supervised Learning

1 code implementation • 13 Oct 2022 • Mahmoud Assran, Randall Balestriero, Quentin Duval, Florian Bordes, Ishan Misra, Piotr Bojanowski, Pascal Vincent, Michael Rabbat, Nicolas Ballas

A successful paradigm in representation learning is to perform self-supervised pretraining using tasks based on mini-batch statistics (e.g., SimCLR, VICReg, SwAV, MSN).

Clustering, Representation Learning, +1

Guillotine Regularization: Why removing layers is needed to improve generalization in Self-Supervised Learning

no code implementations • 27 Jun 2022 • Florian Bordes, Randall Balestriero, Quentin Garrido, Adrien Bardes, Pascal Vincent

This is a little vexing, as one would hope that the network layer at which invariance is explicitly enforced by the SSL criterion during training (the last projector layer) should be the one to use for best generalization performance downstream.

Self-Supervised Learning, Transfer Learning
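
The finding above suggests probing the network at several depths rather than only at the projector output. A minimal sketch of that evaluation loop; the module names and layer sizes are hypothetical placeholders, not the paper's architecture:

```python
import torch
import torch.nn as nn

# Hypothetical SSL model: a backbone followed by a projector MLP on which
# the SSL criterion is applied during pretraining.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 2048), nn.ReLU())
projector = nn.Sequential(
    nn.Linear(2048, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 128),
)

def representation_at(x: torch.Tensor, layers_removed: int) -> torch.Tensor:
    """Features with the last `layers_removed` projector modules guillotined
    off: 0 gives the SSL-training representation, len(projector) gives the
    backbone output."""
    h = backbone(x)
    for module in list(projector)[: len(projector) - layers_removed]:
        h = module(h)
    return h

# Probe each depth (e.g., with a linear classifier) to find the best cut.
feats = representation_at(torch.randn(8, 3, 32, 32), layers_removed=5)
```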

Understanding Dimensional Collapse in Contrastive Self-supervised Learning

1 code implementation • ICLR 2022 • Li Jing, Pascal Vincent, Yann LeCun, Yuandong Tian

It has been shown that non-contrastive methods suffer from a lesser collapse problem of a different nature: dimensional collapse, whereby the embedding vectors end up spanning a lower-dimensional subspace instead of the entire available embedding space.

Contrastive Learning, Learning Theory, +2
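
The collapse diagnostic described above can be reproduced in a few lines: compute the singular value spectrum of the embedding covariance and look for directions that have shrunk to near zero. A sketch, assuming embeddings come from some pretrained encoder:

```python
import torch

def embedding_spectrum(z: torch.Tensor) -> torch.Tensor:
    """Log singular values of the covariance of embedding vectors (n x d).
    A long tail of very negative values means the embeddings span only a
    lower-dimensional subspace, i.e. dimensional collapse."""
    z = z - z.mean(dim=0, keepdim=True)   # center the embeddings
    cov = z.T @ z / (z.shape[0] - 1)      # d x d covariance matrix
    return torch.log(torch.linalg.svdvals(cov) + 1e-12)

spectrum = embedding_spectrum(torch.randn(10_000, 128))  # toy input
```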

Online Adversarial Attacks

1 code implementation • ICLR 2022 • Andjela Mladenovic, Avishek Joey Bose, Hugo Berard, William L. Hamilton, Simon Lacoste-Julien, Pascal Vincent, Gauthier Gidel

Adversarial attacks expose important vulnerabilities of deep learning models, yet little attention has been paid to settings where data arrives as a stream.

Adversarial Attack

Accounting for Variance in Machine Learning Benchmarks

no code implementations • 1 Mar 2021 • Xavier Bouthillier, Pierre Delaunay, Mirko Bronzi, Assya Trofimov, Brennan Nichyporuk, Justin Szeto, Naz Sepah, Edward Raff, Kanika Madan, Vikram Voleti, Samira Ebrahimi Kahou, Vincent Michalski, Dmitriy Serdyuk, Tal Arbel, Chris Pal, Gaël Varoquaux, Pascal Vincent

Strong empirical evidence that one machine-learning algorithm A outperforms another algorithm B ideally calls for multiple trials optimizing the learning pipeline over sources of variation such as data sampling, data augmentation, parameter initialization, and hyperparameter choices.

Benchmarking, BIG-bench Machine Learning, +1
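
In practice the recommendation boils down to: rerun the whole pipeline several times while randomizing every arbitrary choice, and compare distributions of scores rather than single numbers. A toy sketch; the `run_pipeline_*` callables are hypothetical stand-ins for full training-and-evaluation runs:

```python
import numpy as np

def compare(run_pipeline_a, run_pipeline_b, n_trials=20):
    """Each callable takes a seed controlling data sampling, augmentation,
    initialization, etc., and returns a test metric for one full run."""
    a = np.array([run_pipeline_a(seed) for seed in range(n_trials)])
    b = np.array([run_pipeline_b(seed) for seed in range(n_trials)])
    print(f"A: {a.mean():.3f} +/- {a.std(ddof=1):.3f}")
    print(f"B: {b.mean():.3f} +/- {b.std(ddof=1):.3f}")
    return a.mean() - b.mean()
```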

Efficient Learning in Non-Stationary Linear Markov Decision Processes

no code implementations • 24 Oct 2020 • Ahmed Touati, Pascal Vincent

We study episodic reinforcement learning in non-stationary linear (a.k.a. low-rank) Markov decision processes.

Sharp Analysis of Smoothed Bellman Error Embedding

no code implementations • 7 Jul 2020 • Ahmed Touati, Pascal Vincent

The Smoothed Bellman Error Embedding algorithm (Dai et al., 2018), known as SBEED, was proposed as a provably convergent reinforcement learning algorithm with general nonlinear function approximation.

reinforcement-learning, Reinforcement Learning, +1

Adversarial Example Games

1 code implementation • NeurIPS 2020 • Avishek Joey Bose, Gauthier Gidel, Hugo Berard, Andre Cianflone, Pascal Vincent, Simon Lacoste-Julien, William L. Hamilton

We introduce Adversarial Example Games (AEG), a framework that models the crafting of adversarial examples as a min-max game between a generator of attacks and a classifier.

Revisiting Loss Modelling for Unstructured Pruning

1 code implementation • 22 Jun 2020 • César Laurent, Camille Ballas, Thomas George, Nicolas Ballas, Pascal Vincent

By removing parameters from deep neural networks, unstructured pruning methods aim at cutting down memory footprint and computational cost, while maintaining prediction accuracy.

Do sequence-to-sequence VAEs learn global features of sentences?

no code implementations • EMNLP 2020 • Tom Bosc, Pascal Vincent

Using this method, we find that VAEs are prone to memorizing the first words and the sentence length, producing local features of limited usefulness.

Language Modelling, Memorization, +2

Stable Policy Optimization via Off-Policy Divergence Regularization

1 code implementation • 9 Mar 2020 • Ahmed Touati, Amy Zhang, Joelle Pineau, Pascal Vincent

Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO) are among the most successful policy gradient approaches in deep reinforcement learning (RL).

Reinforcement Learning, Reinforcement Learning (RL)
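
For reference, the clipped surrogate objective at the heart of PPO, which the paper above revisits from an off-policy divergence-regularization angle; this is the standard PPO loss, not the paper's proposed variant:

```python
import torch

def ppo_clip_loss(logp_new, logp_old, adv, eps=0.2):
    """Negative clipped surrogate: maximize
    E[min(r * A, clip(r, 1 - eps, 1 + eps) * A)] with r = pi_new / pi_old."""
    ratio = torch.exp(logp_new - logp_old)  # importance ratio r
    surrogate = torch.min(ratio * adv, torch.clamp(ratio, 1 - eps, 1 + eps) * adv)
    return -surrogate.mean()                # minimized by the optimizer
```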

A Closer Look at the Optimization Landscapes of Generative Adversarial Networks

1 code implementation • ICLR 2020 • Hugo Berard, Gauthier Gidel, Amjad Almahairi, Pascal Vincent, Simon Lacoste-Julien

Generative adversarial networks have been very successful in generative modeling; however, they remain relatively challenging to train compared to standard deep neural networks.

Stochastic Neural Network with Kronecker Flow

no code implementations • 10 Jun 2019 • Chin-wei Huang, Ahmed Touati, Pascal Vincent, Gintare Karolina Dziugaite, Alexandre Lacoste, Aaron Courville

Recent advances in variational inference enable the modelling of highly structured joint distributions, but are limited in their capacity to scale to the high-dimensional setting of stochastic neural networks.

Thompson Sampling, Variational Inference

SVRG for Policy Evaluation with Fewer Gradient Evaluations

1 code implementation • 9 Jun 2019 • Zilun Peng, Ahmed Touati, Pascal Vincent, Doina Precup

SVRG was later shown to work for policy evaluation, a problem in reinforcement learning in which one aims to estimate the value function of a given policy.

Reinforcement Learning, Reinforcement Learning (RL)
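
For context, plain SVRG alternates full-gradient snapshots with cheap variance-reduced inner steps; the paper adapts this scheme to policy evaluation with fewer gradient evaluations. A generic sketch of standard SVRG, not the paper's algorithm:

```python
import numpy as np

def svrg(grad_i, w0, n, lr=0.1, epochs=5, inner_steps=None,
         rng=np.random.default_rng(0)):
    """grad_i(w, i) returns the gradient of the i-th sample's loss at w."""
    w = w0.copy()
    inner_steps = inner_steps or n
    for _ in range(epochs):
        w_snap = w.copy()
        mu = np.mean([grad_i(w_snap, i) for i in range(n)], axis=0)  # full gradient
        for _ in range(inner_steps):
            i = rng.integers(n)
            w -= lr * (grad_i(w, i) - grad_i(w_snap, i) + mu)  # variance-reduced step
    return w
```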

Fast Approximate Natural Gradient Descent in a Kronecker-factored Eigenbasis

6 code implementations • NeurIPS 2018 (arXiv 11 Jun 2018) • Thomas George, César Laurent, Xavier Bouthillier, Nicolas Ballas, Pascal Vincent

Optimization algorithms that leverage gradient covariance information, such as variants of natural gradient descent (Amari, 1998), offer the prospect of yielding more effective descent directions.
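
The core trick: diagonalize the two Kronecker factors once, then precondition layer gradients cheaply in that eigenbasis. A simplified sketch of the preconditioning step under our reading of the method; full EKFAC additionally re-estimates per-coordinate second moments in the eigenbasis, which is omitted here:

```python
import torch

def kfe_precondition(grad_w, A, G, damping=1e-3):
    """grad_w: (d_out, d_in) layer gradient; A: (d_in, d_in) input
    second-moment matrix; G: (d_out, d_out) backprop second-moment matrix."""
    sA, UA = torch.linalg.eigh(A)                  # eigenbasis of each factor
    sG, UG = torch.linalg.eigh(G)
    g = UG.T @ grad_w @ UA                         # rotate into the eigenbasis
    g = g / (sG[:, None] * sA[None, :] + damping)  # diagonal rescaling
    return UG @ g @ UA.T                           # rotate back
```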

Randomized Value Functions via Multiplicative Normalizing Flows

1 code implementation • 6 Jun 2018 • Ahmed Touati, Harsh Satija, Joshua Romoff, Joelle Pineau, Pascal Vincent

In particular, we augment DQN and DDPG with multiplicative normalizing flows in order to track a rich approximate posterior distribution over the parameters of the value function.

Efficient Exploration, Thompson Sampling

A Variational Inequality Perspective on Generative Adversarial Networks

1 code implementation • ICLR 2019 • Gauthier Gidel, Hugo Berard, Gaëtan Vignoud, Pascal Vincent, Simon Lacoste-Julien

Generative adversarial networks (GANs) form a generative modeling approach known for producing appealing samples, but they are notably difficult to train.

Misconceptions
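
One concrete consequence of the variational-inequality view is the use of extragradient-style updates for GAN training. A sketch of one extragradient step for a two-player game, with `grad_g` and `grad_d` assumed to return each player's gradient of its own loss:

```python
def extragradient_step(theta_g, theta_d, grad_g, grad_d, lr=1e-3):
    """Extrapolate to a lookahead point, then update the original
    parameters with gradients evaluated at that lookahead point."""
    g_la = theta_g - lr * grad_g(theta_g, theta_d)  # lookahead (extrapolation)
    d_la = theta_d - lr * grad_d(theta_g, theta_d)
    theta_g = theta_g - lr * grad_g(g_la, d_la)     # update with lookahead grads
    theta_d = theta_d - lr * grad_d(g_la, d_la)
    return theta_g, theta_d
```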

Improving Landmark Localization with Semi-Supervised Learning

no code implementations • CVPR 2018 • Sina Honari, Pavlo Molchanov, Stephen Tyree, Pascal Vincent, Christopher Pal, Jan Kautz

First, we propose the framework of sequential multitasking and explore it here through an architecture for landmark localization where training with class labels acts as an auxiliary signal to guide the landmark localization on unlabeled data.

Face Alignment, Small Data Image Classification

Parametric Adversarial Divergences are Good Losses for Generative Modeling

no code implementations • ICLR 2018 • Gabriel Huang, Hugo Berard, Ahmed Touati, Gauthier Gidel, Pascal Vincent, Simon Lacoste-Julien

Parametric adversarial divergences, which are a generalization of the losses used to train generative adversarial networks (GANs), have often been described as being approximations of their nonparametric counterparts, such as the Jensen-Shannon divergence, which can be derived under the so-called optimal discriminator assumption.

Structured Prediction

Convergent Tree Backup and Retrace with Function Approximation

no code implementations • ICML 2018 • Ahmed Touati, Pierre-Luc Bacon, Doina Precup, Pascal Vincent

Off-policy learning is key to scaling up reinforcement learning, as it allows learning about a target policy from the experience generated by a different behavior policy.

Reinforcement Learning

Learning to Generate Samples from Noise through Infusion Training

1 code implementation • 20 Mar 2017 • Florian Bordes, Sina Honari, Pascal Vincent

In this work, we investigate a novel training procedure to learn a generative model as the transition operator of a Markov chain, such that, when applied repeatedly on an unstructured random noise sample, it will denoise it into a sample that matches the target distribution from the training set.

Denoising
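
Once the transition operator is trained, generation is just iterating it from noise. A sketch of the sampling loop described in the abstract; `transition_op` is the learned operator, assumed to map the current state of the chain to the next:

```python
import torch

@torch.no_grad()
def sample(transition_op, shape=(16, 784), n_steps=30):
    x = torch.randn(shape)        # unstructured random noise
    for _ in range(n_steps):
        x = transition_op(x)      # one step of the learned Markov chain
    return x                      # approximate sample from the target distribution
```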

A Cheap Linear Attention Mechanism with Fast Lookups and Fixed-Size Representations

no code implementations • 19 Sep 2016 • Alexandre de Brébisson, Pascal Vincent

These two limitations restrict the use of the softmax attention mechanism to relatively small-scale applications with short sequences and few lookups per sequence.

Question Answering
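
The generic trick behind fixed-size linear attention: when scores are linear in the keys, the whole sequence collapses into a single d_k x d_v matrix, so each lookup costs O(d_k * d_v) regardless of sequence length. A sketch of that general idea, not necessarily the paper's exact parameterization:

```python
import numpy as np

def linear_attention_summary(keys, values):
    """Summarize a sequence as C = sum_i k_i v_i^T (fixed size)."""
    return keys.T @ values        # (d_k, d_v) summary matrix

def lookup(query, C):
    """One attention lookup against the fixed-size summary."""
    return query @ C              # (d_v,)

C = linear_attention_summary(np.random.randn(1000, 64), np.random.randn(1000, 64))
out = lookup(np.random.randn(64), C)
```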

Exact gradient updates in time independent of output size for the spherical loss family

no code implementations • 26 Jun 2016 • Pascal Vincent, Alexandre de Brébisson, Xavier Bouthillier

An important class of problems involves training deep neural networks with sparse prediction targets of very high dimension D. These occur naturally in, e.g., neural language models or the learning of word embeddings, often posed as predicting the probability of the next word from a vocabulary of size D (e.g., 200,000).

Word Embeddings

Hierarchical Memory Networks

no code implementations • 24 May 2016 • Sarath Chandar, Sungjin Ahn, Hugo Larochelle, Pascal Vincent, Gerald Tesauro, Yoshua Bengio

In this paper, we explore a form of hierarchical memory network, which can be considered as a hybrid between hard and soft attention memory networks.

Hard Attention, Question Answering, +1

Theano: A Python framework for fast computation of mathematical expressions

1 code implementation • 9 May 2016 • The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Mélanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian Goodfellow, Matt Graham, Caglar Gulcehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrancois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert T. McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, Ying Zhang

Since its introduction, it has been one of the most widely used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements.

BIG-bench Machine Learning, Clustering, +2
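
A minimal example of the workflow Theano introduced: declare symbolic variables, build an expression graph, obtain gradients symbolically, and compile the graph into a callable:

```python
import theano
import theano.tensor as T

x = T.dvector("x")                  # symbolic input vector
y = (x ** 2).sum()                  # symbolic expression
gy = T.grad(y, x)                   # symbolic gradient via autodiff
f = theano.function([x], [y, gy])   # compile the graph for CPU/GPU

value, grad = f([1.0, 2.0, 3.0])    # value = 14.0, grad = [2., 4., 6.]
```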

The Z-loss: a shift and scale invariant classification loss belonging to the Spherical Family

1 code implementation • 29 Apr 2016 • Alexandre de Brébisson, Pascal Vincent

In this paper, we introduce an alternative classification loss function, the Z-loss, which is designed to address these two issues.

General Classification, Language Modelling

Recombinator Networks: Learning Coarse-to-Fine Feature Aggregation

1 code implementation • CVPR 2016 • Sina Honari, Jason Yosinski, Pascal Vincent, Christopher Pal

Deep neural networks with alternating convolutional, max-pooling and decimation layers are widely used in state-of-the-art architectures for computer vision.

Image Classification

An Exploration of Softmax Alternatives Belonging to the Spherical Loss Family

1 code implementation • 16 Nov 2015 • Alexandre de Brébisson, Pascal Vincent

In particular, we focus our investigation on spherical bounds of the log-softmax loss and on two spherical log-likelihood losses, namely the log-Spherical Softmax suggested by Vincent et al. (2015) and the log-Taylor Softmax that we introduce.

Language Modelling, Multi-class Classification
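
Our reading of the Taylor softmax mentioned above: replace exp(z) with its second-order Taylor expansion 1 + z + z^2/2, which is strictly positive (it equals ((z+1)^2 + 1)/2), so normalizing still yields a valid distribution. A sketch:

```python
import torch

def log_taylor_softmax(logits: torch.Tensor) -> torch.Tensor:
    t = 1.0 + logits + 0.5 * logits ** 2   # 2nd-order Taylor expansion of exp
    return torch.log(t) - torch.log(t.sum(dim=-1, keepdim=True))

logp = log_taylor_softmax(torch.randn(4, 10))
targets = torch.tensor([1, 0, 3, 7])
nll = -logp[torch.arange(4), targets].mean()  # log-Taylor-softmax loss
```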

Artificial Neural Networks Applied to Taxi Destination Prediction

1 code implementation • 31 Jul 2015 • Alexandre de Brébisson, Étienne Simon, Alex Auvolat, Pascal Vincent, Yoshua Bengio

We describe our first-place solution to the ECML/PKDD discovery challenge on taxi destination prediction.

Clustering is Efficient for Approximate Maximum Inner Product Search

no code implementations • 21 Jul 2015 • Alex Auvolat, Sarath Chandar, Pascal Vincent, Hugo Larochelle, Yoshua Bengio

Efficient Maximum Inner Product Search (MIPS) is an important task that has a wide applicability in recommendation systems and classification with a large number of classes.

Clustering, Recommendation Systems, +2
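
The general clustering-for-MIPS recipe the abstract points to: partition the database, then at query time scan only the most promising clusters and do exact inner-product search within them. A toy sketch of that recipe, not the paper's exact algorithm:

```python
import numpy as np

def build_index(db, k=64, iters=10, rng=np.random.default_rng(0)):
    """Toy k-means-style index over database vectors (n x d)."""
    centroids = db[rng.choice(len(db), size=k, replace=False)].copy()
    for _ in range(iters):
        assign = np.argmax(db @ centroids.T, axis=1)  # inner-product assignment
        for j in range(k):
            if np.any(assign == j):
                centroids[j] = db[assign == j].mean(axis=0)
    return centroids, assign

def approx_mips(query, db, centroids, assign, n_probe=4):
    probe = np.argsort(-(centroids @ query))[:n_probe]  # most promising clusters
    cand = np.where(np.isin(assign, probe))[0]          # candidate set
    return cand[np.argmax(db[cand] @ query)]            # exact search on candidates
```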

Dropout as data augmentation

no code implementations • 29 Jun 2015 • Xavier Bouthillier, Kishore Konda, Pascal Vincent, Roland Memisevic

Dropout is typically interpreted as bagging a large number of models sharing parameters.

Data Augmentation
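
The paper's alternative viewpoint in code: each dropout mask, applied in input space, yields a different "augmented" example, so training with dropout resembles training on noisy copies of the data. A sketch of that interpretation:

```python
import numpy as np

def dropout_augmented_batch(x, p=0.5, n_copies=4, rng=np.random.default_rng(0)):
    """Generate n_copies masked versions of input x (inverted-dropout scaling)."""
    masks = rng.random((n_copies,) + x.shape) >= p
    return masks * x / (1.0 - p)

augmented = dropout_augmented_batch(np.ones(8))  # 4 distinct "augmented" inputs
```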

GSNs : Generative Stochastic Networks

no code implementations • 18 Mar 2015 • Guillaume Alain, Yoshua Bengio, Li Yao, Jason Yosinski, Eric Thibodeau-Laufer, Saizheng Zhang, Pascal Vincent

We introduce a novel training principle for probabilistic models that is an alternative to maximum likelihood.

Denoising

Efficient Exact Gradient Update for training Deep Networks with Very Large Sparse Targets

1 code implementation • NeurIPS 2015 • Pascal Vincent, Alexandre de Brébisson, Xavier Bouthillier

An important class of problems involves training deep neural networks with sparse prediction targets of very high dimension D. These occur naturally in, e.g., neural language models or the learning of word embeddings, often posed as predicting the probability of the next word from a vocabulary of size D (e.g., 200,000).

Word Embeddings

Generalized Denoising Auto-Encoders as Generative Models

1 code implementation • NeurIPS 2013 • Yoshua Bengio, Li Yao, Guillaume Alain, Pascal Vincent

Recent work has shown how denoising and contractive autoencoders implicitly capture the structure of the data-generating density, in the case where the corruption noise is Gaussian, the reconstruction error is the squared error, and the data is continuous-valued.

Denoising
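
A minimal training step for the setting the abstract describes (Gaussian corruption, squared reconstruction error, continuous-valued data); the tiny architecture below is an illustrative placeholder:

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU())
decoder = nn.Linear(128, 784)
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()),
                       lr=1e-3)

def dae_step(x, sigma=0.3):
    x_tilde = x + sigma * torch.randn_like(x)             # Gaussian corruption
    loss = ((decoder(encoder(x_tilde)) - x) ** 2).mean()  # squared error to clean x
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

loss = dae_step(torch.rand(32, 784))  # one step on a toy batch
```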

Representation Learning: A Review and New Perspectives

6 code implementations • 24 Jun 2012 • Yoshua Bengio, Aaron Courville, Pascal Vincent

The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data.

Density Estimation, Representation Learning
