Search Results for author: Simon Lacoste-Julien

Found 81 papers, 46 papers with code

DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification

no code implementations • NeurIPS 2008 • Simon Lacoste-Julien, Fei Sha, Michael I. Jordan

By using the transformed topic mixture proportions as a new representation of documents, we obtain a supervised dimensionality reduction algorithm that uncovers the latent structure in a document collection while preserving predictive power for the task of classification.

Classification General Classification +2

Gaussian Probabilities and Expectation Propagation

no code implementations • 29 Nov 2011 • John P. Cunningham, Philipp Hennig, Simon Lacoste-Julien

We consider these unexpected results empirically and theoretically, both for the problem of Gaussian probabilities and for EP more generally.

SiGMa: Simple Greedy Matching for Aligning Large Knowledge Bases

1 code implementation • 19 Jul 2012 • Simon Lacoste-Julien, Konstantina Palla, Alex Davies, Gjergji Kasneci, Thore Graepel, Zoubin Ghahramani

The Internet has enabled the creation of a growing number of large-scale knowledge bases in a variety of domains containing complementary information.

SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives

5 code implementations • NeurIPS 2014 • Aaron Defazio, Francis Bach, Simon Lacoste-Julien

In this work we introduce a new optimisation method called SAGA in the spirit of SAG, SDCA, MISO and SVRG, a set of recently proposed incremental gradient algorithms with fast linear convergence rates.

On Pairwise Costs for Network Flow Multi-Object Tracking

no code implementations • CVPR 2015 • Visesh Chari, Simon Lacoste-Julien, Ivan Laptev, Josef Sivic

Multi-object tracking has been recently approached with the min-cost network flow optimization techniques.

Multi-Object Tracking Object

Sequential Kernel Herding: Frank-Wolfe Optimization for Particle Filtering

no code implementations • 9 Jan 2015 • Simon Lacoste-Julien, Fredrik Lindsten, Francis Bach

Recently, the Frank-Wolfe optimization algorithm was suggested as a procedure to obtain adaptive quadrature rules for integrals of functions in a reproducing kernel Hilbert space (RKHS) with a potentially faster rate of convergence than Monte Carlo integration (and "kernel herding" was shown to be a special case of this procedure).

Position

Variance Reduced Stochastic Gradient Descent with Neighbors

no code implementations • NeurIPS 2015 • Thomas Hofmann, Aurelien Lucchi, Simon Lacoste-Julien, Brian McWilliams

As a side-product we provide a unified convergence analysis for a family of variance reduction algorithms, which we call memorization algorithms.

Memorization

Unsupervised Learning from Narrated Instruction Videos

no code implementations • CVPR 2016 • Jean-Baptiste Alayrac, Piotr Bojanowski, Nishant Agrawal, Josef Sivic, Ivan Laptev, Simon Lacoste-Julien

Third, we experimentally demonstrate that the proposed method can automatically discover, in an unsupervised manner, the main steps to achieve the task and locate the steps in the input videos.

Ranked #7 on Temporal Action Localization on CrossTask

Clustering

Rethinking LDA: moment matching for discrete ICA

no code implementations • NeurIPS 2015 • Anastasia Podosinnikova, Francis Bach, Simon Lacoste-Julien

We consider moment matching techniques for estimation in Latent Dirichlet Allocation (LDA).

Barrier Frank-Wolfe for Marginal Inference

1 code implementation • NeurIPS 2015 • Rahul G. Krishnan, Simon Lacoste-Julien, David Sontag

We introduce a globally-convergent algorithm for optimizing the tree-reweighted (TRW) variational objective over the marginal polytope.

Variational Inference

On the Global Linear Convergence of Frank-Wolfe Optimization Variants

1 code implementation • NeurIPS 2015 • Simon Lacoste-Julien, Martin Jaggi

In this paper, we highlight and clarify several variants of the Frank-Wolfe optimization algorithm that have been successfully applied in practice: away-steps FW, pairwise FW, fully-corrective FW and Wolfe's minimum norm point algorithm, and prove for the first time that they all enjoy global linear convergence, under a weaker condition than strong convexity of the objective.

Beyond CCA: Moment Matching for Multi-View Models

no code implementations • 29 Feb 2016 • Anastasia Podosinnikova, Francis Bach, Simon Lacoste-Julien

We introduce three novel semi-parametric extensions of probabilistic canonical correlation analysis with identifiability guarantees.

PAC-Bayesian Theory Meets Bayesian Inference

no code implementations • NeurIPS 2016 • Pascal Germain, Francis Bach, Alexandre Lacoste, Simon Lacoste-Julien

That is, for the negative log-likelihood loss function, we show that the minimization of PAC-Bayesian generalization risk bounds maximizes the Bayesian marginal likelihood.

Bayesian Inference regression

Minding the Gaps for Block Frank-Wolfe Optimization of Structured SVMs

no code implementations • 30 May 2016 • Anton Osokin, Jean-Baptiste Alayrac, Isabella Lukasewitz, Puneet K. Dokania, Simon Lacoste-Julien

In this paper, we propose several improvements on the block-coordinate Frank-Wolfe (BCFW) algorithm from Lacoste-Julien et al. (2013) recently used to optimize the structured support vector machine (SSVM) objective in the context of structured prediction, though it has wider applications.

Structured Prediction

ASAGA: Asynchronous Parallel SAGA

1 code implementation • 15 Jun 2016 • Rémi Leblond, Fabian Pedregosa, Simon Lacoste-Julien

We describe ASAGA, an asynchronous parallel version of the incremental gradient algorithm SAGA that enjoys fast linear convergence rates.

Convergence Rate of Frank-Wolfe for Non-Convex Objectives

no code implementations • 1 Jul 2016 • Simon Lacoste-Julien

We give a simple proof that the Frank-Wolfe algorithm obtains a stationary point at a rate of $O(1/\sqrt{t})$ on non-convex objectives with a Lipschitz continuous gradient.

Frank-Wolfe Algorithms for Saddle Point Problems

1 code implementation • 25 Oct 2016 • Gauthier Gidel, Tony Jebara, Simon Lacoste-Julien

We extend the Frank-Wolfe (FW) optimization algorithm to solve constrained smooth convex-concave saddle point (SP) problems.

Structured Prediction

Joint Discovery of Object States and Manipulation Actions

1 code implementation • ICCV 2017 • Jean-Baptiste Alayrac, Josef Sivic, Ivan Laptev, Simon Lacoste-Julien

We assume a consistent temporal order for the changes in object states and manipulation actions, and introduce new optimization techniques to learn model parameters without additional supervision.

Action Recognition Clustering +2

On Structured Prediction Theory with Calibrated Convex Surrogate Losses

1 code implementation • NeurIPS 2017 • Anton Osokin, Francis Bach, Simon Lacoste-Julien

We provide novel theoretical insights on structured prediction in the context of efficient convex surrogate loss minimization with consistency guarantees.

Structured Prediction

SEARNN: Training RNNs with Global-Local Losses

1 code implementation • ICLR 2018 • Rémi Leblond, Jean-Baptiste Alayrac, Anton Osokin, Simon Lacoste-Julien

We propose SEARNN, a novel training algorithm for recurrent neural networks (RNNs) inspired by the "learning to search" (L2S) approach to structured prediction.

Machine Translation Optical Character Recognition (OCR) +3

A Closer Look at Memorization in Deep Networks

2 code implementations • ICML 2017 • Devansh Arpit, Stanisław Jastrzębski, Nicolas Ballas, David Krueger, Emmanuel Bengio, Maxinder S. Kanwal, Tegan Maharaj, Asja Fischer, Aaron Courville, Yoshua Bengio, Simon Lacoste-Julien

We examine the role of memorization in deep learning, drawing connections to capacity, generalization, and adversarial robustness.

Adversarial Robustness Memorization

Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization

1 code implementation • NeurIPS 2017 • Fabian Pedregosa, Rémi Leblond, Simon Lacoste-Julien

Due to their simplicity and excellent performance, parallel asynchronous variants of stochastic gradient descent have become popular methods to solve a wide range of large-scale optimization problems on multi-core architectures.

Parametric Adversarial Divergences are Good Losses for Generative Modeling

no code implementations • ICLR 2018 • Gabriel Huang, Hugo Berard, Ahmed Touati, Gauthier Gidel, Pascal Vincent, Simon Lacoste-Julien

Parametric adversarial divergences, which are a generalization of the losses used to train generative adversarial networks (GANs), have often been described as being approximations of their nonparametric counterparts, such as the Jensen-Shannon divergence, which can be derived under the so-called optimal discriminator assumption.

Structured Prediction

Adaptive Stochastic Dual Coordinate Ascent for Conditional Random Fields

no code implementations • 22 Dec 2017 • Rémi Le Priol, Alexandre Piché, Simon Lacoste-Julien

In this paper, we adapt SDCA to train CRFs, and we enhance it with an adaptive non-uniform sampling strategy based on block duality gaps.

Binary Classification General Classification

Improved asynchronous parallel optimization analysis for stochastic incremental methods

no code implementations • 11 Jan 2018 • Rémi Leblond, Fabian Pedregosa, Simon Lacoste-Julien

Notably, we prove that ASAGA and KROMAGNON can obtain a theoretical linear speedup on multi-core systems even without sparsity assumptions.

A3T: Adversarially Augmented Adversarial Training

no code implementations • 12 Jan 2018 • Akram Erraqabi, Aristide Baratin, Yoshua Bengio, Simon Lacoste-Julien

Recent research showed that deep neural networks are highly sensitive to so-called adversarial perturbations, which are tiny perturbations of the input data purposely designed to fool a machine learning classifier.

Adversarial Robustness BIG-bench Machine Learning +1

A Variational Inequality Perspective on Generative Adversarial Networks

1 code implementation • ICLR 2019 • Gauthier Gidel, Hugo Berard, Gaëtan Vignoud, Pascal Vincent, Simon Lacoste-Julien

Generative adversarial networks (GANs) form a generative modeling approach known for producing appealing samples, but they are notably difficult to train.

Misconceptions

Frank-Wolfe Splitting via Augmented Lagrangian Method

no code implementations • 9 Apr 2018 • Gauthier Gidel, Fabian Pedregosa, Simon Lacoste-Julien

In this work, we develop and analyze the Frank-Wolfe Augmented Lagrangian (FW-AL) algorithm, a method for minimizing a smooth function over convex compact sets related by a "linear consistency" constraint that only requires access to a linear minimization oracle over the individual constraints.

Negative Momentum for Improved Game Dynamics

1 code implementation • 12 Jul 2018 • Gauthier Gidel, Reyhane Askari Hemmat, Mohammad Pezeshki, Rémi Le Priol, Gabriel Huang, Simon Lacoste-Julien, Ioannis Mitliagkas

Games generalize the single-objective optimization paradigm by introducing different objective functions for different players.

Predicting Tactical Solutions to Operational Planning Problems under Imperfect Information

no code implementations • 31 Jul 2018 • Eric Larsen, Sébastien Lachapelle, Yoshua Bengio, Emma Frejinger, Simon Lacoste-Julien, Andrea Lodi

We aim to predict at a high speed the expected TDOS associated with the second stage problem, conditionally on the first stage variables.

BIG-bench Machine Learning Stochastic Optimization

Scattering Networks for Hybrid Representation Learning

1 code implementation • 17 Sep 2018 • Edouard Oyallon, Sergey Zagoruyko, Gabriel Huang, Nikos Komodakis, Simon Lacoste-Julien, Matthew Blaschko, Eugene Belilovsky

In particular, by working in scattering space, we achieve competitive results both for supervised and unsupervised learning tasks, while making progress towards constructing more interpretable CNNs.

Representation Learning

A Modern Take on the Bias-Variance Tradeoff in Neural Networks

no code implementations • 19 Oct 2018 • Brady Neal, Sarthak Mittal, Aristide Baratin, Vinayak Tantia, Matthew Scicluna, Simon Lacoste-Julien, Ioannis Mitliagkas

The bias-variance tradeoff tells us that as model complexity increases, bias falls and variance increases, leading to a U-shaped test error curve.

Quantifying Learning Guarantees for Convex but Inconsistent Surrogates

no code implementations • NeurIPS 2018 • Kirill Struminsky, Simon Lacoste-Julien, Anton Osokin

We study consistency properties of machine learning methods based on minimizing convex surrogates.

General Classification Multi-class Classification

Predicting Tactical Solutions to Operational Planning Problems under Imperfect Information

no code implementations • 22 Jan 2019 • Eric Larsen, Sébastien Lachapelle, Yoshua Bengio, Emma Frejinger, Simon Lacoste-Julien, Andrea Lodi

We formulate the problem as a two-stage optimal prediction stochastic program whose solution we predict with a supervised machine learning algorithm.

BIG-bench Machine Learning Management

Are Few-Shot Learning Benchmarks too Simple ? Solving them without Task Supervision at Test-Time

1 code implementation • 22 Feb 2019 • Gabriel Huang, Hugo Larochelle, Simon Lacoste-Julien

We show that several popular few-shot learning benchmarks can be solved with varying degrees of success without using support set Labels at Test-time (LT).

Clustering Few-Shot Learning +1

Reducing Noise in GAN Training with Variance Reduced Extragradient

no code implementations • NeurIPS 2019 • Tatjana Chavdarova, Gauthier Gidel, François Fleuret, Simon Lacoste-Julien

We study the effect of the stochastic gradient noise on the training of generative adversarial networks (GANs) and show that it can prevent the convergence of standard game optimization methods, while the batch version converges.

Implicit Regularization of Discrete Gradient Dynamics in Linear Neural Networks

1 code implementation • NeurIPS 2019 • Gauthier Gidel, Francis Bach, Simon Lacoste-Julien

When optimizing over-parameterized models, such as deep neural networks, a large set of parameters can achieve zero training error.

Painless Stochastic Gradient: Interpolation, Line-Search, and Convergence Rates

1 code implementation • NeurIPS 2019 • Sharan Vaswani, Aaron Mishkin, Issam Laradji, Mark Schmidt, Gauthier Gidel, Simon Lacoste-Julien

To improve the proposed methods' practical performance, we give heuristics to use larger step-sizes and acceleration.

General Classification Multi-class Classification

Gradient-Based Neural DAG Learning

1 code implementation • ICLR 2020 • Sébastien Lachapelle, Philippe Brouillard, Tristan Deleu, Simon Lacoste-Julien

We propose a novel score-based approach to learning a directed acyclic graph (DAG) from observational data.

Causal Inference

A Closer Look at the Optimization Landscapes of Generative Adversarial Networks

1 code implementation • ICLR 2020 • Hugo Berard, Gauthier Gidel, Amjad Almahairi, Pascal Vincent, Simon Lacoste-Julien

Generative adversarial networks have been very successful in generative modeling; however, they remain relatively challenging to train compared to standard deep neural networks.

A Tight and Unified Analysis of Gradient-Based Methods for a Whole Spectrum of Games

no code implementations • 13 Jun 2019 • Waïss Azizian, Ioannis Mitliagkas, Simon Lacoste-Julien, Gauthier Gidel

We provide new analyses of the EG's local and global convergence properties and use it to get a tighter global convergence rate for OG and CO. Our analysis covers the whole range of settings between bilinear and strongly monotone games.

GAIT: A Geometric Approach to Information Theory

1 code implementation • 19 Jun 2019 • Jose Gallego, Ankit Vani, Max Schwarzer, Simon Lacoste-Julien

We advocate the use of a notion of entropy that reflects the relative abundances of the symbols in an alphabet, as well as the similarities between them.

Are Few-shot Learning Benchmarks Too Simple ?

no code implementations • 25 Sep 2019 • Gabriel Huang, Hugo Larochelle, Simon Lacoste-Julien

We argue that the widely used Omniglot and miniImageNet benchmarks are too simple because their class semantics do not vary across episodes, which defeats their intended purpose of evaluating few-shot classification methods.

Classification Few-Shot Learning

Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation

1 code implementation • 11 Oct 2019 • Si Yi Meng, Sharan Vaswani, Issam Laradji, Mark Schmidt, Simon Lacoste-Julien

Under this condition, we show that the regularized subsampled Newton method (R-SSN) achieves global linear convergence with an adaptive step-size and a constant batch-size.

Binary Classification Second-order methods

Accelerating Smooth Games by Manipulating Spectral Shapes

no code implementations • 2 Jan 2020 • Waïss Azizian, Damien Scieur, Ioannis Mitliagkas, Simon Lacoste-Julien, Gauthier Gidel

Using this perspective, we propose an optimal algorithm for bilinear games.

Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence

1 code implementation • 24 Feb 2020 • Nicolas Loizou, Sharan Vaswani, Issam Laradji, Simon Lacoste-Julien

Consequently, the proposed stochastic Polyak step-size (SPS) is an attractive choice for setting the learning rate for stochastic gradient descent (SGD).

An Analysis of the Adaptation Speed of Causal Models

1 code implementation • 18 May 2020 • Rémi Le Priol, Reza Babanezhad Harikandeh, Yoshua Bengio, Simon Lacoste-Julien

When the intervention is on the effect variable, we characterize the relative adaptation speed.

Meta-Learning Stochastic Optimization

Adaptive Gradient Methods Converge Faster with Over-Parameterization (but you should do a line-search)

1 code implementation • 11 Jun 2020 • Sharan Vaswani, Issam Laradji, Frederik Kunstner, Si Yi Meng, Mark Schmidt, Simon Lacoste-Julien

In this setting, we prove that AMSGrad with constant step-size and momentum converges to the minimizer at a faster $O(1/T)$ rate.

Binary Classification Multi-class Classification

To Each Optimizer a Norm, To Each Norm its Generalization

no code implementations • 11 Jun 2020 • Sharan Vaswani, Reza Babanezhad, Jose Gallego, Aaron Mishkin, Simon Lacoste-Julien, Nicolas Le Roux

For under-parameterized linear classification, we prove that for any linear classifier separating the data, there exists a family of quadratic norms ||.||_P such that the classifier's direction is the same as that of the maximum P-margin solution.

Classification General Classification

Adversarial Example Games

1 code implementation • NeurIPS 2020 • Avishek Joey Bose, Gauthier Gidel, Hugo Berard, Andre Cianflone, Pascal Vincent, Simon Lacoste-Julien, William L. Hamilton

We introduce Adversarial Example Games (AEG), a framework that models the crafting of adversarial examples as a min-max game between a generator of attacks and a classifier.

Differentiable Causal Discovery from Interventional Data

1 code implementation • NeurIPS 2020 • Philippe Brouillard, Sébastien Lachapelle, Alexandre Lacoste, Simon Lacoste-Julien, Alexandre Drouin

This work constitutes a new step in this direction by proposing a theoretically-grounded method based on neural networks that can leverage interventional data.

Causal Discovery

Stochastic Hamiltonian Gradient Methods for Smooth Games

no code implementations • ICML 2020 • Nicolas Loizou, Hugo Berard, Alexia Jolicoeur-Martineau, Pascal Vincent, Simon Lacoste-Julien, Ioannis Mitliagkas

The success of adversarial formulations in machine learning has brought renewed motivation for smooth games.

BIG-bench Machine Learning

Implicit Regularization via Neural Feature Alignment

1 code implementation • NeurIPS Workshop DL-IG 2020 • Aristide Baratin, Thomas George, César Laurent, R. Devon Hjelm, Guillaume Lajoie, Pascal Vincent, Simon Lacoste-Julien

We approach the problem of implicit regularization in deep learning from a geometrical viewpoint.

feature selection Model Compression

Flight-connection Prediction for Airline Crew Scheduling to Construct Initial Clusters for OR Optimizer

no code implementations • 26 Sep 2020 • Yassine Yaakoubi, François Soumis, Simon Lacoste-Julien

We present a case study of using machine learning classification algorithms to initialize a large-scale commercial solver (GENCOL) based on column generation in the context of the airline crew pairing problem, where small savings of as little as 1% translate to increasing annual revenue by dozens of millions of dollars in a large airline.

General Classification Imitation Learning +1

Adaptive Gradient Methods Converge Faster with Over-Parameterization (and you can do a line-search)

no code implementations • 28 Sep 2020 • Sharan Vaswani, Issam H. Laradji, Frederik Kunstner, Si Yi Meng, Mark Schmidt, Simon Lacoste-Julien

Under an interpolation assumption, we prove that AMSGrad with a constant step-size and momentum can converge to the minimizer at the faster $O(1/T)$ rate for smooth, convex functions.

Binary Classification

Machine Learning in Airline Crew Pairing to Construct Initial Clusters for Dynamic Constraint Aggregation

no code implementations • 30 Sep 2020 • Yassine Yaakoubi, François Soumis, Simon Lacoste-Julien

The crew pairing problem (CPP) is generally modelled as a set partitioning problem where the flights have to be partitioned in pairings.

BIG-bench Machine Learning

Geometry-Aware Universal Mirror-Prox

no code implementations • 23 Nov 2020 • Reza Babanezhad, Simon Lacoste-Julien

Mirror-prox (MP) is a well-known algorithm to solve variational inequality (VI) problems.

On the Convergence of Continuous Constrained Optimization for Structure Learning

1 code implementation • 23 Nov 2020 • Ignavier Ng, Sébastien Lachapelle, Nan Rosemary Ke, Simon Lacoste-Julien, Kun Zhang

Recently, structure learning of directed acyclic graphs (DAGs) has been formulated as a continuous optimization problem by leveraging an algebraic characterization of acyclicity.

SVRG Meets AdaGrad: Painless Variance Reduction

no code implementations • 18 Feb 2021 • Benjamin Dubois-Taine, Sharan Vaswani, Reza Babanezhad, Mark Schmidt, Simon Lacoste-Julien

Variance reduction (VR) methods for finite-sum minimization typically require the knowledge of problem-dependent constants that are often unknown and difficult to estimate.

Online Adversarial Attacks

1 code implementation • ICLR 2022 • Andjela Mladenovic, Avishek Joey Bose, Hugo Berard, William L. Hamilton, Simon Lacoste-Julien, Pascal Vincent, Gauthier Gidel

Adversarial attacks expose important vulnerabilities of deep learning models, yet little attention has been paid to settings where data arrives as a stream.

Adversarial Attack

Repurposing Pretrained Models for Robust Out-of-domain Few-Shot Learning

1 code implementation • ICLR 2021 • Namyeong Kwon, Hwidong Na, Gabriel Huang, Simon Lacoste-Julien

Model-agnostic meta-learning (MAML) is a popular method for few-shot learning but assumes that we have access to the meta-training set.

Few-Shot Learning

Structured Convolutional Kernel Networks for Airline Crew Scheduling

1 code implementation • 25 May 2021 • Yassine Yaakoubi, François Soumis, Simon Lacoste-Julien

Motivated by the needs from an airline crew scheduling application, we introduce structured convolutional kernel networks (Struct-CKN), which combine CKNs from Mairal et al. (2014) in a structured prediction framework that supports constraints on the outputs.

Scheduling Structured Prediction

Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity

1 code implementation • NeurIPS 2021 • Nicolas Loizou, Hugo Berard, Gauthier Gidel, Ioannis Mitliagkas, Simon Lacoste-Julien

Two of the most prominent algorithms for solving unconstrained smooth games are the classical stochastic gradient descent-ascent (SGDA) and the recently introduced stochastic consensus optimization (SCO) [Mescheder et al., 2017].

Disentanglement via Mechanism Sparsity Regularization: A New Principle for Nonlinear ICA

1 code implementation • 21 Jul 2021 • Sébastien Lachapelle, Pau Rodríguez López, Yash Sharma, Katie Everett, Rémi Le Priol, Alexandre Lacoste, Simon Lacoste-Julien

This work introduces a novel principle we call disentanglement via mechanism sparsity regularization, which can be applied when the latent factors of interest depend sparsely on past latent factors and/or observed auxiliary variables.

Disentanglement

A Survey of Self-Supervised and Few-Shot Object Detection

1 code implementation • 27 Oct 2021 • Gabriel Huang, Issam Laradji, David Vazquez, Simon Lacoste-Julien, Pau Rodriguez

Labeling data is often expensive and time-consuming, especially for tasks such as object detection and instance segmentation, which require dense labeling of the image.

Few-Shot Object Detection Instance Segmentation +3

Convergence Rates for the MAP of an Exponential Family and Stochastic Mirror Descent -- an Open Problem

no code implementations • 12 Nov 2021 • Rémi Le Priol, Frederik Kunstner, Damien Scieur, Simon Lacoste-Julien

We consider the problem of upper bounding the expected log-likelihood sub-optimality of the maximum likelihood estimate (MLE), or a conjugate maximum a posteriori (MAP) for an exponential family, in a non-asymptotic way.

Multiset-Equivariant Set Prediction with Approximate Implicit Differentiation

1 code implementation • ICLR 2022 • Yan Zhang, David W. Zhang, Simon Lacoste-Julien, Gertjan J. Burghouts, Cees G. M. Snoek

Most set prediction models in deep learning use set-equivariant operations, but they actually operate on multisets.

Property Prediction

Bayesian Structure Learning with Generative Flow Networks

1 code implementation • 28 Feb 2022 • Tristan Deleu, António Góis, Chris Emezue, Mansi Rankawat, Simon Lacoste-Julien, Stefan Bauer, Yoshua Bengio

In Bayesian structure learning, we are interested in inferring a distribution over the directed acyclic graph (DAG) structure of Bayesian networks, from data.

Variational Inference

Data-Efficient Structured Pruning via Submodular Optimization

1 code implementation • 9 Mar 2022 • Marwa El Halabi, Suraj Srinivas, Simon Lacoste-Julien

Structured pruning is an effective approach for compressing large pre-trained neural networks without significantly affecting their performance.

Partial Disentanglement via Mechanism Sparsity

no code implementations • 15 Jul 2022 • Sébastien Lachapelle, Simon Lacoste-Julien

In this work, we introduce a generalization of this theory which applies to any ground-truth graph and specifies qualitatively how disentangled the learned representation is expected to be, via a new equivalence relation over models we call consistency.

Disentanglement

Controlled Sparsity via Constrained Optimization or: How I Learned to Stop Tuning Penalties and Love Constraints

1 code implementation • 8 Aug 2022 • Jose Gallego-Posada, Juan Ramirez, Akram Erraqabi, Yoshua Bengio, Simon Lacoste-Julien

The performance of trained neural networks is robust to harsh levels of pruning.

Sparse Learning

Synergies between Disentanglement and Sparsity: Generalization and Identifiability in Multi-Task Learning

1 code implementation • 26 Nov 2022 • Sébastien Lachapelle, Tristan Deleu, Divyat Mahajan, Ioannis Mitliagkas, Yoshua Bengio, Simon Lacoste-Julien, Quentin Bertrand

Although disentangled representations are often said to be beneficial for downstream tasks, current empirical and theoretical understanding is limited.

Disentanglement Meta-Learning +1

CrossSplit: Mitigating Label Noise Memorization through Data Splitting

no code implementations • 3 Dec 2022 • JiHye Kim, Aristide Baratin, Yan Zhang, Simon Lacoste-Julien

We approach the problem of improving robustness of deep learning algorithms in the presence of label noise.

Memorization

Unlocking Slot Attention by Changing Optimal Transport Costs

1 code implementation • 30 Jan 2023 • Yan Zhang, David W. Zhang, Simon Lacoste-Julien, Gertjan J. Burghouts, Cees G. M. Snoek

Slot attention is a powerful method for object-centric modeling in images and videos.

Object

Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?

2 code implementations • 7 Mar 2023 • Boris Knyazev, Doha Hwang, Simon Lacoste-Julien

Pretraining a neural network on a large dataset is becoming a cornerstone in machine learning that is within the reach of only a few communities with large resources.

PopulAtion Parameter Averaging (PAPA)

1 code implementation • 6 Apr 2023 • Alexia Jolicoeur-Martineau, Emy Gervais, Kilian Fatras, Yan Zhang, Simon Lacoste-Julien

Based on this idea, we propose PopulAtion Parameter Averaging (PAPA): a method that combines the generality of ensembling with the efficiency of weight averaging.

On the Identifiability of Quantized Factors

1 code implementation • 28 Jun 2023 • Vitória Barin-Pacela, Kartik Ahuja, Simon Lacoste-Julien, Pascal Vincent

We introduce this novel form of identifiability, termed quantized factor identifiability, and provide a comprehensive proof of the recovery of the quantized factors.

Disentanglement Inductive Bias

Promoting Exploration in Memory-Augmented Adam using Critical Momenta

1 code implementation • 18 Jul 2023 • Pranshu Malviya, Gonçalo Mordido, Aristide Baratin, Reza Babanezhad Harikandeh, Jerry Huang, Simon Lacoste-Julien, Razvan Pascanu, Sarath Chandar

Adaptive gradient-based optimizers, particularly Adam, have left their mark in training large-scale deep learning models.

Image Classification Language Modelling

Balancing Act: Constraining Disparate Impact in Sparse Models

2 code implementations • 31 Oct 2023 • Meraj Hashemizadeh, Juan Ramirez, Rohan Sukumaran, Golnoosh Farnadi, Simon Lacoste-Julien, Jose Gallego-Posada

Model pruning is a popular approach to enable the deployment of large deep learning models on edge devices with restricted computational or storage capacities.

Weight-Sharing Regularization

1 code implementation • 6 Nov 2023 • Mehran Shakerinava, Motahareh Sohrabi, Siamak Ravanbakhsh, Simon Lacoste-Julien

Weight-sharing is ubiquitous in deep learning.

Nonparametric Partial Disentanglement via Mechanism Sparsity: Sparse Actions, Interventions and Sparse Temporal Dependencies

no code implementations • 10 Jan 2024 • Sébastien Lachapelle, Pau Rodríguez López, Yash Sharma, Katie Everett, Rémi Le Priol, Alexandre Lacoste, Simon Lacoste-Julien

We develop a nonparametric identifiability theory that formalizes this principle and shows that the latent factors can be recovered by regularizing the learned causal graph to be sparse.

Disentanglement
