Search Results for author: Simon Lacoste-Julien

Found 81 papers, 46 papers with code

DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification

no code implementations NeurIPS 2008 Simon Lacoste-Julien, Fei Sha, Michael I. Jordan

By using the transformed topic mixture proportions as a new representation of documents, we obtain a supervised dimensionality reduction algorithm that uncovers the latent structure in a document collection while preserving predictive power for the task of classification.

Classification General Classification +2

Gaussian Probabilities and Expectation Propagation

no code implementations 29 Nov 2011 John P. Cunningham, Philipp Hennig, Simon Lacoste-Julien

We consider these unexpected results empirically and theoretically, both for the problem of Gaussian probabilities and for EP more generally.

SiGMa: Simple Greedy Matching for Aligning Large Knowledge Bases

1 code implementation 19 Jul 2012 Simon Lacoste-Julien, Konstantina Palla, Alex Davies, Gjergji Kasneci, Thore Graepel, Zoubin Ghahramani

The Internet has enabled the creation of a growing number of large-scale knowledge bases in a variety of domains containing complementary information.

SAGA: A Fast Incremental Gradient Method With Support for Non-Strongly Convex Composite Objectives

5 code implementations NeurIPS 2014 Aaron Defazio, Francis Bach, Simon Lacoste-Julien

In this work we introduce a new optimisation method called SAGA in the spirit of SAG, SDCA, MISO and SVRG, a set of recently proposed incremental gradient algorithms with fast linear convergence rates.

Sequential Kernel Herding: Frank-Wolfe Optimization for Particle Filtering

no code implementations 9 Jan 2015 Simon Lacoste-Julien, Fredrik Lindsten, Francis Bach

Recently, the Frank-Wolfe optimization algorithm was suggested as a procedure to obtain adaptive quadrature rules for integrals of functions in a reproducing kernel Hilbert space (RKHS) with a potentially faster rate of convergence than Monte Carlo integration (and "kernel herding" was shown to be a special case of this procedure).

Position

Variance Reduced Stochastic Gradient Descent with Neighbors

no code implementations NeurIPS 2015 Thomas Hofmann, Aurelien Lucchi, Simon Lacoste-Julien, Brian McWilliams

As a side-product we provide a unified convergence analysis for a family of variance reduction algorithms, which we call memorization algorithms.

Memorization

Unsupervised Learning from Narrated Instruction Videos

no code implementations CVPR 2016 Jean-Baptiste Alayrac, Piotr Bojanowski, Nishant Agrawal, Josef Sivic, Ivan Laptev, Simon Lacoste-Julien

Third, we experimentally demonstrate that the proposed method can automatically discover, in an unsupervised manner, the main steps to achieve the task and locate the steps in the input videos.

Clustering

Rethinking LDA: moment matching for discrete ICA

no code implementations NeurIPS 2015 Anastasia Podosinnikova, Francis Bach, Simon Lacoste-Julien

We consider moment matching techniques for estimation in Latent Dirichlet Allocation (LDA).

Barrier Frank-Wolfe for Marginal Inference

1 code implementation NeurIPS 2015 Rahul G. Krishnan, Simon Lacoste-Julien, David Sontag

We introduce a globally-convergent algorithm for optimizing the tree-reweighted (TRW) variational objective over the marginal polytope.

Variational Inference

On the Global Linear Convergence of Frank-Wolfe Optimization Variants

1 code implementation NeurIPS 2015 Simon Lacoste-Julien, Martin Jaggi

In this paper, we highlight and clarify several variants of the Frank-Wolfe optimization algorithm that have been successfully applied in practice: away-steps FW, pairwise FW, fully-corrective FW and Wolfe's minimum norm point algorithm, and prove for the first time that they all enjoy global linear convergence, under a weaker condition than strong convexity of the objective.

Beyond CCA: Moment Matching for Multi-View Models

no code implementations 29 Feb 2016 Anastasia Podosinnikova, Francis Bach, Simon Lacoste-Julien

We introduce three novel semi-parametric extensions of probabilistic canonical correlation analysis with identifiability guarantees.

PAC-Bayesian Theory Meets Bayesian Inference

no code implementations NeurIPS 2016 Pascal Germain, Francis Bach, Alexandre Lacoste, Simon Lacoste-Julien

That is, for the negative log-likelihood loss function, we show that the minimization of PAC-Bayesian generalization risk bounds maximizes the Bayesian marginal likelihood.

Bayesian Inference regression

Minding the Gaps for Block Frank-Wolfe Optimization of Structured SVMs

no code implementations 30 May 2016 Anton Osokin, Jean-Baptiste Alayrac, Isabella Lukasewitz, Puneet K. Dokania, Simon Lacoste-Julien

In this paper, we propose several improvements on the block-coordinate Frank-Wolfe (BCFW) algorithm from Lacoste-Julien et al. (2013) recently used to optimize the structured support vector machine (SSVM) objective in the context of structured prediction, though it has wider applications.

Structured Prediction

ASAGA: Asynchronous Parallel SAGA

1 code implementation 15 Jun 2016 Rémi Leblond, Fabian Pedregosa, Simon Lacoste-Julien

We describe ASAGA, an asynchronous parallel version of the incremental gradient algorithm SAGA that enjoys fast linear convergence rates.

Convergence Rate of Frank-Wolfe for Non-Convex Objectives

no code implementations 1 Jul 2016 Simon Lacoste-Julien

We give a simple proof that the Frank-Wolfe algorithm obtains a stationary point at a rate of $O(1/\sqrt{t})$ on non-convex objectives with a Lipschitz continuous gradient.

Frank-Wolfe Algorithms for Saddle Point Problems

1 code implementation 25 Oct 2016 Gauthier Gidel, Tony Jebara, Simon Lacoste-Julien

We extend the Frank-Wolfe (FW) optimization algorithm to solve constrained smooth convex-concave saddle point (SP) problems.

Structured Prediction

Joint Discovery of Object States and Manipulation Actions

1 code implementation ICCV 2017 Jean-Baptiste Alayrac, Josef Sivic, Ivan Laptev, Simon Lacoste-Julien

We assume a consistent temporal order for the changes in object states and manipulation actions, and introduce new optimization techniques to learn model parameters without additional supervision.

Action Recognition Clustering +2

On Structured Prediction Theory with Calibrated Convex Surrogate Losses

1 code implementation NeurIPS 2017 Anton Osokin, Francis Bach, Simon Lacoste-Julien

We provide novel theoretical insights on structured prediction in the context of efficient convex surrogate loss minimization with consistency guarantees.

Structured Prediction

SEARNN: Training RNNs with Global-Local Losses

1 code implementation ICLR 2018 Rémi Leblond, Jean-Baptiste Alayrac, Anton Osokin, Simon Lacoste-Julien

We propose SEARNN, a novel training algorithm for recurrent neural networks (RNNs) inspired by the "learning to search" (L2S) approach to structured prediction.

Machine Translation Optical Character Recognition (OCR) +3

Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization

1 code implementation NeurIPS 2017 Fabian Pedregosa, Rémi Leblond, Simon Lacoste-Julien

Due to their simplicity and excellent performance, parallel asynchronous variants of stochastic gradient descent have become popular methods to solve a wide range of large-scale optimization problems on multi-core architectures.

Parametric Adversarial Divergences are Good Losses for Generative Modeling

no code implementations ICLR 2018 Gabriel Huang, Hugo Berard, Ahmed Touati, Gauthier Gidel, Pascal Vincent, Simon Lacoste-Julien

Parametric adversarial divergences, which are a generalization of the losses used to train generative adversarial networks (GANs), have often been described as being approximations of their nonparametric counterparts, such as the Jensen-Shannon divergence, which can be derived under the so-called optimal discriminator assumption.

Structured Prediction

Adaptive Stochastic Dual Coordinate Ascent for Conditional Random Fields

no code implementations 22 Dec 2017 Rémi Le Priol, Alexandre Piché, Simon Lacoste-Julien

In this paper, we adapt SDCA to train CRFs, and we enhance it with an adaptive non-uniform sampling strategy based on block duality gaps.

Binary Classification General Classification

Improved asynchronous parallel optimization analysis for stochastic incremental methods

no code implementations 11 Jan 2018 Rémi Leblond, Fabian Pedregosa, Simon Lacoste-Julien

Notably, we prove that ASAGA and KROMAGNON can obtain a theoretical linear speedup on multi-core systems even without sparsity assumptions.

A3T: Adversarially Augmented Adversarial Training

no code implementations 12 Jan 2018 Akram Erraqabi, Aristide Baratin, Yoshua Bengio, Simon Lacoste-Julien

Recent research showed that deep neural networks are highly sensitive to so-called adversarial perturbations, which are tiny perturbations of the input data purposely designed to fool a machine learning classifier.

Adversarial Robustness BIG-bench Machine Learning +1

A Variational Inequality Perspective on Generative Adversarial Networks

1 code implementation ICLR 2019 Gauthier Gidel, Hugo Berard, Gaëtan Vignoud, Pascal Vincent, Simon Lacoste-Julien

Generative adversarial networks (GANs) form a generative modeling approach known for producing appealing samples, but they are notably difficult to train.

Misconceptions

Frank-Wolfe Splitting via Augmented Lagrangian Method

no code implementations 9 Apr 2018 Gauthier Gidel, Fabian Pedregosa, Simon Lacoste-Julien

In this work, we develop and analyze the Frank-Wolfe Augmented Lagrangian (FW-AL) algorithm, a method for minimizing a smooth function over convex compact sets related by a "linear consistency" constraint that only requires access to a linear minimization oracle over the individual constraints.

Negative Momentum for Improved Game Dynamics

1 code implementation 12 Jul 2018 Gauthier Gidel, Reyhane Askari Hemmat, Mohammad Pezeshki, Remi Lepriol, Gabriel Huang, Simon Lacoste-Julien, Ioannis Mitliagkas

Games generalize the single-objective optimization paradigm by introducing different objective functions for different players.

Scattering Networks for Hybrid Representation Learning

1 code implementation 17 Sep 2018 Edouard Oyallon, Sergey Zagoruyko, Gabriel Huang, Nikos Komodakis, Simon Lacoste-Julien, Matthew Blaschko, Eugene Belilovsky

In particular, by working in scattering space, we achieve competitive results both for supervised and unsupervised learning tasks, while making progress towards constructing more interpretable CNNs.

Representation Learning

A Modern Take on the Bias-Variance Tradeoff in Neural Networks

no code implementations 19 Oct 2018 Brady Neal, Sarthak Mittal, Aristide Baratin, Vinayak Tantia, Matthew Scicluna, Simon Lacoste-Julien, Ioannis Mitliagkas

The bias-variance tradeoff tells us that as model complexity increases, bias falls and variance increases, leading to a U-shaped test error curve.

Predicting Tactical Solutions to Operational Planning Problems under Imperfect Information

no code implementations 22 Jan 2019 Eric Larsen, Sébastien Lachapelle, Yoshua Bengio, Emma Frejinger, Simon Lacoste-Julien, Andrea Lodi

We formulate the problem as a two-stage optimal prediction stochastic program whose solution we predict with a supervised machine learning algorithm.

BIG-bench Machine Learning Management

Are Few-Shot Learning Benchmarks too Simple ? Solving them without Task Supervision at Test-Time

1 code implementation 22 Feb 2019 Gabriel Huang, Hugo Larochelle, Simon Lacoste-Julien

We show that several popular few-shot learning benchmarks can be solved with varying degrees of success without using support set Labels at Test-time (LT).

Clustering Few-Shot Learning +1

Reducing Noise in GAN Training with Variance Reduced Extragradient

no code implementations NeurIPS 2019 Tatjana Chavdarova, Gauthier Gidel, François Fleuret, Simon Lacoste-Julien

We study the effect of the stochastic gradient noise on the training of generative adversarial networks (GANs) and show that it can prevent the convergence of standard game optimization methods, while the batch version converges.

Implicit Regularization of Discrete Gradient Dynamics in Linear Neural Networks

1 code implementation NeurIPS 2019 Gauthier Gidel, Francis Bach, Simon Lacoste-Julien

When optimizing over-parameterized models, such as deep neural networks, a large set of parameters can achieve zero training error.

Gradient-Based Neural DAG Learning

1 code implementation ICLR 2020 Sébastien Lachapelle, Philippe Brouillard, Tristan Deleu, Simon Lacoste-Julien

We propose a novel score-based approach to learning a directed acyclic graph (DAG) from observational data.

Causal Inference

A Closer Look at the Optimization Landscapes of Generative Adversarial Networks

1 code implementation ICLR 2020 Hugo Berard, Gauthier Gidel, Amjad Almahairi, Pascal Vincent, Simon Lacoste-Julien

Generative adversarial networks have been very successful in generative modeling, however they remain relatively challenging to train compared to standard deep neural networks.

A Tight and Unified Analysis of Gradient-Based Methods for a Whole Spectrum of Games

no code implementations 13 Jun 2019 Waïss Azizian, Ioannis Mitliagkas, Simon Lacoste-Julien, Gauthier Gidel

We provide new analyses of the EG's local and global convergence properties and use it to get a tighter global convergence rate for OG and CO. Our analysis covers the whole range of settings between bilinear and strongly monotone games.

GAIT: A Geometric Approach to Information Theory

1 code implementation 19 Jun 2019 Jose Gallego, Ankit Vani, Max Schwarzer, Simon Lacoste-Julien

We advocate the use of a notion of entropy that reflects the relative abundances of the symbols in an alphabet, as well as the similarities between them.

Are Few-shot Learning Benchmarks Too Simple ?

no code implementations 25 Sep 2019 Gabriel Huang, Hugo Larochelle, Simon Lacoste-Julien

We argue that the widely used Omniglot and miniImageNet benchmarks are too simple because their class semantics do not vary across episodes, which defeats their intended purpose of evaluating few-shot classification methods.

Classification Few-Shot Learning

Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation

1 code implementation 11 Oct 2019 Si Yi Meng, Sharan Vaswani, Issam Laradji, Mark Schmidt, Simon Lacoste-Julien

Under this condition, we show that the regularized subsampled Newton method (R-SSN) achieves global linear convergence with an adaptive step-size and a constant batch-size.

Binary Classification Second-order methods

Stochastic Polyak Step-size for SGD: An Adaptive Learning Rate for Fast Convergence

1 code implementation 24 Feb 2020 Nicolas Loizou, Sharan Vaswani, Issam Laradji, Simon Lacoste-Julien

Consequently, the proposed stochastic Polyak step-size (SPS) is an attractive choice for setting the learning rate for stochastic gradient descent (SGD).

To Each Optimizer a Norm, To Each Norm its Generalization

no code implementations 11 Jun 2020 Sharan Vaswani, Reza Babanezhad, Jose Gallego, Aaron Mishkin, Simon Lacoste-Julien, Nicolas Le Roux

For under-parameterized linear classification, we prove that for any linear classifier separating the data, there exists a family of quadratic norms $\|\cdot\|_P$ such that the classifier's direction is the same as that of the maximum P-margin solution.

Classification General Classification

Adversarial Example Games

1 code implementation NeurIPS 2020 Avishek Joey Bose, Gauthier Gidel, Hugo Berard, Andre Cianflone, Pascal Vincent, Simon Lacoste-Julien, William L. Hamilton

We introduce Adversarial Example Games (AEG), a framework that models the crafting of adversarial examples as a min-max game between a generator of attacks and a classifier.

Differentiable Causal Discovery from Interventional Data

1 code implementation NeurIPS 2020 Philippe Brouillard, Sébastien Lachapelle, Alexandre Lacoste, Simon Lacoste-Julien, Alexandre Drouin

This work constitutes a new step in this direction by proposing a theoretically-grounded method based on neural networks that can leverage interventional data.

Causal Discovery

Flight-connection Prediction for Airline Crew Scheduling to Construct Initial Clusters for OR Optimizer

no code implementations 26 Sep 2020 Yassine Yaakoubi, François Soumis, Simon Lacoste-Julien

We present a case study of using machine learning classification algorithms to initialize a large-scale commercial solver (GENCOL) based on column generation in the context of the airline crew pairing problem, where small savings of as little as 1% translate to increasing annual revenue by dozens of millions of dollars in a large airline.

General Classification Imitation Learning +1

Adaptive Gradient Methods Converge Faster with Over-Parameterization (and you can do a line-search)

no code implementations 28 Sep 2020 Sharan Vaswani, Issam H. Laradji, Frederik Kunstner, Si Yi Meng, Mark Schmidt, Simon Lacoste-Julien

Under an interpolation assumption, we prove that AMSGrad with a constant step-size and momentum can converge to the minimizer at the faster $O(1/T)$ rate for smooth, convex functions.

Binary Classification

Machine Learning in Airline Crew Pairing to Construct Initial Clusters for Dynamic Constraint Aggregation

no code implementations 30 Sep 2020 Yassine Yaakoubi, François Soumis, Simon Lacoste-Julien

The crew pairing problem (CPP) is generally modelled as a set partitioning problem where the flights have to be partitioned in pairings.

BIG-bench Machine Learning

Geometry-Aware Universal Mirror-Prox

no code implementations 23 Nov 2020 Reza Babanezhad, Simon Lacoste-Julien

Mirror-prox (MP) is a well-known algorithm to solve variational inequality (VI) problems.

On the Convergence of Continuous Constrained Optimization for Structure Learning

1 code implementation 23 Nov 2020 Ignavier Ng, Sébastien Lachapelle, Nan Rosemary Ke, Simon Lacoste-Julien, Kun Zhang

Recently, structure learning of directed acyclic graphs (DAGs) has been formulated as a continuous optimization problem by leveraging an algebraic characterization of acyclicity.

SVRG Meets AdaGrad: Painless Variance Reduction

no code implementations 18 Feb 2021 Benjamin Dubois-Taine, Sharan Vaswani, Reza Babanezhad, Mark Schmidt, Simon Lacoste-Julien

Variance reduction (VR) methods for finite-sum minimization typically require the knowledge of problem-dependent constants that are often unknown and difficult to estimate.

Online Adversarial Attacks

1 code implementation ICLR 2022 Andjela Mladenovic, Avishek Joey Bose, Hugo Berard, William L. Hamilton, Simon Lacoste-Julien, Pascal Vincent, Gauthier Gidel

Adversarial attacks expose important vulnerabilities of deep learning models, yet little attention has been paid to settings where data arrives as a stream.

Adversarial Attack

Repurposing Pretrained Models for Robust Out-of-domain Few-Shot Learning

1 code implementation ICLR 2021 Namyeong Kwon, Hwidong Na, Gabriel Huang, Simon Lacoste-Julien

Model-agnostic meta-learning (MAML) is a popular method for few-shot learning but assumes that we have access to the meta-training set.

Few-Shot Learning

Structured Convolutional Kernel Networks for Airline Crew Scheduling

1 code implementation 25 May 2021 Yassine Yaakoubi, François Soumis, Simon Lacoste-Julien

Motivated by the needs from an airline crew scheduling application, we introduce structured convolutional kernel networks (Struct-CKN), which combine CKNs from Mairal et al. (2014) in a structured prediction framework that supports constraints on the outputs.

Scheduling Structured Prediction

Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity

1 code implementation NeurIPS 2021 Nicolas Loizou, Hugo Berard, Gauthier Gidel, Ioannis Mitliagkas, Simon Lacoste-Julien

Two of the most prominent algorithms for solving unconstrained smooth games are the classical stochastic gradient descent-ascent (SGDA) and the recently introduced stochastic consensus optimization (SCO) [Mescheder et al., 2017].

Disentanglement via Mechanism Sparsity Regularization: A New Principle for Nonlinear ICA

1 code implementation 21 Jul 2021 Sébastien Lachapelle, Pau Rodríguez López, Yash Sharma, Katie Everett, Rémi Le Priol, Alexandre Lacoste, Simon Lacoste-Julien

This work introduces a novel principle we call disentanglement via mechanism sparsity regularization, which can be applied when the latent factors of interest depend sparsely on past latent factors and/or observed auxiliary variables.

Disentanglement

A Survey of Self-Supervised and Few-Shot Object Detection

1 code implementation 27 Oct 2021 Gabriel Huang, Issam Laradji, David Vazquez, Simon Lacoste-Julien, Pau Rodriguez

Labeling data is often expensive and time-consuming, especially for tasks such as object detection and instance segmentation, which require dense labeling of the image.

Few-Shot Object Detection Instance Segmentation +3

Convergence Rates for the MAP of an Exponential Family and Stochastic Mirror Descent -- an Open Problem

no code implementations 12 Nov 2021 Rémi Le Priol, Frederik Kunstner, Damien Scieur, Simon Lacoste-Julien

We consider the problem of upper bounding the expected log-likelihood sub-optimality of the maximum likelihood estimate (MLE), or a conjugate maximum a posteriori (MAP) for an exponential family, in a non-asymptotic way.

Bayesian Structure Learning with Generative Flow Networks

1 code implementation 28 Feb 2022 Tristan Deleu, António Góis, Chris Emezue, Mansi Rankawat, Simon Lacoste-Julien, Stefan Bauer, Yoshua Bengio

In Bayesian structure learning, we are interested in inferring a distribution over the directed acyclic graph (DAG) structure of Bayesian networks, from data.

Variational Inference

Data-Efficient Structured Pruning via Submodular Optimization

1 code implementation 9 Mar 2022 Marwa El Halabi, Suraj Srinivas, Simon Lacoste-Julien

Structured pruning is an effective approach for compressing large pre-trained neural networks without significantly affecting their performance.

Partial Disentanglement via Mechanism Sparsity

no code implementations 15 Jul 2022 Sébastien Lachapelle, Simon Lacoste-Julien

In this work, we introduce a generalization of this theory which applies to any ground-truth graph and specifies qualitatively how disentangled the learned representation is expected to be, via a new equivalence relation over models we call consistency.

Disentanglement

CrossSplit: Mitigating Label Noise Memorization through Data Splitting

no code implementations 3 Dec 2022 JiHye Kim, Aristide Baratin, Yan Zhang, Simon Lacoste-Julien

We approach the problem of improving robustness of deep learning algorithms in the presence of label noise.

Memorization

Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models?

2 code implementations 7 Mar 2023 Boris Knyazev, Doha Hwang, Simon Lacoste-Julien

Pretraining a neural network on a large dataset is becoming a cornerstone in machine learning that is within the reach of only a few communities with large resources.

PopulAtion Parameter Averaging (PAPA)

1 code implementation 6 Apr 2023 Alexia Jolicoeur-Martineau, Emy Gervais, Kilian Fatras, Yan Zhang, Simon Lacoste-Julien

Based on this idea, we propose PopulAtion Parameter Averaging (PAPA): a method that combines the generality of ensembling with the efficiency of weight averaging.

On the Identifiability of Quantized Factors

1 code implementation 28 Jun 2023 Vitória Barin-Pacela, Kartik Ahuja, Simon Lacoste-Julien, Pascal Vincent

We introduce this novel form of identifiability, termed quantized factor identifiability, and provide a comprehensive proof of the recovery of the quantized factors.

Disentanglement Inductive Bias

Balancing Act: Constraining Disparate Impact in Sparse Models

2 code implementations 31 Oct 2023 Meraj Hashemizadeh, Juan Ramirez, Rohan Sukumaran, Golnoosh Farnadi, Simon Lacoste-Julien, Jose Gallego-Posada

Model pruning is a popular approach to enable the deployment of large deep learning models on edge devices with restricted computational or storage capacities.

Nonparametric Partial Disentanglement via Mechanism Sparsity: Sparse Actions, Interventions and Sparse Temporal Dependencies

no code implementations 10 Jan 2024 Sébastien Lachapelle, Pau Rodríguez López, Yash Sharma, Katie Everett, Rémi Le Priol, Alexandre Lacoste, Simon Lacoste-Julien

We develop a nonparametric identifiability theory that formalizes this principle and shows that the latent factors can be recovered by regularizing the learned causal graph to be sparse.

Disentanglement
