Search Results for author: Olivier Bousquet

Found 35 papers, 10 papers with code

The Tradeoffs of Large Scale Learning

no code implementations NeurIPS 2007 Léon Bottou, Olivier Bousquet

This contribution develops a theoretical framework that takes into account the effect of approximate optimization on learning algorithms.
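
The framework of this paper is usually summarized by a three-way decomposition of the excess error; the display below restates that decomposition in standard notation as a reference point (the grouping into approximation, estimation, and optimization terms follows the paper, but the exact symbols are chosen here for illustration):

    $\mathcal{E}(\tilde f_n) - \mathcal{E}(f^*) \;=\; \underbrace{\mathcal{E}(f^*_{\mathcal{F}}) - \mathcal{E}(f^*)}_{\text{approximation}} \;+\; \underbrace{\mathcal{E}(f_n) - \mathcal{E}(f^*_{\mathcal{F}})}_{\text{estimation}} \;+\; \underbrace{\mathcal{E}(\tilde f_n) - \mathcal{E}(f_n)}_{\text{optimization}}$

Here $f^*$ is the Bayes-optimal predictor, $f^*_{\mathcal{F}}$ the best predictor in the chosen class $\mathcal{F}$, $f_n$ the empirical risk minimizer, and $\tilde f_n$ the approximate solution actually returned by the optimizer; the tradeoff studied in the paper is how to budget data and computation across the last two terms.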

AdaGAN: Boosting Generative Models

1 code implementation NeurIPS 2017 Ilya Tolstikhin, Sylvain Gelly, Olivier Bousquet, Carl-Johann Simon-Gabriel, Bernhard Schölkopf

Generative Adversarial Networks (GAN) (Goodfellow et al., 2014) are an effective method for training generative models of complex data such as natural images.
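
As a rough, runnable illustration of the boosting idea in the title, the toy sketch below greedily grows a mixture of simple components on reweighted data. The Gaussian components, the fixed mixture weight beta, and the density-based reweighting rule are illustrative stand-ins, not the GAN training step or the weight updates derived in the paper.

    # Toy sketch of an AdaGAN-style loop: greedily add components to a mixture,
    # reweighting the data toward regions the current mixture covers poorly.
    import numpy as np

    def fit_component(x, w):
        """Weighted 1-D Gaussian fit: a stand-in for training a GAN on reweighted data."""
        mu = np.average(x, weights=w)
        sigma = np.sqrt(np.average((x - mu) ** 2, weights=w)) + 1e-6
        return mu, sigma

    def mixture_density(mixture, x):
        dens = np.zeros_like(x)
        for weight, (mu, sigma) in mixture:
            dens += weight * np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        return dens

    def adagan_like(x, num_steps=5, beta=0.3):
        w = np.full(len(x), 1.0 / len(x))
        mixture = []                      # list of (mixture weight, component) pairs
        for _ in range(num_steps):
            comp = fit_component(x, w)                       # new component on reweighted data
            mixture = [(mw * (1 - beta), c) for mw, c in mixture]
            mixture.append((beta if mixture else 1.0, comp))
            # Up-weight points to which the current mixture assigns low density.
            w = 1.0 / (mixture_density(mixture, x) + 1e-12)
            w /= w.sum()
        return mixture

    # Example: a bimodal sample that a single Gaussian component cannot cover.
    rng = np.random.default_rng(0)
    data = np.concatenate([rng.normal(-3, 0.5, 500), rng.normal(3, 0.5, 500)])
    for weight, (mu, sigma) in adagan_like(data):
        print(f"weight={weight:.2f}  mu={mu:+.2f}  sigma={sigma:.2f}")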

From optimal transport to generative modeling: the VEGAN cookbook

1 code implementation 22 May 2017 Olivier Bousquet, Sylvain Gelly, Ilya Tolstikhin, Carl-Johann Simon-Gabriel, Bernhard Schoelkopf

We study unsupervised generative modeling in terms of the optimal transport (OT) problem between true (but unknown) data distribution $P_X$ and the latent variable model distribution $P_G$.
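
For reference, the optimal transport cost the abstract refers to is the Kantorovich formulation (standard notation, not quoted from the paper):

    $W_c(P_X, P_G) \;=\; \inf_{\Gamma \in \mathcal{P}(X \sim P_X,\, Y \sim P_G)} \mathbb{E}_{(X, Y) \sim \Gamma}\big[c(X, Y)\big]$

where the infimum runs over all couplings $\Gamma$ with marginals $P_X$ and $P_G$ and $c$ is a cost function; taking $c$ to be the squared Euclidean distance yields the squared 2-Wasserstein distance.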

Approximation and Convergence Properties of Generative Adversarial Learning

no code implementations NeurIPS 2017 Shuang Liu, Olivier Bousquet, Kamalika Chaudhuri

In this paper, we address these questions in a broad and unified setting by defining a notion of adversarial divergences that includes a number of recently proposed objective functions.
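
As one familiar instance of the adversarial-divergence family considered in the paper, the original GAN objective of Goodfellow et al. (2014) can be read as a divergence between the data distribution $P$ and the model $Q$, obtained by maximizing over a discriminator class $\mathcal{D}$ (standard form, shown here only for orientation):

    $\tau(P \,\|\, Q) \;=\; \sup_{D \in \mathcal{D}} \; \mathbb{E}_{x \sim P}\big[\log D(x)\big] + \mathbb{E}_{x \sim Q}\big[\log\big(1 - D(x)\big)\big]$

When $\mathcal{D}$ contains all measurable functions into $(0, 1)$, this is, up to constants, the Jensen-Shannon divergence between $P$ and $Q$.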

Toward Optimal Run Racing: Application to Deep Learning Calibration

no code implementations 10 Jun 2017 Olivier Bousquet, Sylvain Gelly, Karol Kurach, Marc Schoenauer, Michele Sebag, Olivier Teytaud, Damien Vincent

This paper aims at one-shot learning of deep neural nets, where a highly parallel setting is considered to address the algorithm calibration problem: selecting the best neural architecture and learning hyper-parameter values depending on the dataset at hand.

One-Shot Learning, Two-sample testing

Wasserstein Auto-Encoders

13 code implementations ICLR 2018 Ilya Tolstikhin, Olivier Bousquet, Sylvain Gelly, Bernhard Schoelkopf

We propose the Wasserstein Auto-Encoder (WAE), a new algorithm for building a generative model of the data distribution.
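
The WAE objective trades a reconstruction cost against a penalty matching the aggregated posterior $Q_Z$ to the prior $P_Z$; up to notation it is commonly written as below, where the divergence $\mathcal{D}_Z$ is instantiated in the paper with either a GAN-based or an MMD-based penalty and $\lambda > 0$ is a regularization coefficient:

    $D_{\mathrm{WAE}}(P_X, P_G) \;=\; \inf_{Q(Z \mid X)} \; \mathbb{E}_{P_X}\, \mathbb{E}_{Q(Z \mid X)}\big[c\big(X, G(Z)\big)\big] \;+\; \lambda \cdot \mathcal{D}_Z(Q_Z, P_Z)$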

Online Hyper-Parameter Optimization

no code implementations ICLR 2018 Damien Vincent, Sylvain Gelly, Nicolas Le Roux, Olivier Bousquet

We propose an efficient online hyperparameter optimization method which uses a joint dynamical system to evaluate the gradient with respect to the hyperparameters.

Hyperparameter Optimization
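
The method itself evaluates hypergradients through a joint dynamical system; the toy snippet below only illustrates the broader idea of adapting a hyperparameter online from gradient information, using a simple hypergradient-descent-style rule on the learning rate for a quadratic objective. It is not the authors' algorithm, and every name in it is illustrative.

    # Generic hypergradient-style adaptation of a learning rate, shown on a toy
    # quadratic objective. This sketches the broad idea of differentiating through
    # the update rule; it is NOT the joint dynamical system proposed in the paper.
    import numpy as np

    def grad(w):                       # gradient of f(w) = 0.5 * ||w||^2
        return w

    w = np.array([5.0, -3.0])
    lr, hyper_lr = 0.01, 0.001
    prev_grad = np.zeros_like(w)

    for step in range(200):
        g = grad(w)
        # For plain SGD, d f(w_t) / d lr is approximately -g_t . g_{t-1},
        # so nudging lr along g_t . g_{t-1} adapts it online.
        lr += hyper_lr * float(g @ prev_grad)
        w -= lr * g
        prev_grad = g

    print("final w:", w, "adapted lr:", lr)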

Gradient Descent Quantizes ReLU Network Features

no code implementations 22 Mar 2018 Hartmut Maennel, Olivier Bousquet, Sylvain Gelly

Deep neural networks are often trained in the over-parametrized regime (i.e., with far more parameters than training examples), and understanding why the training converges to solutions that generalize remains an open problem.

Quantization

Assessing Generative Models via Precision and Recall

4 code implementations NeurIPS 2018 Mehdi S. M. Sajjadi, Olivier Bachem, Mario Lucic, Olivier Bousquet, Sylvain Gelly

Recent advances in generative modeling have led to an increased interest in the study of statistical divergences as means of model comparison.
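
Restated compactly, the paper's notion of precision and recall for distributions declares a pair $(\alpha, \beta)$ achievable for a model $Q$ with respect to a reference $P$ whenever both can be written over a shared component $\mu$; readers should check the exact statement in the paper:

    $P = \beta\,\mu + (1 - \beta)\,\nu_P, \qquad Q = \alpha\,\mu + (1 - \alpha)\,\nu_Q$

for some probability distributions $\mu, \nu_P, \nu_Q$; here $\alpha$ plays the role of precision (how much of $Q$ is covered by $P$) and $\beta$ the role of recall (how much of $P$ the model reproduces).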

Synthetic Data Generators: Sequential and Private

no code implementations 9 Feb 2019 Olivier Bousquet, Roi Livni, Shay Moran

We study the sample complexity of private synthetic data generation over an unbounded-size class of statistical queries, and show that any class that is privately proper PAC learnable admits a private synthetic data generator (perhaps non-efficient).

Synthetic Data Generation

The Optimal Approximation Factor in Density Estimation

no code implementations 10 Feb 2019 Olivier Bousquet, Daniel Kane, Shay Moran

We complement and extend this result by showing that: (i) the factor 3 cannot be improved if one restricts the algorithm to output a density from $\mathcal{Q}$, and (ii) if one allows the algorithm to output arbitrary densities (e.g., a mixture of densities from $\mathcal{Q}$), then the approximation factor can be reduced to 2, which is optimal.

Density Estimation

Precision-Recall Curves Using Information Divergence Frontiers

no code implementations 26 May 2019 Josip Djolonga, Mario Lucic, Marco Cuturi, Olivier Bachem, Olivier Bousquet, Sylvain Gelly

Despite the tremendous progress in the estimation of generative models, the development of tools for diagnosing their failures and assessing their performance has advanced at a much slower pace.

Image Generation, Information Retrieval +1

Practical and Consistent Estimation of f-Divergences

1 code implementation NeurIPS 2019 Paul K. Rubenstein, Olivier Bousquet, Josip Djolonga, Carlos Riquelme, Ilya Tolstikhin

The estimation of an f-divergence between two probability distributions based on samples is a fundamental problem in statistics and machine learning.

BIG-bench Machine Learning, Mutual Information Estimation +1
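
For reference, the quantity being estimated is the standard f-divergence: for a convex function $f$ with $f(1) = 0$,

    $D_f(P \,\|\, Q) \;=\; \int f\!\left(\frac{dP}{dQ}\right) dQ \;=\; \mathbb{E}_{x \sim Q}\!\left[f\!\left(\frac{p(x)}{q(x)}\right)\right]$

which recovers the KL divergence for $f(t) = t \log t$ and the total variation distance for $f(t) = \tfrac{1}{2}\lvert t - 1 \rvert$. (This is the textbook definition, not a formula quoted from the paper.)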

When can unlabeled data improve the learning rate?

no code implementations 28 May 2019 Christina Göpfert, Shai Ben-David, Olivier Bousquet, Sylvain Gelly, Ilya Tolstikhin, Ruth Urner

In semi-supervised classification, one is given access both to labeled and unlabeled data.

Google Research Football: A Novel Reinforcement Learning Environment

1 code implementation 25 Jul 2019 Karol Kurach, Anton Raichuk, Piotr Stańczyk, Michał Zając, Olivier Bachem, Lasse Espeholt, Carlos Riquelme, Damien Vincent, Marcin Michalski, Olivier Bousquet, Sylvain Gelly

Recent progress in the field of reinforcement learning has been accelerated by virtual learning environments such as video games, where novel algorithms and ideas can be quickly tested in a safe and reproducible manner.

Game of Football, reinforcement-learning +1
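
The environment ships as the open-source gfootball package; the snippet below sketches a typical random-agent loop on one of the small "academy" scenarios. Treat the exact keyword arguments (scenario and representation names) as assumptions to be checked against the repository's README, since they have changed across package versions.

    # Minimal random-agent loop for Google Research Football (gfootball).
    # Scenario and representation names are assumptions based on the project docs
    # and may differ between versions; check the repository before relying on them.
    import gfootball.env as football_env

    env = football_env.create_environment(
        env_name="academy_empty_goal_close",   # a small "academy" drill scenario
        representation="simple115",            # compact float observation vector
        render=False,
    )

    obs = env.reset()
    done, total_reward = False, 0.0
    while not done:
        action = env.action_space.sample()     # random policy, for illustration only
        obs, reward, done, info = env.step(action)
        total_reward += reward
    print("episode return:", total_reward)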

Sharper bounds for uniformly stable algorithms

no code implementations 17 Oct 2019 Olivier Bousquet, Yegor Klochkov, Nikita Zhivotovskiy

In a series of recent breakthrough papers by Feldman and Vondrak (2018, 2019), it was shown that the best known high-probability upper bounds for uniformly stable learning algorithms due to Bousquet and Elisseeff (2002) are sub-optimal in some natural regimes.

Generalization Bounds, Learning Theory

Fast classification rates without standard margin assumptions

no code implementations 28 Oct 2019 Olivier Bousquet, Nikita Zhivotovskiy

First, we consider classification with a reject option, namely Chow's reject option model, and show that by slightly lowering the impact of hard instances, a learning rate of order $O\left(\frac{d}{n}\log \frac{n}{d}\right)$ is always achievable in the agnostic setting by a specific learning algorithm.

Classification, General Classification +1

Measuring Compositional Generalization: A Comprehensive Method on Realistic Data

3 code implementations ICLR 2020 Daniel Keysers, Nathanael Schärli, Nathan Scales, Hylke Buisman, Daniel Furrer, Sergii Kashubin, Nikola Momchev, Danila Sinopalnikov, Lukasz Stafiniak, Tibor Tihon, Dmitry Tsarkov, Xiao Wang, Marc van Zee, Olivier Bousquet

We present a large and realistic natural language question answering dataset that is constructed according to this method, and we use it to analyze the compositional generalization ability of three machine learning architectures.

BIG-bench Machine Learning, Question Answering +1

Predicting Neural Network Accuracy from Weights

1 code implementation 26 Feb 2020 Thomas Unterthiner, Daniel Keysers, Sylvain Gelly, Olivier Bousquet, Ilya Tolstikhin

Furthermore, the predictors are able to rank networks trained on different, unobserved datasets and with different architectures.
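
A minimal sketch of the general recipe (summary statistics of the flattened weights fed to an off-the-shelf regressor) is given below; the statistics, the regressor, and the placeholder data are illustrative choices, not the exact feature set or model used in the paper.

    # Sketch: predict a network's test accuracy from simple statistics of its weights.
    # `weight_vectors` would hold one flattened weight vector per trained network and
    # `accuracies` the corresponding measured test accuracies (random placeholders here).
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.model_selection import train_test_split

    def weight_features(w):
        """A few distributional statistics of a flattened weight vector."""
        q = np.percentile(w, [0, 25, 50, 75, 100])
        return np.concatenate([[w.mean(), w.std(), np.abs(w).mean()], q])

    rng = np.random.default_rng(0)
    weight_vectors = [rng.normal(0, 0.1, size=10_000) for _ in range(200)]  # placeholder data
    accuracies = rng.uniform(0.1, 0.9, size=200)                            # placeholder targets

    X = np.stack([weight_features(w) for w in weight_vectors])
    X_tr, X_te, y_tr, y_te = train_test_split(X, accuracies, random_state=0)

    model = GradientBoostingRegressor().fit(X_tr, y_tr)
    print("R^2 on held-out networks:", model.score(X_te, y_te))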

Proper Learning, Helly Number, and an Optimal SVM Bound

no code implementations 24 May 2020 Olivier Bousquet, Steve Hanneke, Shay Moran, Nikita Zhivotovskiy

It has recently been shown by Hanneke (2016) that the optimal sample complexity of PAC learning for any VC class C is achieved by a particular improper learning algorithm, which outputs a specific majority-vote of hypotheses in C. This leaves open the question of when this bound can be achieved by proper learning algorithms, which are restricted to always output a hypothesis from C. In this paper we aim to characterize the classes for which the optimal sample complexity can be achieved by a proper learning algorithm.

PAC learning

What Do Neural Networks Learn When Trained With Random Labels?

no code implementations NeurIPS 2020 Hartmut Maennel, Ibrahim Alabdulmohsin, Ilya Tolstikhin, Robert J. N. Baldock, Olivier Bousquet, Sylvain Gelly, Daniel Keysers

We show how this alignment produces a positive transfer: networks pre-trained with random labels train faster downstream compared to training from scratch even after accounting for simple effects, such as weight scaling.

Memorization

Synthetic Data Generators -- Sequential and Private

no code implementations NeurIPS 2020 Olivier Bousquet, Roi Livni, Shay Moran

We study the sample complexity of private synthetic data generation over an unbounded-size class of statistical queries, and show that any class that is privately proper PAC learnable admits a private synthetic data generator (perhaps non-efficient).

Synthetic Data Generation

Statistically Near-Optimal Hypothesis Selection

no code implementations 17 Aug 2021 Olivier Bousquet, Mark Braverman, Klim Efremenko, Gillat Kol, Shay Moran

We derive an optimal $2$-approximation learning strategy for the Hypothesis Selection problem, outputting $q$ such that $\mathsf{TV}(p, q) \leq 2 \cdot \mathrm{opt} + \epsilon$, with a (nearly) optimal sample complexity of $\tilde O(\log n/\epsilon^2)$.

PAC learning

Monotone Learning

no code implementations 10 Feb 2022 Olivier Bousquet, Amit Daniely, Haim Kaplan, Yishay Mansour, Shay Moran, Uri Stemmer

Our transformation readily implies monotone learners in a variety of contexts: for example it extends Pestov's result to classification tasks with an arbitrary number of labels.

Binary Classification, Classification +1

Fine-Grained Distribution-Dependent Learning Curves

no code implementations 31 Aug 2022 Olivier Bousquet, Steve Hanneke, Shay Moran, Jonathan Shafer, Ilya Tolstikhin

We solve this problem in a principled manner, by introducing a combinatorial dimension called VCL that characterizes the best $d'$ for which $d'/n$ is a strong minimax lower bound.

Learning Theory, PAC learning

The Dynamics of Sharpness-Aware Minimization: Bouncing Across Ravines and Drifting Towards Wide Minima

no code implementations 4 Oct 2022 Peter L. Bartlett, Philip M. Long, Olivier Bousquet

We consider Sharpness-Aware Minimization (SAM), a gradient-based optimization method for deep networks that has exhibited performance improvements on image and language prediction problems.
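
For context, the SAM update analyzed in this line of work (introduced by Foret et al.) first perturbs the weights along the normalized gradient and then descends using the gradient at the perturbed point; up to variants it reads:

    $\hat\varepsilon(w_t) = \rho\,\frac{\nabla L(w_t)}{\lVert \nabla L(w_t) \rVert}, \qquad w_{t+1} = w_t - \eta\,\nabla L\big(w_t + \hat\varepsilon(w_t)\big)$

where $\rho$ is the perturbation radius and $\eta$ the step size.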
