Search Results for author: David Lopez-Paz

Found 47 papers, 25 papers with code

Better & Faster Large Language Models via Multi-token Prediction

no code implementations 30 Apr 2024 Fabian Gloeckle, Badr Youbi Idrissi, Baptiste Rozière, David Lopez-Paz, Gabriel Synnaeve

More specifically, at each position in the training corpus, we ask the model to predict the following n tokens using n independent output heads, operating on top of a shared model trunk.
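
As a rough illustration of the training targets described above, the sketch below builds, for each position in a token sequence, the n tokens that follow it, one per output head. The function name `multi_token_targets`, the toy corpus, and the choice to drop positions with fewer than n following tokens are assumptions made for this example, not details taken from the paper.

```python
def multi_token_targets(tokens, n):
    """For each position t, return the n tokens that follow it.

    Head k (0-based) at position t would be trained to predict tokens[t + 1 + k].
    Positions with fewer than n following tokens are dropped
    (an assumption of this sketch, not a detail from the paper).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    last = len(tokens) - n  # exclusive upper bound on usable positions
    return [tokens[t + 1 : t + 1 + n] for t in range(max(last, 0))]


# Hypothetical toy corpus, for illustration only.
corpus = ["the", "cat", "sat", "on", "the", "mat"]
targets = multi_token_targets(corpus, n=2)
```

With n = 1 this reduces to ordinary next-token targets; larger n gives each position one target per head.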

Unified Uncertainty Calibration

1 code implementation 2 Oct 2023 Kamalika Chaudhuri, David Lopez-Paz

To build robust, fair, and safe AI systems, we would like our classifiers to say "I don't know" when facing test examples that are difficult or fall outside of the training classes. The ubiquitous strategy to predict under uncertainty is the simplistic reject-or-classify rule: abstain from prediction if epistemic uncertainty is high, classify otherwise. Unfortunately, this recipe does not allow different sources of uncertainty to communicate with each other, produces miscalibrated predictions, and does not allow us to correct for misspecifications in our uncertainty estimates.
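
The reject-or-classify rule described above can be sketched in a few lines. This is a minimal illustration of the baseline the abstract criticizes, not the method the paper proposes; the function name, the scalar uncertainty score, and the threshold value are all hypothetical.

```python
def reject_or_classify(class_probs, epistemic_uncertainty, threshold):
    """Abstain (return None) when uncertainty is high, otherwise classify.

    `epistemic_uncertainty` and `threshold` are hypothetical scalars; how the
    uncertainty score is obtained is outside the scope of this sketch.
    """
    if epistemic_uncertainty > threshold:
        return None  # abstain: "I don't know"
    # Otherwise return the index of the most probable class.
    return max(range(len(class_probs)), key=class_probs.__getitem__)


confident = reject_or_classify([0.1, 0.7, 0.2], epistemic_uncertainty=0.2, threshold=0.5)
abstained = reject_or_classify([0.1, 0.7, 0.2], epistemic_uncertainty=0.9, threshold=0.5)
```

Note that the two signals never interact: the class probabilities are ignored when abstaining, and the uncertainty score is ignored when classifying.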

A Closer Look at In-Context Learning under Distribution Shifts

1 code implementation 26 May 2023 Kartik Ahuja, David Lopez-Paz

In-context learning, a capability that enables a model to learn from input examples on the fly without necessitating weight updates, is a defining characteristic of large language models.

In-Context Learning

Model Ratatouille: Recycling Diverse Models for Out-of-Distribution Generalization

1 code implementation 20 Dec 2022 Alexandre Ramé, Kartik Ahuja, Jianyu Zhang, Matthieu Cord, Léon Bottou, David Lopez-Paz

In this paper, we thus propose model ratatouille, a new strategy to recycle the multiple fine-tunings of the same foundation model on diverse auxiliary tasks.

Domain Generalization Out-of-Distribution Generalization

ImageNet-X: Understanding Model Mistakes with Factor of Variation Annotations

no code implementations 3 Nov 2022 Badr Youbi Idrissi, Diane Bouchacourt, Randall Balestriero, Ivan Evtimov, Caner Hazirbas, Nicolas Ballas, Pascal Vincent, Michal Drozdzal, David Lopez-Paz, Mark Ibrahim

Equipped with ImageNet-X, we investigate 2,200 current recognition models and study the types of mistakes as a function of model's (1) architecture, e.g., transformer vs. convolutional, (2) learning paradigm, e.g., supervised vs. self-supervised, and (3) training procedures, e.g., data augmentation.

Data Augmentation

Measuring and signing fairness as performance under multiple stakeholder distributions

no code implementations 20 Jul 2022 David Lopez-Paz, Diane Bouchacourt, Levent Sagun, Nicolas Usunier

By highlighting connections to the literature in domain generalization, we propose to measure fairness as the ability of the system to generalize under multiple stress tests -- distributions of examples with social relevance.

Domain Generalization Fairness

Why does Throwing Away Data Improve Worst-Group Error?

no code implementations 23 May 2022 Kamalika Chaudhuri, Kartik Ahuja, Martin Arjovsky, David Lopez-Paz

When facing data with imbalanced classes or groups, practitioners follow an intriguing strategy to achieve best results.

Fairness imbalanced classification +1

An Empirical Investigation of Domain Generalization with Empirical Risk Minimizers

no code implementations NeurIPS 2021 Ramakrishna Vedantam, David Lopez-Paz, David J. Schwab

Recent work demonstrates that deep neural networks trained using Empirical Risk Minimization (ERM) can generalize under distribution shift, outperforming specialized training algorithms for domain generalization.

Domain Generalization Out-of-Distribution Generalization

What classifiers know what they don't know?

no code implementations 29 Sep 2021 Mohamed Ishmael Belghazi, David Lopez-Paz

Adding new datasets, algorithms, measures, or metrics is a matter of a few lines of code, in the hope that UIMNET becomes a stepping stone towards realistic, rigorous, and reproducible research in uncertainty estimation.

Decision Making

What classifiers know what they don't?

1 code implementation 13 Jul 2021 Mohamed Ishmael Belghazi, David Lopez-Paz

Adding new datasets, algorithms, measures, or metrics is a matter of a few lines of code, in the hope that UIMNET becomes a stepping stone towards realistic, rigorous, and reproducible research in uncertainty estimation.

Decision Making

Linear unit-tests for invariance discovery

2 code implementations 22 Feb 2021 Benjamin Aubin, Agnieszka Słowik, Martin Arjovsky, Leon Bottou, David Lopez-Paz

There is an increasing interest in algorithms to learn invariant correlations across training environments.

Out-of-Distribution Generalization

Measuring causal influence with back-to-back regression: the linear case

no code implementations 25 Sep 2019 Jean-Remi King, Francois Charton, Maxime Oquab, David Lopez-Paz

Identifying causes from observations can be particularly challenging when i) potential factors are difficult to manipulate individually and ii) observations are complex and multi-dimensional.

Causal Identification regression

Invariant Risk Minimization

15 code implementations 5 Jul 2019 Martin Arjovsky, Léon Bottou, Ishaan Gulrajani, David Lopez-Paz

We introduce Invariant Risk Minimization (IRM), a learning paradigm to estimate invariant correlations across multiple training distributions.

Domain Generalization Image Classification +1

Adversarial Vulnerability of Neural Networks Increases with Input Dimension

no code implementations ICLR 2019 Carl-Johann Simon-Gabriel, Yann Ollivier, Léon Bottou, Bernhard Schölkopf, David Lopez-Paz

Over the past four years, neural networks have been proven vulnerable to adversarial images: targeted but imperceptible image perturbations lead to drastically different predictions.

Interpolation Consistency Training for Semi-Supervised Learning

4 code implementations 9 Mar 2019 Vikas Verma, Kenji Kawaguchi, Alex Lamb, Juho Kannala, Arno Solin, Yoshua Bengio, David Lopez-Paz

We introduce Interpolation Consistency Training (ICT), a simple and computation efficient algorithm for training Deep Neural Networks in the semi-supervised learning paradigm.

General Classification Semi-Supervised Image Classification

Learning about an exponential amount of conditional distributions

1 code implementation NeurIPS 2019 Mohamed Ishmael Belghazi, Maxime Oquab, Yann Lecun, David Lopez-Paz

We introduce the Neural Conditioner (NC), a self-supervised machine able to learn about all the conditional distributions of a random vector $X$.

General Classification

Single-Model Uncertainties for Deep Learning

1 code implementation NeurIPS 2019 Natasa Tagasovska, David Lopez-Paz

To estimate epistemic uncertainty, we propose Orthonormal Certificates (OCs), a collection of diverse non-constant functions that map all training samples to zero.

Prediction Intervals regression

Manifold Mixup: Better Representations by Interpolating Hidden States

12 code implementations ICLR 2019 Vikas Verma, Alex Lamb, Christopher Beckham, Amir Najafi, Ioannis Mitliagkas, Aaron Courville, David Lopez-Paz, Yoshua Bengio

Deep neural networks excel at learning the training data, but often provide incorrect and confident predictions when evaluated on slightly different test examples.

Image Classification

First-order Adversarial Vulnerability of Neural Networks and Input Dimension

1 code implementation ICLR 2019 Carl-Johann Simon-Gabriel, Yann Ollivier, Léon Bottou, Bernhard Schölkopf, David Lopez-Paz

Over the past few years, neural networks were proven vulnerable to adversarial images: targeted but imperceptible image perturbations lead to drastically different predictions.

Geometrical Insights for Implicit Generative Modeling

no code implementations 21 Dec 2017 Leon Bottou, Martin Arjovsky, David Lopez-Paz, Maxime Oquab

Learning algorithms for implicit generative models can optimize a variety of criteria that measure how the data distribution differs from the implicit model distribution, including the Wasserstein distance, the Energy distance, and the Maximum Mean Discrepancy criterion.

Causal Generative Neural Networks

1 code implementation ICLR 2018 Olivier Goudet, Diviyan Kalainathan, Philippe Caillou, Isabelle Guyon, David Lopez-Paz, Michèle Sebag

We present Causal Generative Neural Networks (CGNNs) to learn functional causal models from observational data.

Causal Discovery

mixup: Beyond Empirical Risk Minimization

70 code implementations ICLR 2018 Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, David Lopez-Paz

We also find that mixup reduces the memorization of corrupt labels, increases the robustness to adversarial examples, and stabilizes the training of generative adversarial networks.

Domain Generalization Memorization +2

Learning Functional Causal Models with Generative Neural Networks

2 code implementations 15 Sep 2017 Olivier Goudet, Diviyan Kalainathan, Philippe Caillou, Isabelle Guyon, David Lopez-Paz, Michèle Sebag

We introduce a new approach to functional causal modeling from observational data, called Causal Generative Neural Networks (CGNN).

Optimizing the Latent Space of Generative Networks

6 code implementations ICML 2018 Piotr Bojanowski, Armand Joulin, David Lopez-Paz, Arthur Szlam

Generative Adversarial Networks (GANs) have achieved remarkable results in the task of generating realistic natural images.

Gradient Episodic Memory for Continual Learning

5 code implementations NeurIPS 2017 David Lopez-Paz, Marc'Aurelio Ranzato

One major obstacle towards AI is the poor ability of models to solve new problems quicker, and without forgetting previously acquired knowledge.

Continual Learning Incremental Learning

Causal Discovery Using Proxy Variables

no code implementations 23 Feb 2017 Mateo Rojas-Carulla, Marco Baroni, David Lopez-Paz

In this paper, we develop a framework to estimate the cause-effect relation between two static entities $x$ and $y$: for instance, an art masterpiece $x$ and its fraudulent copy $y$.

Causal Discovery Relation

Patient-Driven Privacy Control through Generalized Distillation

no code implementations 26 Nov 2016 Z. Berkay Celik, David Lopez-Paz, Patrick McDaniel

In this paper, we present privacy distillation, a mechanism which allows patients to control the type and amount of information they wish to disclose to the healthcare providers for use in statistical models.

Revisiting Classifier Two-Sample Tests

1 code implementation 20 Oct 2016 David Lopez-Paz, Maxime Oquab

The goal of this paper is to establish the properties, performance, and uses of C2ST.

Causal Discovery Vocal Bursts Valence Prediction

From Dependence to Causation

no code implementations 12 Jul 2016 David Lopez-Paz

Second, we build on this framework to interpret the problem of causal inference as the task of distribution classification, yielding a family of novel causal inference algorithms.

BIG-bench Machine Learning Causal Inference +1

Discovering Causal Signals in Images

2 code implementations CVPR 2017 David Lopez-Paz, Robert Nishihara, Soumith Chintala, Bernhard Schölkopf, Léon Bottou

Our experiments demonstrate the existence of a relation between the direction of causality and the difference between objects and their contexts, and by the same token, the existence of observable signals that reveal the causal dispositions of objects.

Causal Discovery

Minimax Lower Bounds for Realizable Transductive Classification

no code implementations 9 Feb 2016 Ilya Tolstikhin, David Lopez-Paz

Transductive learning considers a training set of $m$ labeled samples and a test set of $u$ unlabeled samples, with the goal of best labeling that particular test set.

Binary Classification Classification +2

Unifying distillation and privileged information

1 code implementation 11 Nov 2015 David Lopez-Paz, Léon Bottou, Bernhard Schölkopf, Vladimir Vapnik

Distillation (Hinton et al., 2015) and privileged information (Vapnik & Izmailov, 2015) are two techniques that enable machines to learn from other machines.

No Regret Bound for Extreme Bandits

no code implementations 12 Aug 2015 Robert Nishihara, David Lopez-Paz, Léon Bottou

This work is naturally framed in the extreme bandit setting, which deals with sequentially choosing which distribution from a collection to sample in order to minimize (maximize) the single best cost (reward).

Hyperparameter Optimization

Non-linear Causal Inference using Gaussianity Measures

no code implementations 16 Sep 2014 Daniel Hernández-Lobato, Pablo Morales-Mombiela, David Lopez-Paz, Alberto Suárez

The problem of non-linear causal inference is addressed by performing an embedding in an expanded feature space, in which the relation between causes and effects can be assumed to be linear.

Causal Inference

The Randomized Causation Coefficient

no code implementations 15 Sep 2014 David Lopez-Paz, Krikamol Muandet, Benjamin Recht

We are interested in learning causal relationships between pairs of random variables, purely from observational data.

Causal Inference Feature Engineering

Randomized Nonlinear Component Analysis

no code implementations 1 Feb 2014 David Lopez-Paz, Suvrit Sra, Alex Smola, Zoubin Ghahramani, Bernhard Schölkopf

Although nonlinear variants of PCA and CCA have been proposed, these are computationally prohibitive in the large scale.

Clustering

The Randomized Dependence Coefficient

no code implementations NeurIPS 2013 David Lopez-Paz, Philipp Hennig, Bernhard Schölkopf

We introduce the Randomized Dependence Coefficient (RDC), a measure of non-linear dependence between random variables of arbitrary dimension based on the Hirschfeld-Gebelein-Rényi Maximum Correlation Coefficient.
