Search Results for author: Gael Varoquaux

Found 18 papers, 8 papers with code

Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost

1 code implementation • ACL 2022 • Lihu Chen, Gael Varoquaux, Fabian Suchanek

State-of-the-art NLP systems represent inputs with word embeddings, but these are brittle when faced with Out-of-Vocabulary (OOV) words. To address this issue, we follow the principle of mimick-like models to generate vectors for unseen words, by learning the behavior of pre-trained embeddings using only the surface form of words. We present a simple contrastive learning framework, LOVE, which extends the word representation of an existing pre-trained language model (such as BERT) and makes it robust to OOV with few additional parameters. Extensive evaluations demonstrate that our lightweight model achieves similar or even better performance than prior competitors, both on original datasets and on corrupted variants.

Contrastive Learning Language Modelling +1


Retrieve, Merge, Predict: Augmenting Tables with Data Lakes

1 code implementation • 9 Feb 2024 • Riccardo Cappuzzo, Gael Varoquaux, Aimee Coelho, Paolo Papotti

We present an in-depth analysis of data discovery in data lakes, focusing on table augmentation for given machine learning tasks.

Benchmarking


The Past, Present, and Future of the Brain Imaging Data Structure (BIDS)

no code implementations • 11 Sep 2023 • Russell A. Poldrack, Christopher J. Markiewicz, Stefan Appelhoff, Yoni K. Ashar, Tibor Auer, Sylvain Baillet, Shashank Bansal, Leandro Beltrachini, Christian G. Benar, Giacomo Bertazzoli, Suyash Bhogawar, Ross W. Blair, Marta Bortoletto, Mathieu Boudreau, Teon L. Brooks, Vince D. Calhoun, Filippo Maria Castelli, Patricia Clement, Alexander L Cohen, Julien Cohen-Adad, Sasha D'Ambrosio, Gilles de Hollander, María de la Iglesia-Vayá, Alejandro de la Vega, Arnaud Delorme, Orrin Devinsky, Dejan Draschkow, Eugene Paul Duff, Elizabeth Dupre, Eric Earl, Oscar Esteban, Franklin W. Feingold, Guillaume Flandin, Anthony Galassi, Giuseppe Gallitto, Melanie Ganz, Rémi Gau, James Gholam, Satrajit S. Ghosh, Alessio Giacomel, Ashley G Gillman, Padraig Gleeson, Alexandre Gramfort, Samuel Guay, Giacomo Guidali, Yaroslav O. Halchenko, Daniel A. Handwerker, Nell Hardcastle, Peer Herholz, Dora Hermes, Christopher J. Honey, Robert B. Innis, Horea-Ioan Ioanas, Andrew Jahn, Agah Karakuzu, David B. Keator, Gregory Kiar, Balint Kincses, Angela R. Laird, Jonathan C. Lau, Alberto Lazari, Jon Haitz Legarreta, Adam Li, Xiangrui Li, Bradley C. Love, Hanzhang Lu, Camille Maumet, Giacomo Mazzamuto, Steven L. Meisler, Mark Mikkelsen, Henk Mutsaerts, Thomas E. Nichols, Aki Nikolaidis, Gustav Nilsonne, Guiomar Niso, Martin Norgaard, Thomas W Okell, Robert Oostenveld, Eduard Ort, Patrick J. Park, Mateusz Pawlik, Cyril R. Pernet, Franco Pestilli, Jan Petr, Christophe Phillips, Jean-Baptiste Poline, Luca Pollonini, Pradeep Reddy Raamana, Petra Ritter, Gaia Rizzo, Kay A. Robbins, Alexander P. Rockhill, Christine Rogers, Ariel Rokem, Chris Rorden, Alexandre Routier, Jose Manuel Saborit-Torres, Taylor Salo, Michael Schirner, Robert E. Smith, Tamas Spisak, Julia Sprenger, Nicole C. Swann, Martin Szinte, Sylvain Takerkart, Bertrand Thirion, Adam G. Thomas, Sajjad Torabian, Gael Varoquaux, Bradley Voytek, Julius Welzel, Martin Wilson, Tal Yarkoni, Krzysztof J. Gorgolewski

The Brain Imaging Data Structure (BIDS) is a community-driven standard for the organization of data and metadata from a growing range of neuroscience modalities.


Why do tree-based models still outperform deep learning on typical tabular data?

1 code implementation • NeurIPS 2022 • Leo Grinsztajn, Edouard Oyallon, Gael Varoquaux

While deep learning has enabled tremendous progress on text and image datasets, its superiority on tabular data is not clear.

Benchmarking


What’s a good imputation to predict with missing values?

no code implementations • NeurIPS 2021 • Marine Le Morvan, Julie Josse, Erwan Scornet, Gael Varoquaux

In fact, we show that on perfectly imputed data the best regression function will generally be discontinuous, which makes it hard to learn.

Imputation regression


AI as statistical methods for imperfect theories

no code implementations • NeurIPS Workshop AI4Scien 2021 • Gael Varoquaux

Science has progressed by reasoning on what models could not predict because they were missing important ingredients.

BIG-bench Machine Learning


NeuMiss networks: differentiable programming for supervised learning with missing values

no code implementations • NeurIPS 2020 • Marine Le Morvan, Julie Josse, Thomas Moreau, Erwan Scornet, Gael Varoquaux

We provide an upper bound on the Bayes risk of NeuMiss networks, and show that they have good predictive accuracy with both a number of parameters and a computational complexity independent of the number of missing data patterns.

Imputation


Comparing distributions: $\ell_1$ geometry improves kernel two-sample testing

1 code implementation • NeurIPS 2019 • Meyer Scetbon, Gael Varoquaux

Here, we show that $L^p$ distances (with $p\geq 1$) between these distribution representatives give metrics on the space of distributions that are well-behaved to detect differences between distributions as they metrize the weak convergence.

Two-sample testing Vocal Bursts Valence Prediction


Manifold-regression to predict from MEG/EEG brain signals without source modeling

1 code implementation • NeurIPS 2019 • David Sabbagh, Pierre Ablin, Gael Varoquaux, Alexandre Gramfort, Denis A. Engemann

We show that Wasserstein and geometric distances allow perfect out-of-sample prediction on the generative models.

EEG regression +1


Computational and informatics advances for reproducible data analysis in neuroimaging

no code implementations • 24 Sep 2018 • Russell A. Poldrack, Krzysztof J. Gorgolewski, Gael Varoquaux

We argue that openness and transparency are critical for reproducibility, and we outline an ecosystem for open and transparent science that has emerged within the human neuroimaging community.


Feature Grouping as a Stochastic Regularizer for High-Dimensional Structured Data

1 code implementation • 31 Jul 2018 • Sergul Aydore, Bertrand Thirion, Gael Varoquaux

In many applications where collecting data is expensive, for example neuroscience or medical imaging, the sample size is typically small compared to the feature dimension.

Clustering Denoising +2


Stochastic Subsampling for Factorizing Huge Matrices

1 code implementation • 19 Jan 2017 • Arthur Mensch, Julien Mairal, Bertrand Thirion, Gael Varoquaux

We present a matrix-factorization algorithm that scales to input matrices with huge numbers of both rows and columns.

Dictionary Learning


Learning brain regions via large-scale online structured sparse dictionary learning

no code implementations • NeurIPS 2016 • Elvis Dohmatob, Arthur Mensch, Gael Varoquaux, Bertrand Thirion

We propose a multivariate online dictionary-learning method for obtaining decompositions of brain images with structured and sparse components (aka atoms).

Dictionary Learning


Semi-Supervised Factored Logistic Regression for High-Dimensional Neuroimaging Data

1 code implementation • NeurIPS 2015 • Danilo Bzdok, Michael Eickenberg, Olivier Grisel, Bertrand Thirion, Gael Varoquaux

Imaging neuroscience links human behavior to aspects of brain biology in ever-increasing datasets.

General Classification regression +1


Fast clustering for scalable statistical analysis on structured images

no code implementations • 16 Nov 2015 • Bertrand Thirion, Andrés Hoyos-Idrobo, Jonas Kahn, Gael Varoquaux

The use of brain images as markers for diseases or behavioral differences is challenged by small effect sizes and the ensuing lack of power, an issue that has incited researchers to rely more systematically on large cohorts.

Clustering Computational Efficiency +1


Region segmentation for sparse decompositions: better brain parcellations from rest fMRI

no code implementations • 12 Dec 2014 • Alexandre Abraham, Elvis Dohmatob, Bertrand Thirion, Dimitris Samaras, Gael Varoquaux

Functional Magnetic Resonance Images acquired during resting-state provide information about the functional organization of the brain through measuring correlations between brain areas.


Mapping paradigm ontologies to and from the brain

no code implementations • NeurIPS 2013 • Yannick Schwartz, Bertrand Thirion, Gael Varoquaux

Imaging neuroscience links brain activation maps to behavior and cognition via correlational studies.


Brain covariance selection: better individual functional connectivity models using population prior

no code implementations • NeurIPS 2010 • Gael Varoquaux, Alexandre Gramfort, Jean-Baptiste Poline, Bertrand Thirion

We describe subject-level brain functional connectivity structure as a multivariate Gaussian process and introduce a new strategy to estimate it from group data, by imposing a common structure on the graphical model in the population.

