no code implementations • 26 Sep 2024 • Evangelia Christodoulou, Annika Reinke, Rola Houhou, Piotr Kalinowski, Selen Erkan, Carole H. Sudre, Ninon Burgos, Sofiène Boutaj, Sophie Loizillon, Maëlys Solal, Nicola Rieke, Veronika Cheplygina, Michela Antonelli, Leon D. Mayer, Minu D. Tizabi, M. Jorge Cardoso, Amber Simpson, Paul F. Jäger, Annette Kopp-Schneider, Gaël Varoquaux, Olivier Colliot, Lena Maier-Hein
For more than 60% of papers, the mean performance of the second-ranked method was within the CI of the first-ranked method.
1 code implementation • 10 Sep 2024 • Lihu Chen, Gaël Varoquaux
Large Language Models (LLMs) have made significant progress in advancing artificial general intelligence (AGI), leading to the development of increasingly large models such as GPT-4 and LLaMA-405B.
no code implementations • 29 Jul 2024 • Marine Le Morvan, Gaël Varoquaux
Missing values are prevalent across various fields, posing challenges for training and deploying predictive models.
no code implementations • 20 Jun 2024 • Julie Alberge, Vincent Maladière, Olivier Grisel, Judith Abécassis, Gaël Varoquaux
When data are right-censored, i.e. some outcomes are missing due to a limited period of observation, survival analysis can compute the "time to event".
1 code implementation • 26 Feb 2024 • Myung Jun Kim, Léo Grinsztajn, Gaël Varoquaux
The architecture -- CARTE for Context Aware Representation of Table Entries -- uses a graph representation of tabular (or relational) data to process tables with different columns, string embedding of entries and column names to model an open vocabulary, and a graph-attentional network to contextualize entries with column names and neighboring entries.
no code implementations • 7 Feb 2024 • Lihu Chen, Alexandre Perez-Lebel, Fabian M. Suchanek, Gaël Varoquaux
In this work, we construct a new evaluation dataset derived from a knowledge base to assess confidence scores given to answers of Mistral and LLaMA.
1 code implementation • 18 Jan 2024 • Lihu Chen, Gaël Varoquaux, Fabian M. Suchanek
The framework employs phrase type classification as an auxiliary task and incorporates character-level information more effectively into the phrase representation.
no code implementations • 15 Dec 2023 • Léo Grinsztajn, Edouard Oyallon, Myung Jun Kim, Gaël Varoquaux
We study the benefits of language models in 14 analytical tasks on tables while varying the training size, as well as for a fuzzy join benchmark.
1 code implementation • 19 Oct 2023 • Lihu Chen, Gaël Varoquaux, Fabian M. Suchanek
Positional Encodings (PEs) are used to inject word-order information into transformer-based language models.
1 code implementation • 3 Aug 2023 • Matthieu Doutreligne, Tristan Struja, Judith Abécassis, Claire Morgand, Leo Anthony Celi, Gaël Varoquaux
We illustrate the various choices in studying the effect of albumin on sepsis mortality in the Medical Information Mart for Intensive Care database (MIMIC-IV).
no code implementations • 3 Feb 2023 • Annika Reinke, Minu D. Tizabi, Michael Baumgartner, Matthias Eisenmann, Doreen Heckmann-Nötzel, A. Emre Kavur, Tim Rädsch, Carole H. Sudre, Laura Acion, Michela Antonelli, Tal Arbel, Spyridon Bakas, Arriel Benis, Matthew Blaschko, Florian Buettner, M. Jorge Cardoso, Veronika Cheplygina, Jianxu Chen, Evangelia Christodoulou, Beth A. Cimini, Gary S. Collins, Keyvan Farahani, Luciana Ferrer, Adrian Galdran, Bram van Ginneken, Ben Glocker, Patrick Godau, Robert Haase, Daniel A. Hashimoto, Michael M. Hoffman, Merel Huisman, Fabian Isensee, Pierre Jannin, Charles E. Kahn, Dagmar Kainmueller, Bernhard Kainz, Alexandros Karargyris, Alan Karthikesalingam, Hannes Kenngott, Jens Kleesiek, Florian Kofler, Thijs Kooi, Annette Kopp-Schneider, Michal Kozubek, Anna Kreshuk, Tahsin Kurc, Bennett A. Landman, Geert Litjens, Amin Madani, Klaus Maier-Hein, Anne L. Martel, Peter Mattson, Erik Meijering, Bjoern Menze, Karel G. M. Moons, Henning Müller, Brennan Nichyporuk, Felix Nickel, Jens Petersen, Susanne M. Rafelski, Nasir Rajpoot, Mauricio Reyes, Michael A. Riegler, Nicola Rieke, Julio Saez-Rodriguez, Clara I. Sánchez, Shravya Shetty, Maarten van Smeden, Ronald M. Summers, Abdel A. Taha, Aleksei Tiulpin, Sotirios A. Tsaftaris, Ben van Calster, Gaël Varoquaux, Manuel Wiesenfarth, Ziv R. Yaniv, Paul F. Jäger, Lena Maier-Hein
Validation metrics are key for the reliable tracking of scientific progress and for bridging the current chasm between artificial intelligence (AI) research and its translation into practice.
1 code implementation • 3 Feb 2023 • Lihu Chen, Gaël Varoquaux, Fabian M. Suchanek
Acronym Disambiguation (AD) is crucial for natural language understanding on various sources, including biomedical reports, scientific papers, and search engine queries.
no code implementations • 1 Feb 2023 • Matthieu Doutreligne, Gaël Varoquaux
But does computing these nuisances add noise to model selection?
2 code implementations • 28 Oct 2022 • Alexandre Perez-Lebel, Marine Le Morvan, Gaël Varoquaux
Yet calibration is not enough: even a perfectly calibrated classifier with the best possible accuracy can have confidence scores that are far from the true posterior probabilities.
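A minimal numeric sketch of this claim, using a hypothetical toy population that is not taken from the paper: two subgroups have different true posteriors, but the classifier gives both the same score. The score is perfectly calibrated and the accuracy matches the Bayes-optimal rule, yet every individual score is off by 0.2.

```python
from fractions import Fraction

# Hypothetical toy population: two equally likely subgroups with different
# true posteriors P(y=1 | x) that the classifier cannot tell apart.
true_posterior = {"group_a": Fraction(11, 20), "group_b": Fraction(19, 20)}
group_weight = {"group_a": Fraction(1, 2), "group_b": Fraction(1, 2)}

# The classifier outputs the same confidence score for every sample.
score = Fraction(3, 4)

# Calibration: among samples given this score, the fraction of positives
# must equal the score itself.
positive_rate = sum(group_weight[g] * true_posterior[g] for g in true_posterior)
is_calibrated = positive_rate == score


def accuracy(predict_positive):
    """Expected accuracy of a rule that predicts class 1 where flagged."""
    return sum(
        group_weight[g]
        * (true_posterior[g] if predict_positive[g] else 1 - true_posterior[g])
        for g in true_posterior
    )


# Thresholding the constant score at 1/2 versus the Bayes-optimal rule,
# which thresholds the true posterior at 1/2.
classifier_accuracy = accuracy({g: score > Fraction(1, 2) for g in true_posterior})
bayes_accuracy = accuracy({g: p > Fraction(1, 2) for g, p in true_posterior.items()})

# Largest gap between the score and an individual's true posterior.
max_gap = max(abs(score - p) for p in true_posterior.values())
```

Exact fractions are used so that the calibration check is an equality rather than a floating-point comparison.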
2 code implementations • 18 Jul 2022 • Léo Grinsztajn, Edouard Oyallon, Gaël Varoquaux
While deep learning has enabled tremendous progress on text and image datasets, its superiority on tabular data is not clear.
1 code implementation • 3 Jun 2022 • Lena Maier-Hein, Annika Reinke, Patrick Godau, Minu D. Tizabi, Florian Buettner, Evangelia Christodoulou, Ben Glocker, Fabian Isensee, Jens Kleesiek, Michal Kozubek, Mauricio Reyes, Michael A. Riegler, Manuel Wiesenfarth, A. Emre Kavur, Carole H. Sudre, Michael Baumgartner, Matthias Eisenmann, Doreen Heckmann-Nötzel, Tim Rädsch, Laura Acion, Michela Antonelli, Tal Arbel, Spyridon Bakas, Arriel Benis, Matthew Blaschko, M. Jorge Cardoso, Veronika Cheplygina, Beth A. Cimini, Gary S. Collins, Keyvan Farahani, Luciana Ferrer, Adrian Galdran, Bram van Ginneken, Robert Haase, Daniel A. Hashimoto, Michael M. Hoffman, Merel Huisman, Pierre Jannin, Charles E. Kahn, Dagmar Kainmueller, Bernhard Kainz, Alexandros Karargyris, Alan Karthikesalingam, Hannes Kenngott, Florian Kofler, Annette Kopp-Schneider, Anna Kreshuk, Tahsin Kurc, Bennett A. Landman, Geert Litjens, Amin Madani, Klaus Maier-Hein, Anne L. Martel, Peter Mattson, Erik Meijering, Bjoern Menze, Karel G. M. Moons, Henning Müller, Brennan Nichyporuk, Felix Nickel, Jens Petersen, Nasir Rajpoot, Nicola Rieke, Julio Saez-Rodriguez, Clara I. Sánchez, Shravya Shetty, Maarten van Smeden, Ronald M. Summers, Abdel A. Taha, Aleksei Tiulpin, Sotirios A. Tsaftaris, Ben van Calster, Gaël Varoquaux, Paul F. Jäger
The framework was developed in a multi-stage Delphi process and is based on the novel concept of a problem fingerprint - a structured representation of the given problem that captures all aspects that are relevant for metric selection, from the domain interest to the properties of the target structure(s), data set and algorithm output.
1 code implementation • 15 Mar 2022 • Lihu Chen, Gaël Varoquaux, Fabian M. Suchanek
State-of-the-art NLP systems represent inputs with word embeddings, but these are brittle when faced with Out-of-Vocabulary (OOV) words.
1 code implementation • 17 Feb 2022 • Alexandre Perez-Lebel, Gaël Varoquaux, Marine Le Morvan, Julie Josse, Jean-Baptiste Poline
Using gradient-boosted trees, we compare native support for missing values with simple and state-of-the-art imputation prior to learning.
no code implementations • 12 Oct 2021 • Marc-Andre Schulz, Bertrand Thirion, Alexandre Gramfort, Gaël Varoquaux, Danilo Bzdok
High-quality data accumulation is now becoming ubiquitous in the health domain.
1 code implementation • 21 Jul 2021 • Jérôme Dockès, Gaël Varoquaux, Jean-Baptiste Poline
When a dataset shift occurs, standard machine-learning techniques do not suffice to extract and validate biomarkers.
1 code implementation • 1 Jun 2021 • Marine Le Morvan, Julie Josse, Erwan Scornet, Gaël Varoquaux
In fact, we show that on perfectly imputed data the best regression function will generally be discontinuous, which makes it hard to learn.
1 code implementation • 12 Apr 2021 • Annika Reinke, Minu D. Tizabi, Carole H. Sudre, Matthias Eisenmann, Tim Rädsch, Michael Baumgartner, Laura Acion, Michela Antonelli, Tal Arbel, Spyridon Bakas, Peter Bankhead, Arriel Benis, Matthew Blaschko, Florian Buettner, M. Jorge Cardoso, Jianxu Chen, Veronika Cheplygina, Evangelia Christodoulou, Beth Cimini, Gary S. Collins, Sandy Engelhardt, Keyvan Farahani, Luciana Ferrer, Adrian Galdran, Bram van Ginneken, Ben Glocker, Patrick Godau, Robert Haase, Fred Hamprecht, Daniel A. Hashimoto, Doreen Heckmann-Nötzel, Peter Hirsch, Michael M. Hoffman, Merel Huisman, Fabian Isensee, Pierre Jannin, Charles E. Kahn, Dagmar Kainmueller, Bernhard Kainz, Alexandros Karargyris, Alan Karthikesalingam, A. Emre Kavur, Hannes Kenngott, Jens Kleesiek, Andreas Kleppe, Sven Kohler, Florian Kofler, Annette Kopp-Schneider, Thijs Kooi, Michal Kozubek, Anna Kreshuk, Tahsin Kurc, Bennett A. Landman, Geert Litjens, Amin Madani, Klaus Maier-Hein, Anne L. Martel, Peter Mattson, Erik Meijering, Bjoern Menze, David Moher, Karel G. M. Moons, Henning Müller, Brennan Nichyporuk, Felix Nickel, M. Alican Noyan, Jens Petersen, Gorkem Polat, Susanne M. Rafelski, Nasir Rajpoot, Mauricio Reyes, Nicola Rieke, Michael Riegler, Hassan Rivaz, Julio Saez-Rodriguez, Clara I. Sánchez, Julien Schroeter, Anindo Saha, M. Alper Selver, Lalith Sharan, Shravya Shetty, Maarten van Smeden, Bram Stieltjes, Ronald M. Summers, Abdel A. Taha, Aleksei Tiulpin, Sotirios A. Tsaftaris, Ben van Calster, Gaël Varoquaux, Manuel Wiesenfarth, Ziv R. Yaniv, Paul Jäger, Lena Maier-Hein
While the importance of automatic image analysis is continuously increasing, recent meta-research revealed major flaws with respect to algorithm validation.
1 code implementation • 18 Mar 2021 • Gaël Varoquaux, Veronika Cheplygina
Finally, we provide a broad range of recommendations on how to further address these problems in the future.
no code implementations • 1 Mar 2021 • Xavier Bouthillier, Pierre Delaunay, Mirko Bronzi, Assya Trofimov, Brennan Nichyporuk, Justin Szeto, Naz Sepah, Edward Raff, Kanika Madan, Vikram Voleti, Samira Ebrahimi Kahou, Vincent Michalski, Dmitriy Serdyuk, Tal Arbel, Chris Pal, Gaël Varoquaux, Pascal Vincent
Strong empirical evidence that one machine-learning algorithm A outperforms another one B ideally calls for multiple trials optimizing the learning pipeline over sources of variation such as data sampling, data augmentation, parameter initialization, and hyperparameter choices.
1 code implementation • 16 Dec 2020 • Lihu Chen, Gaël Varoquaux, Fabian M. Suchanek
Biomedical entity linking aims to map biomedical mentions, such as diseases and drugs, to standard entities in a given knowledge base.
no code implementations • 3 Jul 2020 • Marine Le Morvan, Julie Josse, Thomas Moreau, Erwan Scornet, Gaël Varoquaux
We provide an upper bound on the Bayes risk of NeuMiss networks, and show that they have good predictive accuracy with both a number of parameters and a computational complexity independent of the number of missing data patterns.
no code implementations • 5 Mar 2020 • Kamalaker Dadi, Gaël Varoquaux, Antonia Machlouzarides-Shalit, Krzysztof J. Gorgolewski, Demian Wassermann, Bertrand Thirion, Arthur Mensch
We demonstrate the benefits of extracting reduced signals on our fine-grain atlases for many classic functional data analysis pipelines: stimuli decoding from 12,334 brain responses, standard GLM analysis of fMRI across sessions and individuals, extraction of resting-state functional-connectomes biomarkers for 2,500 individuals, data compression and meta-analysis over more than 15,000 statistical maps.
no code implementations • 21 Feb 2020 • Jérôme Dockès, Russell Poldrack, Romain Primet, Hande Gözükan, Tal Yarkoni, Fabian Suchanek, Bertrand Thirion, Gaël Varoquaux
Reaching a global view of brain organization requires assembling evidence on widely different mental processes and mechanisms.
1 code implementation • 3 Feb 2020 • Marine Le Morvan, Nicolas Prost, Julie Josse, Erwan Scornet, Gaël Varoquaux
In the particular Gaussian case, it can be written as a linear function of multiway interactions between the observed data and the various missing-value indicators.
no code implementations • NeurIPS Workshop Neuro_AI 2019 • Gaël Varoquaux, Kamalaker Dadi, Arthur Mensch
Here we consider atlases used to parcellate the brain when studying brain function.
1 code implementation • 3 Jul 2019 • Patricio Cerda, Gaël Varoquaux
We introduce two encoding approaches for string categories: a Gamma-Poisson matrix factorization on substring counts, and the min-hash encoder, for fast approximation of string similarities.
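The min-hash idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the character 3-grams, the salted `blake2b` hash, and the function names `char_ngrams`, `minhash_encode` and `estimated_similarity` are all choices made here for the example. Each string is encoded as the minimum hash of its n-grams under several hash functions, and the fraction of matching components estimates the Jaccard similarity of the two n-gram sets.

```python
import hashlib


def char_ngrams(text, n=3):
    """Return the set of character n-grams of a space-padded string."""
    padded = f" {text.lower()} "
    # max(..., 1) keeps at least one (possibly short) n-gram for tiny inputs.
    return {padded[i:i + n] for i in range(max(len(padded) - n + 1, 1))}


def _salted_hash(ngram, seed):
    """Deterministic 64-bit hash of an n-gram; the seed selects the hash."""
    digest = hashlib.blake2b(
        ngram.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_encode(text, n_components=64, n=3):
    """Encode a string as the per-seed minimum hash over its n-grams."""
    grams = char_ngrams(text, n)
    return [min(_salted_hash(g, seed) for g in grams)
            for seed in range(n_components)]


def estimated_similarity(code_a, code_b):
    """Fraction of matching components, an estimate of Jaccard similarity."""
    matches = sum(a == b for a, b in zip(code_a, code_b))
    return matches / len(code_a)
```

For example, `estimated_similarity(minhash_encode("police officer"), minhash_encode("police officers"))` should be high because the two strings share most of their 3-grams, while strings with no 3-gram in common should score zero up to hash collisions.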
3 code implementations • 19 Feb 2019 • Julie Josse, Jacob M. Chen, Nicolas Prost, Erwan Scornet, Gaël Varoquaux
A striking result is that the widely-used method of imputing with a constant, such as the mean, prior to learning is consistent when missing values are not informative.
1 code implementation • 17 Sep 2018 • Arthur Mensch, Julien Mairal, Bertrand Thirion, Gaël Varoquaux
Analyzing data across studies could bring more statistical power; yet the current brain-imaging analytic framework cannot be used at scale as it requires casting all cognitive tasks in a unified theoretical framework.
no code implementations • 17 Sep 2018 • Andre Manoel, Florent Krzakala, Gaël Varoquaux, Bertrand Thirion, Lenka Zdeborová
We introduce an iterative optimization scheme for convex objectives consisting of a linear loss and a non-separable penalty, based on the expectation-consistent approximation and the vector approximate message-passing (VAMP) algorithm.
2 code implementations • 4 Jun 2018 • Patricio Cerda, Gaël Varoquaux, Balázs Kégl
We show that a simple approach that exposes the redundancy to the learning algorithm brings significant gains.
no code implementations • 4 Jun 2018 • Jérôme Dockès, Demian Wassermann, Russell Poldrack, Fabian Suchanek, Bertrand Thirion, Gaël Varoquaux
In this paper, we propose to mine brain medical publications to learn the spatial distribution associated with anatomical terms.
1 code implementation • NeurIPS 2017 • Arthur Mensch, Julien Mairal, Danilo Bzdok, Bertrand Thirion, Gaël Varoquaux
Cognitive neuroscience is enjoying a rapid increase in extensive public brain-imaging datasets.
1 code implementation • 23 Jun 2017 • Gaël Varoquaux
Predictive models ground many state-of-the-art developments in statistical brain image analysis: decoding, MVPA, searchlight, or extraction of biomarkers.
1 code implementation • 30 Nov 2016 • Arthur Mensch, Julien Mairal, Gaël Varoquaux, Bertrand Thirion
We present a matrix factorization algorithm that scales to input matrices that are large in both dimensions (i.e., that contain more than 1TB of data).
no code implementations • 18 Nov 2016 • Alexandre Abraham, Michael Milham, Adriana Di Martino, R. Cameron Craddock, Dimitris Samaras, Bertrand Thirion, Gaël Varoquaux
These R-fMRI pipelines build participant-specific connectomes from functionally-defined brain areas.
1 code implementation • 15 Sep 2016 • Andrés Hoyos-Idrobo, Gaël Varoquaux, Jonas Kahn, Bertrand Thirion
Our goal is to summarize the data to decrease computational costs and memory footprint of subsequent analysis.
no code implementations • 21 Jun 2016 • Gaël Varoquaux, Matthieu Kowalski, Bertrand Thirion
Spatially-sparse predictors are good models for brain decoding: they give accurate predictions and their weight maps are interpretable as they focus on a small number of regions.
1 code implementation • 16 Jun 2016 • Gaël Varoquaux, Pradeep Reddy Raamana, Denis Engemann, Andrés Hoyos-Idrobo, Yannick Schwartz, Bertrand Thirion
Decoding, i.e. prediction from brain images or signals, calls for empirical evaluation of its predictive power.
1 code implementation • ICML 2017 • Eugene Belilovsky, Kyle Kastner, Gaël Varoquaux, Matthew Blaschko
Learning this function brings two benefits: it implicitly models the desired structure or sparsity properties to form suitable priors, and it can be tailored to the specific problem of edge structure discovery, rather than maximizing data likelihood.
1 code implementation • 3 May 2016 • Arthur Mensch, Julien Mairal, Bertrand Thirion, Gaël Varoquaux
Sparse matrix factorization is a popular tool to obtain interpretable data decompositions, which are also effective to perform data completion or denoising.
Ranked #13 on Recommendation Systems on MovieLens 10M
no code implementations • 8 Feb 2016 • Arthur Mensch, Gaël Varoquaux, Bertrand Thirion
We present a method for fast resting-state fMRI spatial decompositions of very large datasets, based on the reduction of the temporal dimension before applying dictionary learning on concatenated individual records from groups of subjects.
no code implementations • NeurIPS 2016 • Eugene Belilovsky, Gaël Varoquaux, Matthew B. Blaschko
We characterize the uncertainty of differences with confidence intervals obtained using a parametric distribution on parameters of a sparse estimator.
no code implementations • 22 Dec 2015 • Gaël Varoquaux, Michael Eickenberg, Elvis Dohmatob, Bertrand Thirion
The total variation (TV) penalty, as many other analysis-sparsity problems, does not lead to separable factors or a proximal operator with a closed-form expression, such as soft thresholding for the $\ell_1$ penalty.
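For reference, the closed-form case mentioned here is the standard soft-thresholding operator. The notation below (threshold $\lambda > 0$, input vector $x$, coordinate index $i$) is introduced for this illustration and is not taken from the paper. The operator is separable, so it acts on each coordinate independently:

$$
\operatorname{prox}_{\lambda \|\cdot\|_1}(x)_i
= \arg\min_{z_i} \; \tfrac{1}{2}(z_i - x_i)^2 + \lambda |z_i|
= \operatorname{sign}(x_i)\,\max(|x_i| - \lambda,\, 0)
$$

The TV penalty applies the $\ell_1$ norm to image gradients, which couple neighboring coordinates, so this per-coordinate formula no longer applies.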
no code implementations • 15 Nov 2013 • Yannick Schwartz, Bertrand Thirion, Gaël Varoquaux
Imaging neuroscience links brain activation maps to behavior and cognition via correlational studies.
4 code implementations • 1 Sep 2013 • Lars Buitinck, Gilles Louppe, Mathieu Blondel, Fabian Pedregosa, Andreas Mueller, Olivier Grisel, Vlad Niculae, Peter Prettenhofer, Alexandre Gramfort, Jaques Grobler, Robert Layton, Jake Vanderplas, Arnaud Joly, Brian Holt, Gaël Varoquaux
Scikit-learn is an increasingly popular machine learning library.
3 code implementations • 2 Jan 2012 • Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Andreas Müller, Joel Nothman, Gilles Louppe, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Édouard Duchesnay
Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems.