1 code implementation • 12 Sep 2024 • Ethan Steinberg, Michael Wornow, Suhana Bedi, Jason Alan Fries, Matthew B. A. McDermott, Nigam H. Shah
The growing demand for machine learning in healthcare requires processing increasingly large electronic health record (EHR) datasets, but existing pipelines are neither computationally efficient nor scalable.
1 code implementation • 28 Jun 2024 • Justin Xu, Jack Gallifant, Alistair E. W. Johnson, Matthew B. A. McDermott
This library is designed to simultaneously simplify the development of tasks/cohorts for ML in healthcare and enable the reproduction of those cohorts, both at an exact level for single datasets and at a conceptual level across datasets.
2 code implementations • 11 Jan 2024 • Matthew B. A. McDermott, Lasse Hyldig Hansen, Haoran Zhang, Giovanni Angelotti, Jack Gallifant
In machine learning (ML), a widespread adage is that the area under the precision-recall curve (AUPRC) is a superior metric for model comparison to the area under the receiver operating characteristic (AUROC) for binary classification tasks with class imbalance.
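For reference, the two metrics compared in that abstract can be computed from scratch. This is a minimal illustrative sketch, not code from the paper; the function names and toy data are hypothetical:

```python
# Illustrative sketch (not from the paper): AUROC and AUPRC (as average
# precision) computed from scratch on a small, class-imbalanced toy dataset.

def auroc(labels, scores):
    """P(score of a random positive > score of a random negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def auprc(labels, scores):
    """Average precision: mean of precision@k over the ranks of the positives."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    tp, ap = 0, 0.0
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            tp += 1
            ap += tp / k
    return ap / sum(labels)

# Imbalanced toy example: 2 positives among 8 examples.
y = [0, 0, 0, 1, 0, 0, 1, 0]
s = [0.1, 0.4, 0.35, 0.45, 0.2, 0.5, 0.9, 0.3]
print(round(auroc(y, s), 3), round(auprc(y, s), 3))  # → 0.917 0.833
```

With one negative outranking one positive, the two metrics already diverge (11/12 vs. 5/6), which is the kind of behavior the paper examines under class imbalance.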
1 code implementation • NeurIPS 2023 • Matthew B. A. McDermott, Bret Nestor, Peniel Argaw, Isaac Kohane
Generative, pre-trained transformers (GPTs, a.k.a.
no code implementations • 30 Nov 2021 • Fabian Falck, Yuyin Zhou, Emma Rocheteau, Liyue Shen, Luis Oala, Girmaw Abebe, Subhrajit Roy, Stephen Pfohl, Emily Alsentzer, Matthew B. A. McDermott
A collection of the accepted abstracts for the Machine Learning for Health (ML4H) symposium 2021.
1 code implementation • 18 Mar 2021 • Matthew B. A. McDermott, Brendan Yap, Peter Szolovits, Marinka Zitnik
Based on this review, we introduce a descriptive framework for pre-training that allows for a granular, comprehensive understanding of how relational structure can be induced.
no code implementations • 31 Jan 2021 • Matthew B. A. McDermott, Brendan Yap, Harry Hsu, Di Jin, Peter Szolovits
Recent developments in Natural Language Processing (NLP) demonstrate that large-scale, self-supervised pre-training can be extremely beneficial for downstream tasks.
no code implementations • 19 Nov 2020 • Emily Alsentzer, Matthew B. A. McDermott, Fabian Falck, Suproteem K. Sarkar, Subhrajit Roy, Stephanie L. Hyland
A collection of the accepted abstracts for the Machine Learning for Health (ML4H) workshop at NeurIPS 2020.
1 code implementation • 20 Jul 2020 • Matthew B. A. McDermott, Bret Nestor, Evan Kim, Wancong Zhang, Anna Goldenberg, Peter Szolovits, Marzyeh Ghassemi
Multi-task learning (MTL) is a machine learning technique aiming to improve model performance by leveraging information across many tasks.
1 code implementation • 26 Jun 2020 • Matthew B. A. McDermott, Tzu Ming Harry Hsu, Wei-Hung Weng, Marzyeh Ghassemi, Peter Szolovits
CheXpert is very useful, but it is relatively computationally slow (especially when integrated with end-to-end neural pipelines), it is non-differentiable (so it cannot be used in applications that require gradients to flow through the labeler), and it does not yield probabilistic outputs, which limits our ability to improve the quality of the silver labeler through techniques such as active learning.
no code implementations • 5 Feb 2020 • Matthew B. A. McDermott, Emily Alsentzer, Sam Finlayson, Michael Oberst, Fabian Falck, Tristan Naumann, Brett K. Beaulieu-Jones, Adrian V. Dalca
A collection of the accepted abstracts for the Machine Learning for Health (ML4H) workshop at NeurIPS 2019.
no code implementations • 4 Dec 2019 • Aparna Balagopalan, Jekaterina Novikova, Matthew B. A. McDermott, Bret Nestor, Tristan Naumann, Marzyeh Ghassemi
We learn mappings from other languages to English and detect aphasia from linguistic characteristics of speech, and show that OT domain adaptation improves aphasia detection over unilingual baselines for French (6% increased F1) and Mandarin (5% increased F1).
1 code implementation • 22 Nov 2019 • Samuel G. Finlayson, Matthew B. A. McDermott, Alex V. Pickering, Scott L. Lipnick, Isaac S. Kohane
Modeling the relationship between chemical structure and molecular activity is a key goal in drug development.
1 code implementation • 2 Aug 2019 • Bret Nestor, Matthew B. A. McDermott, Willie Boag, Gabriela Berner, Tristan Naumann, Michael C. Hughes, Anna Goldenberg, Marzyeh Ghassemi
When training clinical prediction models from electronic health records (EHRs), a key concern should be a model's ability to sustain performance over time when deployed, even as care practices, database systems, and population demographics evolve.
2 code implementations • 19 Jul 2019 • Shirly Wang, Matthew B. A. McDermott, Geeticka Chauhan, Michael C. Hughes, Tristan Naumann, Marzyeh Ghassemi
Robust machine learning relies on access to data that can be used with standardized frameworks in important tasks and the ability to develop models whose performance can be reasonably reproduced.
Ranked #3 on Length-of-Stay prediction on MIMIC-III
no code implementations • 2 Jul 2019 • Matthew B. A. McDermott, Shirly Wang, Nikki Marinsek, Rajesh Ranganath, Marzyeh Ghassemi, Luca Foschini
Machine learning algorithms designed to characterize, monitor, and intervene on human health (ML4H) are expected to perform safely and reliably when operating at scale, potentially outside strict human supervision.
1 code implementation • WS 2019 • Geeticka Chauhan, Matthew B. A. McDermott, Peter Szolovits
Systematic comparison of methods for relation extraction (RE) is difficult: many experiments in the field are not described precisely enough to be fully reproducible, and many papers fail to report ablation studies that would highlight the relative contributions of their various combined techniques.
3 code implementations • WS 2019 • Emily Alsentzer, John R. Murphy, Willie Boag, Wei-Hung Weng, Di Jin, Tristan Naumann, Matthew B. A. McDermott
Contextual word embedding models such as ELMo (Peters et al., 2018) and BERT (Devlin et al., 2018) have dramatically improved performance for many natural language processing (NLP) tasks in recent months.
no code implementations • 30 Nov 2018 • Bret Nestor, Matthew B. A. McDermott, Geeticka Chauhan, Tristan Naumann, Michael C. Hughes, Anna Goldenberg, Marzyeh Ghassemi
Machine learning for healthcare often trains models on de-identified datasets with randomly-shifted calendar dates, ignoring the fact that data were generated under hospital operation practices that change over time.