no code implementations • 8 May 2022 • Hammaad Adam, Ming Ying Yang, Kenrick Cato, Ioana Baldini, Charles Senteio, Leo Anthony Celi, Jiaming Zeng, Moninder Singh, Marzyeh Ghassemi
In this study, we investigate the level of implicit race information available to ML models and human experts and the implications of model-detectable differences in clinical notes.
no code implementations • 6 May 2022 • Aparna Balagopalan, Haoran Zhang, Kimia Hamidieh, Thomas Hartvigsen, Frank Rudzicz, Marzyeh Ghassemi
Across two different blackbox model architectures and four popular explainability methods, we find that the approximation quality of explanation models, also known as the fidelity, differs significantly between subgroups.
1 code implementation • 23 Mar 2022 • Haoran Zhang, Natalie Dullerud, Karsten Roth, Lauren Oakden-Rayner, Stephen Robert Pfohl, Marzyeh Ghassemi
We also find that methods which achieve group fairness do so by worsening performance for all groups.
no code implementations • ICLR 2022 • Natalie Dullerud, Karsten Roth, Kimia Hamidieh, Nicolas Papernot, Marzyeh Ghassemi
Deep metric learning (DML) enables learning with less supervision through its emphasis on the similarity structure of representations.
1 code implementation • 17 Mar 2022 • Mehdi Fatemi, Mary Wu, Jeremy Petch, Walter Nelson, Stuart J. Connolly, Alexander Benz, Anthony Carnicelli, Marzyeh Ghassemi
Finally, we apply our new algorithms to a real-world offline dataset pertaining to warfarin dosing for stroke prevention and demonstrate similar results.
1 code implementation • NeurIPS 2021 • Haoran Zhang, Quaid Morris, Berk Ustun, Marzyeh Ghassemi
Our results show that our method can fit simple predictive checklists that perform well and that can easily be customized to obey a rich class of custom constraints.
no code implementations • 17 Oct 2021 • Zining Zhu, Aparna Balagopalan, Marzyeh Ghassemi, Frank Rudzicz
This framework allows us to compare across datasets, saying that, apart from a set of "shortcut features", classifying each sample in the Multi-NLI task involves around 0.4 nats more TSI than in the Quora Question Pair.
1 code implementation • NeurIPS 2021 • Mehdi Fatemi, Taylor W. Killian, Jayakumar Subramanian, Marzyeh Ghassemi
Machine learning has successfully framed many sequential decision making problems as either supervised prediction, or optimal decision-making policy identification via reinforcement learning.
no code implementations • ICLR 2022 • Jimmy Ba, Murat A Erdogdu, Marzyeh Ghassemi, Shengyang Sun, Taiji Suzuki, Denny Wu, Tianzong Zhang
Stein variational gradient descent (SVGD) is a deterministic inference algorithm that evolves a set of particles to fit a target distribution.
no code implementations • ICLR 2022 • Rob Brekelmans, Sicong Huang, Marzyeh Ghassemi, Greg Ver Steeg, Roger Baker Grosse, Alireza Makhzani
Since naive importance sampling with the marginal density as a proposal requires exponential sample complexity in the true mutual information, we propose novel Multi-Sample Annealed Importance Sampling (AIS) bounds on mutual information.
1 code implementation • 27 Aug 2021 • Sindhu C. M. Gowda, Shalmali Joshi, Haoran Zhang, Marzyeh Ghassemi
This systematic investigation underlines the importance of accounting for the underlying data-generating mechanisms and fortifying data-preprocessing pipelines with a causal framework to develop methods robust to confounding biases.
1 code implementation • 27 Aug 2021 • Stephen R. Pfohl, Haoran Zhang, Yizhe Xu, Agata Foryciarz, Marzyeh Ghassemi, Nigam H. Shah
Predictive models for clinical outcomes that are accurate on average in a patient population may underperform drastically for some subpopulations, potentially introducing or reinforcing inequities in care access and quality.
no code implementations • 21 Jul 2021 • Imon Banerjee, Ananth Reddy Bhimireddy, John L. Burns, Leo Anthony Celi, Li-Ching Chen, Ramon Correa, Natalie Dullerud, Marzyeh Ghassemi, Shih-Cheng Huang, Po-Chih Kuo, Matthew P Lungren, Lyle Palmer, Brandon J Price, Saptarshi Purkayastha, Ayis Pyrros, Luke Oakden-Rayner, Chima Okechukwu, Laleh Seyyed-Kalantari, Hari Trivedi, Ryan Wang, Zachary Zaiman, Haoran Zhang, Judy W Gichoya
Methods: Using private and public datasets we evaluate: A) performance quantification of deep learning models to detect race from medical images, including the ability of these models to generalize to external environments and across multiple imaging modalities, B) assessment of possible confounding anatomic and phenotype population features, such as disease distribution and body habitus as predictors of race, and C) investigation into the underlying mechanism by which AI models can recognize race.
2 code implementations • NeurIPS 2021 • Timo Milbich, Karsten Roth, Samarth Sinha, Ludwig Schmidt, Marzyeh Ghassemi, Björn Ommer
Finally, we propose few-shot DML as an efficient way to consistently improve generalization in response to unknown test shifts presented in ooDML.
1 code implementation • 20 Mar 2021 • Haoran Zhang, Natalie Dullerud, Laleh Seyyed-Kalantari, Quaid Morris, Shalmali Joshi, Marzyeh Ghassemi
In this work, we benchmark the performance of eight domain generalization methods on multi-site clinical time series and medical imaging data.
1 code implementation • 23 Nov 2020 • Taylor W. Killian, Haoran Zhang, Jayakumar Subramanian, Mehdi Fatemi, Marzyeh Ghassemi
Reinforcement Learning (RL) has recently been applied to sequential estimation and prediction problems identifying and developing hypothetical treatment strategies for septic patients, with a particular focus on offline learning with observational data.
1 code implementation • 17 Nov 2020 • Shirly Wang, Seung Eun Yi, Shalmali Joshi, Marzyeh Ghassemi
Reliable treatment effect estimation from observational data depends on the availability of all confounding information.
no code implementations • 30 Oct 2020 • Nathan Ng, Marzyeh Ghassemi, Narendran Thangarajan, Jiacheng Pan, Qi Guo
In ablations on DBDC4 data from 2019, our semi-supervised learning methods improve the performance of a baseline BERT model by 2% accuracy.
no code implementations • EMNLP (ClinicalNLP) 2020 • Alister D Costa, Stefan Denkovski, Michal Malyska, Sae Young Moon, Brandon Rufino, Zhen Yang, Taylor Killian, Marzyeh Ghassemi
Next, we present MSBC, a classifier that applies MS-BERT to generate embeddings and predict EDSS and functional subscores.
no code implementations • 13 Oct 2020 • Vinith M. Suriyakumar, Nicolas Papernot, Anna Goldenberg, Marzyeh Ghassemi
Our results highlight lesser-known limitations of methods for DP learning in health care, models that exhibit steep tradeoffs between privacy and utility, and models whose predictions are disproportionately influenced by large demographic groups in the training data.
no code implementations • 23 Sep 2020 • Irene Y. Chen, Shalmali Joshi, Marzyeh Ghassemi, Rajesh Ranganath
Machine learning can be used to make sense of healthcare data.
no code implementations • 22 Sep 2020 • Irene Y. Chen, Emma Pierson, Sherri Rose, Shalmali Joshi, Kadija Ferryman, Marzyeh Ghassemi
The use of machine learning (ML) in health care raises numerous ethical concerns, especially as models can amplify existing health inequities.
1 code implementation • EMNLP 2020 • Nathan Ng, Kyunghyun Cho, Marzyeh Ghassemi
Models that perform well on a training domain often fail to generalize to out-of-domain (OOD) examples.
1 code implementation • 17 Sep 2020 • Karsten Roth, Timo Milbich, Björn Ommer, Joseph Paul Cohen, Marzyeh Ghassemi
Deep Metric Learning (DML) provides a crucial tool for visual similarity and zero-shot applications by learning generalizing embedding spaces, although recent work in DML has shown strong performance saturation across training objectives.
Ranked #6 on Metric Learning on CARS196 (using extra training data)
1 code implementation • 20 Jul 2020 • Matthew B. A. McDermott, Bret Nestor, Evan Kim, Wancong Zhang, Anna Goldenberg, Peter Szolovits, Marzyeh Ghassemi
Multi-task learning (MTL) is a machine learning technique aiming to improve model performance by leveraging information across many tasks.
no code implementations • 30 Jun 2020 • Samarth Sinha, Karsten Roth, Anirudh Goyal, Marzyeh Ghassemi, Hugo Larochelle, Animesh Garg
Deep Neural Networks have shown great promise on a variety of downstream applications; but their ability to adapt and generalize to new data and tasks remains a challenge.
1 code implementation • 26 Jun 2020 • Matthew B. A. McDermott, Tzu Ming Harry Hsu, Wei-Hung Weng, Marzyeh Ghassemi, Peter Szolovits
CheXpert is very useful, but is relatively computationally slow, especially when integrated with end-to-end neural pipelines, is non-differentiable so can't be used in any applications that require gradients to flow through the labeler, and does not yield probabilistic outputs, which limits our ability to improve the quality of the silver labeler through techniques such as active learning.
5 code implementations • 22 Jun 2020 • Joseph Paul Cohen, Paul Morrison, Lan Dao, Karsten Roth, Tim Q Duong, Marzyeh Ghassemi
This dataset currently contains hundreds of frontal view X-rays and is the largest public resource for COVID-19 image and prognostic data, making it a necessary resource to develop and evaluate tools to aid in the treatment of COVID-19.
no code implementations • 20 Jun 2020 • Taylor W. Killian, Marzyeh Ghassemi, Shalmali Joshi
Domain shift, encountered when using a trained model for a new patient population, creates significant challenges for sequential decision making in healthcare since the target domain may be both data-scarce and confounded.
6 code implementations • 24 May 2020 • Joseph Paul Cohen, Lan Dao, Paul Morrison, Karsten Roth, Yoshua Bengio, Beiyi Shen, Almas Abbasi, Mahsa Hoshmand-Kochi, Marzyeh Ghassemi, Haifang Li, Tim Q Duong
In this study, we present a severity score prediction model for COVID-19 pneumonia for frontal chest X-ray images.
1 code implementation • 11 Mar 2020 • Haoran Zhang, Amy X. Lu, Mohamed Abdalla, Matthew McDermott, Marzyeh Ghassemi
In this work, we examine the extent to which embeddings may encode marginalized populations differently, and how this may lead to a perpetuation of biases and worsened performance on clinical tasks.
1 code implementation • 14 Feb 2020 • Laleh Seyyed-Kalantari, Guanxiong Liu, Matthew McDermott, Irene Y. Chen, Marzyeh Ghassemi
We demonstrate that TPR disparities exist in the state-of-the-art classifiers in all datasets, for all clinical tasks, and all subgroups.
Ranked #1 on Multi-Label Classification on ChestX-ray14
no code implementations • 4 Dec 2019 • Aparna Balagopalan, Jekaterina Novikova, Matthew B. A. McDermott, Bret Nestor, Tristan Naumann, Marzyeh Ghassemi
We learn mappings from other languages to English and detect aphasia from linguistic characteristics of speech, and show that OT domain adaptation improves aphasia detection over unilingual baselines for French (6% increased F1) and Mandarin (5% increased F1).
no code implementations • Approximate Inference AABI Symposium 2019 • Jimmy Ba, Murat A. Erdogdu, Marzyeh Ghassemi, Taiji Suzuki, Shengyang Sun, Denny Wu, Tianzong Zhang
A particle-based inference algorithm is a promising method for efficiently generating samples from an intractable target distribution by iteratively updating a set of particles.
1 code implementation • 2 Aug 2019 • Bret Nestor, Matthew B. A. McDermott, Willie Boag, Gabriela Berner, Tristan Naumann, Michael C. Hughes, Anna Goldenberg, Marzyeh Ghassemi
When training clinical prediction models from electronic health records (EHRs), a key concern should be a model's ability to sustain performance over time when deployed, even as care practices, database systems, and population demographics evolve.
2 code implementations • 19 Jul 2019 • Shirly Wang, Matthew B. A. McDermott, Geeticka Chauhan, Michael C. Hughes, Tristan Naumann, Marzyeh Ghassemi
Robust machine learning relies on access to data that can be used with standardized frameworks in important tasks and the ability to develop models whose performance can be reasonably reproduced.
Ranked #3 on Length-of-Stay prediction on MIMIC-III
no code implementations • 2 Jul 2019 • Matthew B. A. McDermott, Shirly Wang, Nikki Marinsek, Rajesh Ranganath, Marzyeh Ghassemi, Luca Foschini
Machine learning algorithms designed to characterize, monitor, and intervene on human health (ML4H) are expected to perform safely and reliably when operating at scale, potentially outside strict human supervision.
1 code implementation • NeurIPS 2019 • Alex X. Lu, Amy X. Lu, Wiebke Schormann, Marzyeh Ghassemi, David W. Andrews, Alan M. Moses
Understanding if classifiers generalize to out-of-sample datasets is a central problem in machine learning.
1 code implementation • 4 Apr 2019 • Guanxiong Liu, Tzu-Ming Harry Hsu, Matthew McDermott, Willie Boag, Wei-Hung Weng, Peter Szolovits, Marzyeh Ghassemi
The automatic generation of radiology reports given medical radiographs has significant potential to operationally and clinically improve patient care.
no code implementations • 30 Nov 2018 • Bret Nestor, Matthew B. A. McDermott, Geeticka Chauhan, Tristan Naumann, Michael C. Hughes, Anna Goldenberg, Marzyeh Ghassemi
Machine learning for healthcare often trains models on de-identified datasets with randomly-shifted calendar dates, ignoring the fact that data were generated under hospital operation practices that change over time.
1 code implementation • 29 Nov 2018 • Aparna Balagopalan, Jekaterina Novikova, Frank Rudzicz, Marzyeh Ghassemi
We analyze the impact of the age of the added samples and whether they affect fairness in classification.
no code implementations • 17 Nov 2018 • Natalia Antropova, Andrew L. Beam, Brett K. Beaulieu-Jones, Irene Chen, Corey Chivers, Adrian Dalca, Sam Finlayson, Madalina Fiterau, Jason Alan Fries, Marzyeh Ghassemi, Mike Hughes, Bruno Jedynak, Jasvinder S. Kandola, Matthew McDermott, Tristan Naumann, Peter Schulam, Farah Shamout, Alexandre Yahi
This volume represents the accepted submissions from the Machine Learning for Health (ML4H) workshop at the conference on Neural Information Processing Systems (NeurIPS) 2018, held on December 8, 2018 in Montreal, Canada.
1 code implementation • 13 Oct 2018 • Marzyeh Ghassemi, Mahima Pushkarna, James Wexler, Jesse Johnson, Paul Varghese
Making decisions about what clinical tasks to prepare for is multi-factored, and especially challenging in intensive care environments where resources must be balanced with patient needs.
Human-Computer Interaction
1 code implementation • 11 Aug 2018 • Willie Boag, Harini Suresh, Leo Anthony Celi, Peter Szolovits, Marzyeh Ghassemi
There are established racial disparities in healthcare, including during end-of-life care, when poor communication and trust can lead to suboptimal outcomes for patients and their families.
Applications
1 code implementation • 30 Jun 2018 • Willie Boag, Harini Suresh, Leo Anthony Celi, Peter Szolovits, Marzyeh Ghassemi
In this work, we characterize the doctor-patient relationship using a machine learning-derived trust score.
no code implementations • 1 Jun 2018 • Marzyeh Ghassemi, Tristan Naumann, Peter Schulam, Andrew L. Beam, Irene Y. Chen, Rajesh Ranganath
Modern electronic health records (EHRs) provide data to answer clinically meaningful questions.
no code implementations • 2 Dec 2017 • Maggie Makar, Marzyeh Ghassemi, David Cutler, Ziad Obermeyer
Risk prediction is central to both clinical medicine and public health.
2 code implementations • 27 Nov 2017 • Aniruddh Raghu, Matthieu Komorowski, Imran Ahmed, Leo Celi, Peter Szolovits, Marzyeh Ghassemi
Sepsis is a leading cause of mortality in intensive care units and costs hospitals billions annually.
no code implementations • 23 May 2017 • Aniruddh Raghu, Matthieu Komorowski, Leo Anthony Celi, Peter Szolovits, Marzyeh Ghassemi
In this work, we propose a new approach to deduce optimal treatment policies for septic patients by using continuous state-space models and deep reinforcement learning.
no code implementations • 23 May 2017 • Harini Suresh, Nathan Hunt, Alistair Johnson, Leo Anthony Celi, Peter Szolovits, Marzyeh Ghassemi
Real-time prediction of clinical interventions remains a challenge within intensive care units (ICUs).
no code implementations • 20 Mar 2017 • Harini Suresh, Peter Szolovits, Marzyeh Ghassemi
We use autoencoders to create low-dimensional embeddings of underlying patient phenotypes that we hypothesize are a governing factor in determining how different patients will react to different interventions.
no code implementations • 8 Aug 2016 • Marzyeh Ghassemi, Zeeshan Syed, Daryush D. Mehta, Jarrad H. Van Stan, Robert E. Hillman, John V. Guttag
Voice disorders affect an estimated 14 million working-aged Americans, and many more worldwide.