Search Results for author: Marzyeh Ghassemi

Found 52 papers, 27 papers with code

Write It Like You See It: Detectable Differences in Clinical Notes By Race Lead To Differential Model Recommendations

no code implementations 8 May 2022 Hammaad Adam, Ming Ying Yang, Kenrick Cato, Ioana Baldini, Charles Senteio, Leo Anthony Celi, Jiaming Zeng, Moninder Singh, Marzyeh Ghassemi

In this study, we investigate the level of implicit race information available to ML models and human experts and the implications of model-detectable differences in clinical notes.

The Road to Explainability is Paved with Bias: Measuring the Fairness of Explanations

no code implementations 6 May 2022 Aparna Balagopalan, Haoran Zhang, Kimia Hamidieh, Thomas Hartvigsen, Frank Rudzicz, Marzyeh Ghassemi

Across two different blackbox model architectures and four popular explainability methods, we find that the approximation quality of explanation models, also known as the fidelity, differs significantly between subgroups.


Improving the Fairness of Chest X-ray Classifiers

1 code implementation 23 Mar 2022 Haoran Zhang, Natalie Dullerud, Karsten Roth, Lauren Oakden-Rayner, Stephen Robert Pfohl, Marzyeh Ghassemi

We also find that methods which achieve group fairness do so by worsening performance for all groups.


Semi-Markov Offline Reinforcement Learning for Healthcare

1 code implementation 17 Mar 2022 Mehdi Fatemi, Mary Wu, Jeremy Petch, Walter Nelson, Stuart J. Connolly, Alexander Benz, Anthony Carnicelli, Marzyeh Ghassemi

Finally, we apply our new algorithms to a real-world offline dataset pertaining to warfarin dosing for stroke prevention and demonstrate similar results.

Offline RL, reinforcement-learning

Learning Optimal Predictive Checklists

1 code implementation NeurIPS 2021 Haoran Zhang, Quaid Morris, Berk Ustun, Marzyeh Ghassemi

Our results show that our method can fit simple predictive checklists that perform well and that can easily be customized to obey a rich class of custom constraints.


Quantifying the Task-Specific Information in Text-Based Classifications

no code implementations 17 Oct 2021 Zining Zhu, Aparna Balagopalan, Marzyeh Ghassemi, Frank Rudzicz

This framework allows us to compare across datasets, saying that, apart from a set of "shortcut features", classifying each sample in the Multi-NLI task involves around 0.4 nats more TSI than in the Quora Question Pair.

Medical Dead-ends and Learning to Identify High-risk States and Treatments

1 code implementation NeurIPS 2021 Mehdi Fatemi, Taylor W. Killian, Jayakumar Subramanian, Marzyeh Ghassemi

Machine learning has successfully framed many sequential decision making problems as either supervised prediction, or optimal decision-making policy identification via reinforcement learning.

Decision Making, Frame

Understanding the Variance Collapse of SVGD in High Dimensions

no code implementations ICLR 2022 Jimmy Ba, Murat A Erdogdu, Marzyeh Ghassemi, Shengyang Sun, Taiji Suzuki, Denny Wu, Tianzong Zhang

Stein variational gradient descent (SVGD) is a deterministic inference algorithm that evolves a set of particles to fit a target distribution.

Improving Mutual Information Estimation with Annealed and Energy-Based Bounds

no code implementations ICLR 2022 Rob Brekelmans, Sicong Huang, Marzyeh Ghassemi, Greg Ver Steeg, Roger Baker Grosse, Alireza Makhzani

Since naive importance sampling with the marginal density as a proposal requires exponential sample complexity in the true mutual information, we propose novel Multi-Sample Annealed Importance Sampling (AIS) bounds on mutual information.

Mutual Information Estimation

Pulling Up by the Causal Bootstraps: Causal Data Augmentation for Pre-training Debiasing

1 code implementation 27 Aug 2021 Sindhu C. M. Gowda, Shalmali Joshi, Haoran Zhang, Marzyeh Ghassemi

This systematic investigation underlines the importance of accounting for the underlying data-generating mechanisms and fortifying data-preprocessing pipelines with a causal framework to develop methods robust to confounding biases.

Data Augmentation, Domain Generalization

A comparison of approaches to improve worst-case predictive model performance over patient subpopulations

1 code implementation 27 Aug 2021 Stephen R. Pfohl, Haoran Zhang, Yizhe Xu, Agata Foryciarz, Marzyeh Ghassemi, Nigam H. Shah

Predictive models for clinical outcomes that are accurate on average in a patient population may underperform drastically for some subpopulations, potentially introducing or reinforcing inequities in care access and quality.

Reading Race: AI Recognises Patient's Racial Identity In Medical Images

no code implementations 21 Jul 2021 Imon Banerjee, Ananth Reddy Bhimireddy, John L. Burns, Leo Anthony Celi, Li-Ching Chen, Ramon Correa, Natalie Dullerud, Marzyeh Ghassemi, Shih-Cheng Huang, Po-Chih Kuo, Matthew P Lungren, Lyle Palmer, Brandon J Price, Saptarshi Purkayastha, Ayis Pyrros, Luke Oakden-Rayner, Chima Okechukwu, Laleh Seyyed-Kalantari, Hari Trivedi, Ryan Wang, Zachary Zaiman, Haoran Zhang, Judy W Gichoya

Methods: Using private and public datasets we evaluate: A) performance quantification of deep learning models to detect race from medical images, including the ability of these models to generalize to external environments and across multiple imaging modalities, B) assessment of possible confounding anatomic and phenotype population features, such as disease distribution and body habitus as predictors of race, and C) investigation into the underlying mechanism by which AI models can recognize race.

Characterizing Generalization under Out-Of-Distribution Shifts in Deep Metric Learning

2 code implementations NeurIPS 2021 Timo Milbich, Karsten Roth, Samarth Sinha, Ludwig Schmidt, Marzyeh Ghassemi, Björn Ommer

Finally, we propose few-shot DML as an efficient way to consistently improve generalization in response to unknown test shifts presented in ooDML.

Metric Learning

An Empirical Framework for Domain Generalization in Clinical Settings

1 code implementation 20 Mar 2021 Haoran Zhang, Natalie Dullerud, Laleh Seyyed-Kalantari, Quaid Morris, Shalmali Joshi, Marzyeh Ghassemi

In this work, we benchmark the performance of eight domain generalization methods on multi-site clinical time series and medical imaging data.

Domain Generalization, Time Series

An Empirical Study of Representation Learning for Reinforcement Learning in Healthcare

1 code implementation 23 Nov 2020 Taylor W. Killian, Haoran Zhang, Jayakumar Subramanian, Mehdi Fatemi, Marzyeh Ghassemi

Reinforcement Learning (RL) has recently been applied to sequential estimation and prediction problems identifying and developing hypothetical treatment strategies for septic patients, with a particular focus on offline learning with observational data.

reinforcement-learning, Representation Learning

Confounding Feature Acquisition for Causal Effect Estimation

1 code implementation 17 Nov 2020 Shirly Wang, Seung Eun Yi, Shalmali Joshi, Marzyeh Ghassemi

Reliable treatment effect estimation from observational data depends on the availability of all confounding information.

Causal Inference, Frame

Improving Dialogue Breakdown Detection with Semi-Supervised Learning

no code implementations 30 Oct 2020 Nathan Ng, Marzyeh Ghassemi, Narendran Thangarajan, Jiacheng Pan, Qi Guo

In ablations on DBDC4 data from 2019, our semi-supervised learning methods improve the performance of a baseline BERT model by 2% accuracy.

Data Augmentation

Chasing Your Long Tails: Differentially Private Prediction in Health Care Settings

no code implementations 13 Oct 2020 Vinith M. Suriyakumar, Nicolas Papernot, Anna Goldenberg, Marzyeh Ghassemi

Our results highlight lesser-known limitations of methods for DP learning in health care, models that exhibit steep tradeoffs between privacy and utility, and models whose predictions are disproportionately influenced by large demographic groups in the training data.

Fairness, Mortality Prediction

Ethical Machine Learning in Health Care

no code implementations 22 Sep 2020 Irene Y. Chen, Emma Pierson, Sherri Rose, Shalmali Joshi, Kadija Ferryman, Marzyeh Ghassemi

The use of machine learning (ML) in health care raises numerous ethical concerns, especially as models can amplify existing health inequities.


S2SD: Simultaneous Similarity-based Self-Distillation for Deep Metric Learning

1 code implementation 17 Sep 2020 Karsten Roth, Timo Milbich, Björn Ommer, Joseph Paul Cohen, Marzyeh Ghassemi

Deep Metric Learning (DML) provides a crucial tool for visual similarity and zero-shot applications by learning generalizing embedding spaces, although recent work in DML has shown strong performance saturation across training objectives.

Ranked #6 on Metric Learning on CARS196 (using extra training data)

Knowledge Distillation, Metric Learning

Uniform Priors for Data-Efficient Transfer

no code implementations 30 Jun 2020 Samarth Sinha, Karsten Roth, Anirudh Goyal, Marzyeh Ghassemi, Hugo Larochelle, Animesh Garg

Deep Neural Networks have shown great promise on a variety of downstream applications; but their ability to adapt and generalize to new data and tasks remains a challenge.

Domain Adaptation, Meta-Learning

CheXpert++: Approximating the CheXpert labeler for Speed, Differentiability, and Probabilistic Output

1 code implementation 26 Jun 2020 Matthew B. A. McDermott, Tzu Ming Harry Hsu, Wei-Hung Weng, Marzyeh Ghassemi, Peter Szolovits

CheXpert is very useful, but is relatively computationally slow, especially when integrated with end-to-end neural pipelines, is non-differentiable so can't be used in any applications that require gradients to flow through the labeler, and does not yield probabilistic outputs, which limits our ability to improve the quality of the silver labeler through techniques such as active learning.

Active Learning

COVID-19 Image Data Collection: Prospective Predictions Are the Future

5 code implementations 22 Jun 2020 Joseph Paul Cohen, Paul Morrison, Lan Dao, Karsten Roth, Tim Q Duong, Marzyeh Ghassemi

This dataset currently contains hundreds of frontal view X-rays and is the largest public resource for COVID-19 image and prognostic data, making it a necessary resource to develop and evaluate tools to aid in the treatment of COVID-19.

Counterfactually Guided Off-policy Transfer in Clinical Settings

no code implementations 20 Jun 2020 Taylor W. Killian, Marzyeh Ghassemi, Shalmali Joshi

Domain shift, encountered when using a trained model for a new patient population, creates significant challenges for sequential decision making in healthcare since the target domain may be both data-scarce and confounded.

Decision Making

Hurtful Words: Quantifying Biases in Clinical Contextual Word Embeddings

1 code implementation 11 Mar 2020 Haoran Zhang, Amy X. Lu, Mohamed Abdalla, Matthew McDermott, Marzyeh Ghassemi

In this work, we examine the extent to which embeddings may encode marginalized populations differently, and how this may lead to a perpetuation of biases and worsened performance on clinical tasks.

Fairness, Word Embeddings

CheXclusion: Fairness gaps in deep chest X-ray classifiers

1 code implementation 14 Feb 2020 Laleh Seyyed-Kalantari, Guanxiong Liu, Matthew McDermott, Irene Y. Chen, Marzyeh Ghassemi

We demonstrate that TPR disparities exist in the state-of-the-art classifiers in all datasets, for all clinical tasks, and all subgroups.

Fairness

Cross-Language Aphasia Detection using Optimal Transport Domain Adaptation

no code implementations 4 Dec 2019 Aparna Balagopalan, Jekaterina Novikova, Matthew B. A. McDermott, Bret Nestor, Tristan Naumann, Marzyeh Ghassemi

We learn mappings from other languages to English and detect aphasia from linguistic characteristics of speech, and show that OT domain adaptation improves aphasia detection over unilingual baselines for French (6% increased F1) and Mandarin (5% increased F1).

Domain Adaptation

Towards Characterizing the High-dimensional Bias of Kernel-based Particle Inference Algorithms

no code implementations AABI Symposium 2019 Jimmy Ba, Murat A. Erdogdu, Marzyeh Ghassemi, Taiji Suzuki, Shengyang Sun, Denny Wu, Tianzong Zhang

Particle-based inference algorithms are a promising method to efficiently generate samples for an intractable target distribution by iteratively updating a set of particles.

Feature Robustness in Non-stationary Health Records: Caveats to Deployable Model Performance in Common Clinical Machine Learning Tasks

1 code implementation 2 Aug 2019 Bret Nestor, Matthew B. A. McDermott, Willie Boag, Gabriela Berner, Tristan Naumann, Michael C. Hughes, Anna Goldenberg, Marzyeh Ghassemi

When training clinical prediction models from electronic health records (EHRs), a key concern should be a model's ability to sustain performance over time when deployed, even as care practices, database systems, and population demographics evolve.

De-identification, Length-of-Stay prediction

MIMIC-Extract: A Data Extraction, Preprocessing, and Representation Pipeline for MIMIC-III

2 code implementations 19 Jul 2019 Shirly Wang, Matthew B. A. McDermott, Geeticka Chauhan, Michael C. Hughes, Tristan Naumann, Marzyeh Ghassemi

Robust machine learning relies on access to data that can be used with standardized frameworks in important tasks and the ability to develop models whose performance can be reasonably reproduced.

Length-of-Stay prediction, Outlier Detection

Reproducibility in Machine Learning for Health

no code implementations 2 Jul 2019 Matthew B. A. McDermott, Shirly Wang, Nikki Marinsek, Rajesh Ranganath, Marzyeh Ghassemi, Luca Foschini

Machine learning algorithms designed to characterize, monitor, and intervene on human health (ML4H) are expected to perform safely and reliably when operating at scale, potentially outside strict human supervision.

Clinically Accurate Chest X-Ray Report Generation

1 code implementation 4 Apr 2019 Guanxiong Liu, Tzu-Ming Harry Hsu, Matthew McDermott, Willie Boag, Wei-Hung Weng, Peter Szolovits, Marzyeh Ghassemi

The automatic generation of radiology reports given medical radiographs has significant potential to operationally and clinically improve patient care.

Text Generation

Rethinking clinical prediction: Why machine learning must consider year of care and feature aggregation

no code implementations 30 Nov 2018 Bret Nestor, Matthew B. A. McDermott, Geeticka Chauhan, Tristan Naumann, Michael C. Hughes, Anna Goldenberg, Marzyeh Ghassemi

Machine learning for healthcare often trains models on de-identified datasets with randomly-shifted calendar dates, ignoring the fact that data were generated under hospital operation practices that change over time.

Mortality Prediction

Machine Learning for Health (ML4H) Workshop at NeurIPS 2018

no code implementations 17 Nov 2018 Natalia Antropova, Andrew L. Beam, Brett K. Beaulieu-Jones, Irene Chen, Corey Chivers, Adrian Dalca, Sam Finlayson, Madalina Fiterau, Jason Alan Fries, Marzyeh Ghassemi, Mike Hughes, Bruno Jedynak, Jasvinder S. Kandola, Matthew McDermott, Tristan Naumann, Peter Schulam, Farah Shamout, Alexandre Yahi

This volume represents the accepted submissions from the Machine Learning for Health (ML4H) workshop at the conference on Neural Information Processing Systems (NeurIPS) 2018, held on December 8, 2018 in Montreal, Canada.

ClinicalVis: Supporting Clinical Task-Focused Design Evaluation

1 code implementation 13 Oct 2018 Marzyeh Ghassemi, Mahima Pushkarna, James Wexler, Jesse Johnson, Paul Varghese

Making decisions about what clinical tasks to prepare for is multi-factored, and especially challenging in intensive care environments where resources must be balanced with patient needs.

Human-Computer Interaction

Racial Disparities and Mistrust in End-of-Life Care

1 code implementation 11 Aug 2018 Willie Boag, Harini Suresh, Leo Anthony Celi, Peter Szolovits, Marzyeh Ghassemi

There are established racial disparities in healthcare, including during end-of-life care, when poor communication and trust can lead to suboptimal outcomes for patients and their families.


Modeling Mistrust in End-of-Life Care

1 code implementation 30 Jun 2018 Willie Boag, Harini Suresh, Leo Anthony Celi, Peter Szolovits, Marzyeh Ghassemi

In this work, we characterize the doctor-patient relationship using a machine learning-derived trust score.

Sentiment Analysis

Continuous State-Space Models for Optimal Sepsis Treatment - a Deep Reinforcement Learning Approach

no code implementations 23 May 2017 Aniruddh Raghu, Matthieu Komorowski, Leo Anthony Celi, Peter Szolovits, Marzyeh Ghassemi

In this work, we propose a new approach to deduce optimal treatment policies for septic patients by using continuous state-space models and deep reinforcement learning.

Decision Making, reinforcement-learning

Clinical Intervention Prediction and Understanding using Deep Networks

no code implementations 23 May 2017 Harini Suresh, Nathan Hunt, Alistair Johnson, Leo Anthony Celi, Peter Szolovits, Marzyeh Ghassemi

Real-time prediction of clinical interventions remains a challenge within intensive care units (ICUs).

The Use of Autoencoders for Discovering Patient Phenotypes

no code implementations 20 Mar 2017 Harini Suresh, Peter Szolovits, Marzyeh Ghassemi

We use autoencoders to create low-dimensional embeddings of underlying patient phenotypes that we hypothesize are a governing factor in determining how different patients will react to different interventions.

Uncovering Voice Misuse Using Symbolic Mismatch

no code implementations 8 Aug 2016 Marzyeh Ghassemi, Zeeshan Syed, Daryush D. Mehta, Jarrad H. Van Stan, Robert E. Hillman, John V. Guttag

Voice disorders affect an estimated 14 million working-aged Americans, and many more worldwide.

