no code implementations • 22 Feb 2024 • Jean Feng, Harvineet Singh, Fan Xia, Adarsh Subbaswamy, Alexej Gossmann
Machine learning (ML) algorithms often differ in performance across domains.
1 code implementation • 7 Dec 2023 • Harvineet Singh, Fan Xia, Mi-Ok Kim, Romain Pirracchio, Rumi Chunara, Jean Feng
In fairness audits, a standard objective is to detect whether a given algorithm performs substantially differently between subgroups.
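A minimal sketch of what such a check could look like (illustrative only, not the paper's audit procedure): compare accuracy between two subgroups with a two-proportion z-test. The names `y_true`, `y_pred`, and `group` are hypothetical NumPy arrays.

```python
import numpy as np
from scipy.stats import norm

def subgroup_gap_test(y_true, y_pred, group):
    # Per-example correctness indicator
    correct = (y_true == y_pred).astype(float)
    acc_a = correct[group == 0].mean()
    acc_b = correct[group == 1].mean()
    n_a, n_b = (group == 0).sum(), (group == 1).sum()
    # Pooled standard error for a two-proportion z-test
    pooled = correct.mean()
    se = np.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (acc_a - acc_b) / se
    return acc_a - acc_b, 2 * norm.sf(abs(z))  # gap, two-sided p-value
```

A real audit must also handle multiple subgroups and multiple metrics, which is where the multiple-testing considerations the paper addresses come in.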
no code implementations • 20 Nov 2023 • Jean Feng, Adarsh Subbaswamy, Alexej Gossmann, Harvineet Singh, Berkman Sahiner, Mi-Ok Kim, Gene Pennello, Nicholas Petrick, Romain Pirracchio, Fan Xia
When an ML algorithm interacts with its environment, it can affect the data-generating mechanism and thereby bias evaluations of its own standalone performance, an issue known as performativity.
1 code implementation • 28 Jul 2023 • Jean Feng, Alexej Gossmann, Romain Pirracchio, Nicholas Petrick, Gene Pennello, Berkman Sahiner
In a well-calibrated risk prediction model, the average predicted probability is close to the true event rate for any given subgroup.
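As a direct illustration of this definition (the names `y_true`, `p_hat`, and `subgroup` are hypothetical arrays, and this is a generic calibration-in-the-large check rather than the paper's method), subgroup calibration reduces to a comparison of means:

```python
import numpy as np

def calibration_in_the_large(y_true, p_hat, subgroup):
    # For each subgroup, compare the mean predicted risk to the observed
    # event rate; in a well-calibrated model the two should be close
    for g in np.unique(subgroup):
        m = subgroup == g
        print(f"subgroup {g}: mean predicted = {p_hat[m].mean():.3f}, "
              f"event rate = {y_true[m].mean():.3f}")
```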
1 code implementation • 17 Nov 2022 • Jean Feng, Alexej Gossmann, Gene Pennello, Nicholas Petrick, Berkman Sahiner, Romain Pirracchio
Performance monitoring of machine learning (ML)-based risk prediction models in healthcare is complicated by the issue of confounding medical interventions (CMI): when an algorithm predicts a patient to be at high risk for an adverse event, clinicians are more likely to administer prophylactic treatment and alter the very target that the algorithm aims to predict.
no code implementations • 21 Mar 2022 • Jean Feng, Gene Pennello, Nicholas Petrick, Berkman Sahiner, Romain Pirracchio, Alexej Gossmann
Each modification introduces a risk of deteriorating performance and must be validated on a test dataset.
1 code implementation • 13 Oct 2021 • Jean Feng, Alexej Gossmann, Berkman Sahiner, Romain Pirracchio
In the COPD study, BLR and MarBLR dynamically combined the original model with a continually refitted gradient-boosted tree to achieve aAUCs of 0.924 (95% CI 0.913-0.935) and 0.925 (95% CI 0.914-0.935), compared to the static model's aAUC of 0.904 (95% CI 0.892-0.916).
no code implementations • 14 Dec 2020 • Jean Feng
Machine learning algorithms in healthcare have the potential to continually learn from real-world data generated during healthcare delivery and adapt to dataset shifts.
3 code implementations • ICML 2020 • Brian D. Williamson, Jean Feng
The true population-level importance of a variable in a prediction task provides useful knowledge about the underlying data-generating mechanism and can help in deciding which measurements to collect in subsequent experiments.
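One simple estimand in this family (a hedged sketch, not necessarily the estimator studied in the paper) is the drop in held-out R^2 when feature j is removed and the model refit; `X` and `y` are hypothetical data arrays, and the random forest is an arbitrary choice of learner.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

def r2_drop_importance(X, y, j):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
    # Fit once with all features, once with feature j removed
    full = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)
    reduced = RandomForestRegressor(random_state=0).fit(
        np.delete(X_tr, j, axis=1), y_tr)
    # Importance = loss in held-out R^2 from dropping feature j
    return (r2_score(y_te, full.predict(X_te))
            - r2_score(y_te, reduced.predict(np.delete(X_te, j, axis=1))))
```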
3 code implementations • 11 May 2020 • Jean Feng, Noah Simon
Neural networks have seen limited use in prediction for high-dimensional data with small sample sizes, because they tend to overfit and require tuning many more hyperparameters than existing off-the-shelf machine learning methods.
1 code implementation • 28 Dec 2019 • Jean Feng, Scott Emerson, Noah Simon
Successful deployment of machine learning algorithms in healthcare requires careful assessments of their performance and safety.
1 code implementation • 13 Jun 2019 • Jean Feng, Arjun Sondhi, Jessica Perry, Noah Simon
Though black-box predictors are state-of-the-art for many complex tasks, they often fail to properly quantify predictive uncertainty and may provide inappropriate predictions for unfamiliar data.
no code implementations • 28 Mar 2019 • Jean Feng, Noah Simon
We establish that the fitted models are Lipschitz in the penalty parameters and thus our oracle inequalities apply.
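Read as a stability property, the claim can be sketched in assumed notation (not taken from the paper): writing \(\hat{f}_{\lambda}\) for the model fitted at penalty vector \(\lambda\), Lipschitz continuity means there is a constant \(C\) with

\[
\| \hat{f}_{\lambda_1} - \hat{f}_{\lambda_2} \| \le C \, \| \lambda_1 - \lambda_2 \| \qquad \text{for all } \lambda_1, \lambda_2 \in \Lambda,
\]

so that, presumably, guarantees established on a finite grid of penalty values extend to nearby penalty values.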
1 code implementation • ICML 2018 • Jean Feng, Brian Williamson, Noah Simon, Marco Carone
In predictive modeling applications, it is often of interest to determine the relative contribution of subsets of features in explaining the variability of an outcome.
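One common way to formalize such a contribution (a hedged sketch; the paper's exact estimand may differ) is the ANOVA-style drop in explained variability when a subset \(s\) of features is withheld:

\[
\psi_s \;=\; \frac{\mathbb{E}\big[(Y - f_{-s}(X))^2\big] \;-\; \mathbb{E}\big[(Y - f(X))^2\big]}{\operatorname{Var}(Y)},
\]

where \(f\) is the conditional mean of \(Y\) given all features and \(f_{-s}\) is the conditional mean given all features outside \(s\).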
1 code implementation • 21 Nov 2017 • Jean Feng, Noah Simon
In addition, we characterize the statistical convergence of the penalized empirical risk minimizer to the optimal neural network: we show that the excess risk of this penalized estimator grows only with the logarithm of the number of input features, and that the weights of irrelevant features converge to zero.
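A minimal sketch of one penalty that induces this kind of input sparsity, a group lasso over the first-layer weights (illustrative; the architecture and hyperparameters are assumptions, not taken from the paper):

```python
import torch
import torch.nn as nn

# Hypothetical network on 100 input features
net = nn.Sequential(nn.Linear(100, 32), nn.ReLU(), nn.Linear(32, 1))

def input_group_penalty(first_layer, lam=1e-2):
    # first_layer.weight has shape (hidden_units, n_inputs); each column
    # holds the weights leaving one input feature, so summing column norms
    # is a group lasso that pushes whole features toward zero
    return lam * first_layer.weight.norm(dim=0).sum()

def penalized_loss(x, y):
    return nn.functional.mse_loss(net(x), y) + input_group_penalty(net[0])

x, y = torch.randn(64, 100), torch.randn(64, 1)
penalized_loss(x, y).backward()
```

Note that plain (sub)gradient descent on this objective only shrinks weights; obtaining exactly zero weights typically requires proximal updates.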
no code implementations • 28 Mar 2017 • Jean Feng, Noah Simon
It is more efficient to tune penalty parameters when a gradient with respect to them can be computed, but this is often difficult for problems with non-smooth penalty functions.
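To see why a gradient helps, here is a hedged sketch for the smooth (ridge) case, where autodiff can differentiate the validation loss through the closed-form fit; with a non-smooth penalty such as the lasso, the fitted coefficients are not everywhere differentiable in the penalty and this recipe breaks down. All names and the synthetic data are hypothetical.

```python
import torch

torch.manual_seed(0)
X = torch.randn(200, 10)
y = X @ torch.randn(10) + 0.1 * torch.randn(200)
X_tr, y_tr, X_val, y_val = X[:150], y[:150], X[150:], y[150:]

def val_loss(log_lam):
    lam = torch.exp(log_lam)  # optimize on the log scale to keep lam > 0
    p = X_tr.shape[1]
    # Closed-form ridge fit: beta = (X'X + lam * I)^{-1} X'y
    beta = torch.linalg.solve(X_tr.T @ X_tr + lam * torch.eye(p),
                              X_tr.T @ y_tr)
    return ((X_val @ beta - y_val) ** 2).mean()

log_lam = torch.zeros(1, requires_grad=True)
opt = torch.optim.Adam([log_lam], lr=0.05)
for _ in range(100):
    opt.zero_grad()
    val_loss(log_lam).backward()  # hypergradient w.r.t. log-lambda
    opt.step()
print("tuned lambda:", torch.exp(log_lam).item())
```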