no code implementations • 29 Sep 2024 • Yao Zhang, Emmanuel J. Candès
This article introduces a new method, posterior conformal prediction (PCP), which generates prediction intervals with both marginal and approximate conditional validity for clusters (or subgroups) naturally discovered in the data.
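PCP's construction is more involved than standard split conformal, but a simpler point of reference is group-conditional split conformal prediction, which computes a separate calibration quantile per pre-specified group. The sketch below is background context under illustrative assumptions (hypothetical cluster labels, every test group present in calibration), not the paper's algorithm.

```python
import numpy as np

def group_conditional_intervals(resid_cal, groups_cal, groups_test, preds_test, alpha=0.1):
    """Split-conformal intervals with a separate quantile per group.

    resid_cal: |y - f(x)| on a held-out calibration set
    groups_cal / groups_test: integer group labels (e.g., cluster ids);
        assumes every test group also appears in the calibration set
    preds_test: point predictions f(x) on the test set
    """
    lower, upper = np.empty(len(preds_test)), np.empty(len(preds_test))
    for g in np.unique(groups_cal):
        r = np.sort(resid_cal[groups_cal == g])
        n = len(r)
        # finite-sample-adjusted quantile index for coverage within group g
        k = min(int(np.ceil((n + 1) * (1 - alpha))), n)
        q = r[k - 1]
        mask = groups_test == g
        lower[mask], upper[mask] = preds_test[mask] - q, preds_test[mask] + q
    return lower, upper
```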
1 code implementation • 15 Sep 2024 • Parth T. Nobel, Daniel LeJeune, Emmanuel J. Candès
Estimating out-of-sample risk for models trained on large high-dimensional datasets is an expensive but essential part of the machine learning process, enabling practitioners to optimally tune hyperparameters.
1 code implementation • 27 Aug 2024 • Kristina Gligorić, Tijana Zrnic, Cinoo Lee, Emmanuel J. Candès, Dan Jurafsky
We introduce Confidence-Driven Inference, a method that combines LLM annotations with LLM confidence indicators to strategically select which human annotations to collect. The goal is to produce accurate statistical estimates and provably valid confidence intervals while reducing the number of human annotations needed.
2 code implementations • 14 Jun 2024 • John J. Cherian, Isaac Gibbs, Emmanuel J. Candès
These methods work by filtering claims from the LLM's original response if a scoring function evaluated on the claim fails to exceed a threshold calibrated via split conformal prediction.
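A minimal sketch of that calibration step, under a simplified reading of the procedure: for each calibration response, record the smallest threshold that would remove all of its false claims, then take a conformal quantile of those thresholds. The data layout and scoring function are assumptions for illustration.

```python
import numpy as np

def calibrate_threshold(cal_claims, alpha=0.1):
    """cal_claims: list over responses; each is a list of (score, is_true) pairs.

    For each calibration response, the smallest threshold that filters out
    every false claim is the largest score attached to a false claim.
    """
    r = []
    for claims in cal_claims:
        false_scores = [s for s, ok in claims if not ok]
        r.append(max(false_scores) if false_scores else -np.inf)
    r = np.sort(np.asarray(r))
    n = len(r)
    k = min(int(np.ceil((n + 1) * (1 - alpha))), n)  # split-conformal quantile index
    return r[k - 1]

def filter_claims(claims_with_scores, tau):
    """claims_with_scores: list of (claim, score) pairs for a new response."""
    # keep only claims whose score strictly exceeds the calibrated threshold
    return [c for c, s in claims_with_scores if s > tau]
```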
1 code implementation • 11 Jun 2024 • Ran Xie, Rina Foygel Barber, Emmanuel J. Candès
This paper introduces a boosted conformal procedure designed to tailor conformalized prediction intervals toward specific desired properties, such as enhanced conditional coverage or reduced interval length.
1 code implementation • 5 Mar 2024 • Tijana Zrnic, Emmanuel J. Candès
For the same number of collected samples, active inference enables smaller confidence intervals and more powerful p-values.
2 code implementations • 28 Sep 2023 • Tijana Zrnic, Emmanuel J. Candès
We show that cross-prediction is consistently more powerful than an adaptation of prediction-powered inference in which a fraction of the labeled data is split off and used to train the model.
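For context, the prediction-powered point estimate and interval for a mean fit in a few lines; cross-prediction replaces the single train/inference split with cross-fitting. A minimal PPI sketch with a normal-approximation interval (our simplification):

```python
import numpy as np
from scipy import stats

def ppi_mean_ci(y_lab, yhat_lab, yhat_unlab, alpha=0.05):
    """Prediction-powered CI for E[Y]: mean of predictions on unlabeled data,
    debiased by the average prediction error on the labeled data."""
    n, N = len(y_lab), len(yhat_unlab)
    rectifier = y_lab - yhat_lab                    # prediction errors
    est = yhat_unlab.mean() + rectifier.mean()      # debiased point estimate
    se = np.sqrt(yhat_unlab.var(ddof=1) / N + rectifier.var(ddof=1) / n)
    z = stats.norm.ppf(1 - alpha / 2)
    return est - z * se, est + z * se
```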
1 code implementation • 5 May 2023 • John J. Cherian, Emmanuel J. Candès
Our methods can be used to flag subpopulations affected by model underperformance, and certify subpopulations for which the model performs adequately.
2 code implementations • 4 Oct 2022 • Ying Jin, Emmanuel J. Candès
Decision-making and scientific discovery pipelines, such as job hiring and drug discovery, often involve multiple stages: before any resource-intensive step, an initial screening uses predictions from a machine learning model to shortlist a few candidates from a large pool.
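A generic version of this screening idea can be sketched with conformal p-values fed into Benjamini-Hochberg; the paper's construction is tailored to one-sided selection with exchangeability-based guarantees, so treat the following as illustrative only.

```python
import numpy as np

def conformal_pvalues(scores_cal, scores_test):
    """Generic conformal p-value: p = (1 + #{calibration scores >= test score}) / (n + 1)."""
    n = len(scores_cal)
    return np.array([(1 + np.sum(scores_cal >= s)) / (n + 1) for s in scores_test])

def benjamini_hochberg(pvals, q=0.1):
    """Return the indices selected at FDR level q."""
    m = len(pvals)
    order = np.argsort(pvals)
    passed = np.nonzero(pvals[order] <= q * (np.arange(1, m + 1) / m))[0]
    if len(passed) == 0:
        return np.array([], dtype=int)
    return order[: passed[-1] + 1]   # reject the smallest k p-values
```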
1 code implementation • 3 Oct 2021 • Anastasios N. Angelopoulos, Stephen Bates, Emmanuel J. Candès, Michael I. Jordan, Lihua Lei
We introduce a framework for calibrating machine learning models so that their predictions satisfy explicit, finite-sample statistical guarantees.
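One concrete instance of the recipe: scan a grid of thresholds in a fixed order and certify each with a concentration-based p-value. The sketch below uses Hoeffding p-values and fixed-sequence testing on losses bounded in [0, 1]; the grid ordering, loss interface, and levels are our assumptions.

```python
import numpy as np

def calibrate_lambda(loss_fn, lambdas, cal_data, alpha=0.1, delta=0.1):
    """Fixed-sequence testing over a lambda grid, most conservative first.

    Assumes the grid is ordered so risk increases along it, and that
    loss_fn(x, y, lam) is in [0, 1]. Returns the least conservative lambda
    certified to have risk <= alpha, or None if nothing is certified.
    """
    certified = None
    for lam in lambdas:
        losses = np.array([loss_fn(x, y, lam) for x, y in cal_data])
        n, rhat = len(losses), losses.mean()
        # Hoeffding p-value for H0: risk(lam) > alpha
        pval = np.exp(-2 * n * max(alpha - rhat, 0.0) ** 2)
        if pval > delta:
            break            # first failure ends the fixed sequence
        certified = lam
    return certified
```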
2 code implementations • 17 Mar 2021 • Emmanuel J. Candès, Lihua Lei, Zhimei Ren
Existing survival analysis techniques heavily rely on strong modelling assumptions and are, therefore, prone to model misspecification errors.
2 code implementations • 11 Jun 2020 • Lihua Lei, Emmanuel J. Candès
At the moment, much emphasis is placed on the estimation of the conditional average treatment effect via flexible machine learning algorithms.
1 code implementation • 8 Jun 2020 • Charmaine Chia, Matteo Sesia, Chi-Sing Ho, Stefanie S. Jeffrey, Jennifer Dionne, Emmanuel J. Candès, Roger T. Howe
Deep neural networks and other sophisticated machine learning models are widely applied to biomedical signal data because they can detect complex patterns and compute accurate predictions.
1 code implementation • NeurIPS 2020 • Yaniv Romano, Stephen Bates, Emmanuel J. Candès
We present a flexible framework for learning predictive models that approximately satisfy the equalized odds notion of fairness.
2 code implementations • NeurIPS 2020 • Yaniv Romano, Matteo Sesia, Emmanuel J. Candès
Conformal inference, cross-validation+, and the jackknife+ are hold-out methods that can be combined with virtually any machine learning algorithm to construct prediction sets with guaranteed marginal coverage.
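The jackknife+ is compact enough to state in full. This sketch refits the model n times, so it is the exact, expensive variant rather than CV+; model and data interfaces follow scikit-learn conventions.

```python
import numpy as np
from sklearn.base import clone

def jackknife_plus_interval(model, X, y, x_test, alpha=0.1):
    """Jackknife+ prediction interval for a single test point (Barber et al., 2021)."""
    n = len(y)
    lo, hi = np.empty(n), np.empty(n)
    for i in range(n):
        m = clone(model).fit(np.delete(X, i, axis=0), np.delete(y, i))
        resid = abs(y[i] - m.predict(X[i:i + 1])[0])   # leave-one-out residual
        pred = m.predict(x_test.reshape(1, -1))[0]
        lo[i], hi[i] = pred - resid, pred + resid
    k = int(np.ceil((1 - alpha) * (n + 1)))
    return np.sort(lo)[n - k], np.sort(hi)[k - 1]
```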
1 code implementation • 12 Sep 2019 • Matteo Sesia, Emmanuel J. Candès
We compare two recently proposed methods that combine ideas from conformal inference and quantile regression to produce locally adaptive and marginally valid prediction intervals under sample exchangeability (Romano et al., 2019; Kivaranovic et al., 2019).
1 code implementation • 15 Aug 2019 • Yaniv Romano, Rina Foygel Barber, Chiara Sabatti, Emmanuel J. Candès
An important factor in guaranteeing the fair use of data-driven recommendation systems is the ability to communicate their uncertainty to decision makers.
5 code implementations • NeurIPS 2019 • Yaniv Romano, Evan Patterson, Emmanuel J. Candès
Conformal prediction is a technique for constructing prediction intervals that attain valid coverage in finite samples, without making distributional assumptions.
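The resulting method, conformalized quantile regression (CQR), fits lower and upper quantile regressors on a proper training set, then widens the band by a conformal quantile of the calibration scores. A sketch using gradient-boosted quantile regression (the choice of learner is ours, not the paper's):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def cqr(X_tr, y_tr, X_cal, y_cal, X_test, alpha=0.1):
    lo = GradientBoostingRegressor(loss="quantile", alpha=alpha / 2).fit(X_tr, y_tr)
    hi = GradientBoostingRegressor(loss="quantile", alpha=1 - alpha / 2).fit(X_tr, y_tr)
    # conformity score: how far calibration points fall outside the quantile band
    scores = np.maximum(lo.predict(X_cal) - y_cal, y_cal - hi.predict(X_cal))
    n = len(scores)
    k = min(int(np.ceil((n + 1) * (1 - alpha))), n)
    q = np.sort(scores)[k - 1]
    return lo.predict(X_test) - q, hi.predict(X_test) + q
```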
no code implementations • 12 Mar 2019 • Rina Foygel Barber, Emmanuel J. Candès, Aaditya Ramdas, Ryan J. Tibshirani
We consider the problem of distribution-free predictive inference, with the goal of producing predictive coverage guarantees that hold conditionally rather than marginally.
Statistics Theory
no code implementations • 7 Feb 2019 • David A. Barmherzig, Ju Sun, Emmanuel J. Candès, T. J. Lane, Po-Nan Li
A new reference design is introduced for holographic coherent diffraction imaging.
4 code implementations • 16 Nov 2018 • Yaniv Romano, Matteo Sesia, Emmanuel J. Candès
This paper introduces a machine for sampling approximate model-X knockoffs for arbitrary and unspecified data distributions using deep generative models.
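For reference, the exact Gaussian construction that deep generative knockoffs generalize can be sampled in closed form. A sketch of equicorrelated model-X knockoffs for rows X_i ~ N(0, Sigma), assuming Sigma is a correlation matrix:

```python
import numpy as np

def gaussian_knockoffs(X, Sigma, rng=None):
    """Sample exact model-X knockoffs for rows X_i ~ N(0, Sigma), with the
    equicorrelated choice of s (Candes et al., 2018); Sigma: correlation matrix."""
    rng = np.random.default_rng(rng)
    p = Sigma.shape[0]
    lam_min = np.linalg.eigvalsh(Sigma).min()
    s = np.full(p, min(2 * lam_min, 1.0)) * 0.999   # shrink slightly to keep PSD
    Sinv_D = np.linalg.solve(Sigma, np.diag(s))     # Sigma^{-1} diag(s)
    mean = X - X @ Sinv_D                           # E[X_knockoff | X]
    V = 2 * np.diag(s) - np.diag(s) @ Sinv_D        # Cov[X_knockoff | X]
    L = np.linalg.cholesky(V + 1e-10 * np.eye(p))
    return mean + rng.standard_normal(X.shape) @ L.T
```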
no code implementations • 5 Jun 2017 • Pragya Sur, Yuxin Chen, Emmanuel J. Candès
When used for statistical inference, logistic models produce p-values for the regression coefficients based on an approximation to the distribution of the likelihood-ratio test statistic.
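A quick way to see the phenomenon the paper studies: simulate a global null with p/n = 0.2 and compare the likelihood-ratio statistic against its classical chi-square(1) calibration. Under the paper's theory the statistic is stochastically inflated in this regime, so the printed rejection rate should noticeably exceed the nominal 5%. This simulation setup is our illustration, not the paper's experiment.

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)
n, p, reps = 1000, 200, 100        # p/n = 0.2: classical asymptotics strained
lrt = []
for _ in range(reps):
    X = rng.standard_normal((n, p)) / np.sqrt(n)
    y = rng.integers(0, 2, n)       # global null: y independent of X
    full = sm.Logit(y, X).fit(disp=0)
    reduced = sm.Logit(y, X[:, 1:]).fit(disp=0)   # drop the tested coordinate
    lrt.append(2 * (full.llf - reduced.llf))
# Under the classical chi-square(1) approximation this should be near 0.05:
print(np.mean(np.array(lrt) > stats.chi2.ppf(0.95, df=1)))
```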
no code implementations • 14 Jul 2014 • Małgorzata Bogdan, Ewout van den Berg, Chiara Sabatti, Weijie Su, Emmanuel J. Candès
SLOPE, short for Sorted L-One Penalized Estimation, is the solution to \[\min_{b\in\mathbb{R}^p}\frac{1}{2}\Vert y-Xb\Vert _{\ell_2}^2+\lambda_1\vert b\vert _{(1)}+\lambda_2\vert b\vert_{(2)}+\cdots+\lambda_p\vert b\vert_{(p)},\] where $\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_p\ge0$ and $\vert b\vert_{(1)}\ge\vert b\vert_{(2)}\ge\cdots\ge\vert b\vert_{(p)}$ are the decreasing absolute values of the entries of $b$.
Methodology
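The objective can be minimized by proximal gradient descent; the only nonstandard ingredient is the prox of the sorted-L1 penalty, computable by a stack-based pool-adjacent-violators pass. A sketch (step size and iteration count are illustrative):

```python
import numpy as np

def prox_sorted_l1(v, lam):
    """Prox of the sorted-L1 penalty; lam must be nonincreasing and nonnegative."""
    sign = np.sign(v)
    u = np.abs(v)
    order = np.argsort(-u)
    z = u[order] - lam
    blocks = []                       # each block: [start, end, total]
    for k, zk in enumerate(z):        # enforce a nonincreasing solution (PAVA)
        blocks.append([k, k, zk])
        while len(blocks) > 1 and blocks[-2][2] / (blocks[-2][1] - blocks[-2][0] + 1) \
                <= blocks[-1][2] / (blocks[-1][1] - blocks[-1][0] + 1):
            b = blocks.pop()
            blocks[-1][1] = b[1]
            blocks[-1][2] += b[2]
    x_sorted = np.empty_like(z)
    for s, e, tot in blocks:
        x_sorted[s:e + 1] = max(tot / (e - s + 1), 0.0)   # clip at zero
    x = np.empty_like(x_sorted)
    x[order] = x_sorted               # undo the sort, then restore signs
    return sign * x

def slope(X, y, lam, n_iter=500):
    """Proximal gradient descent on the SLOPE objective."""
    b = np.zeros(X.shape[1])
    L = np.linalg.norm(X, 2) ** 2     # Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y)
        b = prox_sorted_l1(b - grad / L, lam / L)
    return b
```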
no code implementations • 11 Jan 2013 • Mahdi Soltanolkotabi, Ehsan Elhamifar, Emmanuel J. Candès
Subspace clustering refers to the task of finding a multi-subspace representation that best fits a collection of points taken from a high-dimensional space.
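The sparse subspace clustering approach analyzed in this line of work regresses each point on all the others with an l1 penalty and then clusters the resulting affinity graph spectrally. A sketch with scikit-learn; the penalty level is an illustrative assumption:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

def sparse_subspace_clustering(X, n_clusters, lam=0.01):
    """X: (n_points, dim). Each point is regressed on the other points (lasso),
    and the sparse coefficients define an affinity graph for spectral clustering."""
    n = X.shape[0]
    C = np.zeros((n, n))
    for i in range(n):
        others = np.delete(np.arange(n), i)
        # design matrix: columns are the other points; target: point i
        coef = Lasso(alpha=lam, max_iter=5000).fit(X[others].T, X[i]).coef_
        C[i, others] = coef
    W = np.abs(C) + np.abs(C).T       # symmetrized affinity
    return SpectralClustering(n_clusters=n_clusters,
                              affinity="precomputed").fit_predict(W)
```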