Search Results for author: Tamara Broderick

Found 51 papers, 23 papers with code

Consistent Validation for Predictive Methods in Spatial Settings

1 code implementation 5 Feb 2024 David R. Burt, Yunyi Shen, Tamara Broderick

Unfortunately, classical approaches for validation fail to handle mismatch between locations available for validation and (test) locations where we want to make predictions.

Weather Forecasting

Could dropping a few cells change the takeaways from differential expression?

no code implementations 11 Dec 2023 Miriam Shiffman, Ryan Giordano, Tamara Broderick

We then overcome the inherent non-differentiability of gene set enrichment analysis to develop an additional approximation for the robustness of top gene sets.

Black Box Variational Inference with a Deterministic Objective: Faster, More Accurate, and Even More Black Box

1 code implementation 11 Apr 2023 Ryan Giordano, Martin Ingram, Tamara Broderick

We show on a variety of real-world problems that DADVI reliably finds good solutions with default settings (unlike ADVI) and, together with LR covariances, is typically faster and more accurate than standard ADVI.

Probabilistic Programming Stochastic Optimization +1

Gaussian processes at the Helm(holtz): A more fluid model for ocean currents

1 code implementation 20 Feb 2023 Renato Berlinghieri, Brian L. Trippe, David R. Burt, Ryan Giordano, Kaushik Srinivasan, Tamay Özgökmen, Junfei Xia, Tamara Broderick

Given sparse observations of buoy velocities, oceanographers are interested in reconstructing ocean currents away from the buoys and identifying divergences in a current vector field.

Gaussian Processes

Are you using test log-likelihood correctly?

no code implementations 1 Dec 2022 Sameer K. Deshpande, Soumya Ghosh, Tin D. Nguyen, Tamara Broderick

Test log-likelihood is commonly used to compare different models of the same data or different approximate inference algorithms for fitting the same probabilistic model.

Bayesian Inference Test
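
To make the metric concrete, here is a minimal sketch (hypothetical data and models, not from the paper) that scores two Gaussian fits by average held-out log-likelihood; the paper's caution is that this number need not track other goals, such as accuracy of posterior means.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

# Hypothetical setup: heavy-tailed data, scored under two Gaussian fits
# that differ only in their scale estimate.
train = rng.standard_t(df=3, size=500)
test = rng.standard_t(df=3, size=500)

mu, sigma = train.mean(), train.std()
models = {"A (MLE scale)": sigma, "B (inflated scale)": 1.5 * sigma}

for name, s in models.items():
    # Average test log-likelihood: higher is "better" under this metric.
    tll = norm.logpdf(test, loc=mu, scale=s).mean()
    print(f"model {name}: avg test log-likelihood = {tll:.3f}")
```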

Diffusion probabilistic modeling of protein backbones in 3D for the motif-scaffolding problem

1 code implementation 8 Jun 2022 Brian L. Trippe, Jason Yim, Doug Tischer, David Baker, Tamara Broderick, Regina Barzilay, Tommi Jaakkola

Construction of a scaffold structure that supports a desired motif, conferring protein function, shows promise for the design of vaccines and enzymes.

Many processors, little time: MCMC for partitions via optimal transport couplings

1 code implementation 23 Feb 2022 Tin D. Nguyen, Brian L. Trippe, Tamara Broderick

In MCMC samplers of continuous random variables, Markov chain couplings can overcome bias.

Clustering

Toward a Taxonomy of Trust for Probabilistic Machine Learning

no code implementations 5 Dec 2021 Tamara Broderick, Andrew Gelman, Rachael Meager, Anna L. Smith, Tian Zheng

Probabilistic machine learning increasingly informs critical decisions in medicine, economics, politics, and beyond.

BIG-bench Machine Learning Translation

Can we globally optimize cross-validation loss? Quasiconvexity in ridge regression

no code implementations NeurIPS 2021 William T. Stephenson, Zachary Frangella, Madeleine Udell, Tamara Broderick

In the present paper, we show that, in the case of ridge regression, the CV loss may fail to be quasiconvex and thus may have multiple local optima.

regression

For high-dimensional hierarchical models, consider exchangeability of effects across covariates instead of across datasets

no code implementations NeurIPS 2021 Brian L. Trippe, Hilary K. Finucane, Tamara Broderick

While standard practice is to model regression parameters (effects) as (1) exchangeable across datasets and (2) correlated to differing degrees across covariates, we show that this approach exhibits poor statistical performance when the number of covariates exceeds the number of datasets.

regression

Evaluating Sensitivity to the Stick-Breaking Prior in Bayesian Nonparametrics

no code implementations 8 Jul 2021 Ryan Giordano, Runjing Liu, Michael I. Jordan, Tamara Broderick

Bayesian models based on the Dirichlet process and other stick-breaking priors have been proposed as core ingredients for clustering, topic modeling, and other unsupervised learning tasks.

The SKIM-FA Kernel: High-Dimensional Variable Selection and Nonlinear Interaction Discovery in Linear Time

1 code implementation 23 Jun 2021 Raj Agrawal, Tamara Broderick

Often, these effects are nonlinear and include interactions, so linear and additive methods can lead to poor estimation and variable selection.

Gaussian Processes Variable Selection

Measuring the robustness of Gaussian processes to kernel choice

no code implementations 11 Jun 2021 William T. Stephenson, Soumya Ghosh, Tin D. Nguyen, Mikhail Yurochkin, Sameer K. Deshpande, Tamara Broderick

We demonstrate in both synthetic and real-world examples that decisions made with a GP can exhibit non-robustness to kernel choice, even when prior draws are qualitatively interchangeable to a user.

Gaussian Processes

Independent versus truncated finite approximations for Bayesian nonparametric inference

no code implementations NeurIPS Workshop ICBINB 2020 Tin D. Nguyen, Jonathan H. Huggins, Lorenzo Masoero, Lester Mackey, Tamara Broderick

Bayesian nonparametric models based on completely random measures (CRMs) offer flexibility when the number of clusters or latent components in a data set is unknown.

Image Denoising

Power posteriors do not reliably learn the number of components in a finite mixture

no code implementations NeurIPS Workshop ICBINB 2020 Diana Cai, Trevor Campbell, Tamara Broderick

Increasingly, though, data science papers suggest potential alternatives beyond vanilla FMMs, such as power posteriors, coarsening, and related methods.

Approximate Cross-Validation with Low-Rank Data in High Dimensions

no code implementations NeurIPS 2020 William T. Stephenson, Madeleine Udell, Tamara Broderick

Our second key insight is that, in the presence of ALR data, error in existing ACV methods roughly grows with the (approximate, low) rank rather than with the (full, high) dimension.

Vocal Bursts Intensity Prediction

Finite mixture models do not reliably learn the number of components

no code implementations 8 Jul 2020 Diana Cai, Trevor Campbell, Tamara Broderick

In this paper, we add rigor to data-analysis folk wisdom by proving that under even the slightest model misspecification, the FMM component-count posterior diverges: the posterior probability of any particular finite number of components converges to 0 in the limit of infinite data.

Approximate Cross-Validation for Structured Models

1 code implementation NeurIPS 2020 Soumya Ghosh, William T. Stephenson, Tin D. Nguyen, Sameer K. Deshpande, Tamara Broderick

But this existing ACV work is restricted to simpler models by the assumptions that (i) data across CV folds are independent and (ii) an exact initial model fit is available.

Sentence

Genomic variety prediction via Bayesian nonparametrics

no code implementations AABI Symposium 2019 Lorenzo Masoero, Federico Camerlenghi, Stefano Favaro, Tamara Broderick

We consider the case where scientists have already conducted a pilot study to reveal some variants in a genome and are contemplating a follow-up study.

Experimental Design

Validated Variational Inference via Practical Posterior Error Bounds

1 code implementation 9 Oct 2019 Jonathan H. Huggins, Mikołaj Kasprzak, Trevor Campbell, Tamara Broderick

Finally, we demonstrate the utility of our proposed workflow and error bounds on a robust regression problem and on a real-data example with a widely used multilevel hierarchical model.

Bayesian Inference Variational Inference

A Higher-Order Swiss Army Infinitesimal Jackknife

1 code implementation 28 Jul 2019 Ryan Giordano, Michael I. Jordan, Tamara Broderick

The first-order approximation is known as the "infinitesimal jackknife" in the statistics literature and has been the subject of recent interest in machine learning for approximate CV.

BIG-bench Machine Learning

Approximate Cross-Validation in High Dimensions with Guarantees

1 code implementation 31 May 2019 William T. Stephenson, Tamara Broderick

Crucially, though, we are able to show, both empirically and theoretically, that one approximation can perform well in high dimensions -- in cases where the high-dimensional parameter exhibits sparsity.

Vocal Bursts Intensity Prediction

LR-GLM: High-Dimensional Bayesian Inference Using Low-Rank Data Approximations

no code implementations 17 May 2019 Brian L. Trippe, Jonathan H. Huggins, Raj Agrawal, Tamara Broderick

Due to the ease of modern data collection, applied statisticians often have access to a large set of covariates that they wish to relate to some observed outcome.

Bayesian Inference Vocal Bursts Intensity Prediction

The Kernel Interaction Trick: Fast Bayesian Discovery of Pairwise Interactions in High Dimensions

1 code implementation 16 May 2019 Raj Agrawal, Jonathan H. Huggins, Brian Trippe, Tamara Broderick

Discovering interaction effects on a response of interest is a fundamental problem faced in biology, medicine, economics, and many other scientific disciplines.

Uncertainty Quantification

Reconstructing probabilistic trees of cellular differentiation from single-cell RNA-seq data

no code implementations 28 Nov 2018 Miriam Shiffman, William T. Stephenson, Geoffrey Schiebinger, Jonathan Huggins, Trevor Campbell, Aviv Regev, Tamara Broderick

Specifically, we extend the framework of the classical Dirichlet diffusion tree to simultaneously infer branch topology and latent cell states along continuous trajectories over the full tree.

Evaluating Sensitivity to the Stick-Breaking Prior in Bayesian Nonparametrics

4 code implementations 15 Oct 2018 Runjing Liu, Ryan Giordano, Michael I. Jordan, Tamara Broderick

Bayesian models based on the Dirichlet process and other stick-breaking priors have been proposed as core ingredients for clustering, topic modeling, and other unsupervised learning tasks.

Methodology

Data-dependent compression of random features for large-scale kernel approximation

no code implementations 9 Oct 2018 Raj Agrawal, Trevor Campbell, Jonathan H. Huggins, Tamara Broderick

Random feature maps (RFMs) and the Nystrom method both consider low-rank approximations to the kernel matrix as a potential solution.

feature selection Test
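
For context on the baseline being compressed, here is a minimal random Fourier features sketch (Rahimi-Recht style, not the paper's data-dependent compression) that builds a low-rank approximation to an RBF kernel matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random Fourier features for the RBF kernel k(x, y) = exp(-||x - y||^2 / 2):
# k(x, y) ≈ z(x) . z(y) with cosine features at Gaussian random frequencies.
def rff(X, num_features, rng):
    d = X.shape[1]
    W = rng.normal(size=(d, num_features))        # frequencies ~ N(0, I)
    b = rng.uniform(0, 2 * np.pi, num_features)   # random phases
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

X = rng.normal(size=(50, 3))
Z = rff(X, num_features=2000, rng=rng)

# Compare the exact kernel matrix with the low-rank approximation Z Z^T.
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq)
err = np.abs(Z @ Z.T - K).max()
print(f"max abs entrywise error with 2000 features: {err:.3f}")
```

The paper's point of departure is that such features are drawn obliviously to the data; its method compresses them in a data-dependent way.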

Scalable Gaussian Process Inference with Finite-data Mean and Variance Guarantees

no code implementations 26 Jun 2018 Jonathan H. Huggins, Trevor Campbell, Mikołaj Kasprzak, Tamara Broderick

We develop an approach to scalable approximate GP regression with finite-data guarantees on the accuracy of pointwise posterior mean and variance estimates.

Gaussian Processes regression +1
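
As a reminder of what is being approximated, here is a textbook exact GP regression sketch on hypothetical 1-D data (the O(n^3) baseline; the paper's contribution is scalable approximation of these mean and variance estimates with guarantees, not this computation):

```python
import numpy as np

rng = np.random.default_rng(1)

# Exact GP regression with an RBF kernel and Gaussian observation noise.
def rbf(A, B, lengthscale=1.0):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq / lengthscale**2)

X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=40)
Xs = np.linspace(-3, 3, 100)[:, None]           # test locations

noise = 0.1**2
K = rbf(X, X) + noise * np.eye(len(X))
Ks = rbf(X, Xs)
L = np.linalg.cholesky(K)                       # stable solves via Cholesky
alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
mean = Ks.T @ alpha                             # posterior mean at Xs
v = np.linalg.solve(L, Ks)
var = rbf(Xs, Xs).diagonal() - (v**2).sum(0)    # pointwise posterior variance
print(f"posterior mean shape {mean.shape}, min variance {var.min():.4f}")
```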

A Swiss Army Infinitesimal Jackknife

3 code implementations 1 Jun 2018 Ryan Giordano, Will Stephenson, Runjing Liu, Michael I. Jordan, Tamara Broderick

This linear approximation is sometimes known as the "infinitesimal jackknife" in the statistics literature, where it is mostly used as a theoretical tool to prove asymptotic results.

Methodology

Minimal I-MAP MCMC for Scalable Structure Discovery in Causal DAG Models

1 code implementation ICML 2018 Raj Agrawal, Tamara Broderick, Caroline Uhler

Learning a Bayesian network (BN) from data can be useful for decision-making or discovering causal relationships.

Decision Making

Automated Scalable Bayesian Inference via Hilbert Coresets

2 code implementations 13 Oct 2017 Trevor Campbell, Tamara Broderick

We begin with an intuitive reformulation of Bayesian coreset construction as sparse vector sum approximation, and demonstrate that its automation and performance-based shortcomings arise from the use of the supremum norm.

Bayesian Inference
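
To make the "sparse vector sum" framing concrete, here is a toy sketch using uniform subsampling as a stand-in coreset (the paper's Hilbert-coreset constructions choose points and weights far more carefully precisely to control this approximation error):

```python
import numpy as np

rng = np.random.default_rng(0)

# The full-data log-likelihood is a sum of N per-datum terms; a coreset
# replaces it with a small weighted subset. Here: m uniform samples,
# each weighted by N/m, evaluated on a grid of parameter values.
N, m = 20000, 200
x = rng.normal(loc=1.0, scale=1.0, size=N)
thetas = np.linspace(0.0, 2.0, 21)

def log_lik(theta, data):
    # Gaussian log-likelihood terms, additive constants dropped.
    return -0.5 * (data[:, None] - theta) ** 2

ll_full = log_lik(thetas, x).sum(axis=0)
idx = rng.choice(N, size=m, replace=False)
ll_core = (N / m) * log_lik(thetas, x[idx]).sum(axis=0)

rel_err = np.abs(ll_core - ll_full).max() / np.abs(ll_full).max()
print(f"relative error of {m}-point uniform coreset: {rel_err:.4f}")
```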

PASS-GLM: polynomial approximate sufficient statistics for scalable Bayesian GLM inference

1 code implementation NeurIPS 2017 Jonathan H. Huggins, Ryan P. Adams, Tamara Broderick

We provide theoretical guarantees on the quality of point (MAP) estimates, the approximate posterior, and posterior mean and uncertainty estimates.

regression

Covariances, Robustness, and Variational Bayes

4 code implementations 8 Sep 2017 Ryan Giordano, Tamara Broderick, Michael I. Jordan

The estimates for MFVB posterior covariances rely on a result from the classical Bayesian robustness literature relating derivatives of posterior expectations to posterior covariances and include the Laplace approximation as a special case.

Methodology

Edge-exchangeable graphs and sparsity (NIPS 2016)

no code implementations 16 Dec 2016 Diana Cai, Trevor Campbell, Tamara Broderick

Many popular network models rely on the assumption of (vertex) exchangeability, in which the distribution of the graph is invariant to relabelings of the vertices.

Boosting Variational Inference

no code implementations 17 Nov 2016 Fangjian Guo, Xiangyu Wang, Kai Fan, Tamara Broderick, David B. Dunson

Variational inference (VI) provides fast approximations of a Bayesian posterior in part because it formulates posterior approximation as an optimization problem: to find the closest distribution to the exact posterior over some family of distributions.

Variational Inference
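
A toy illustration of that optimization view (a hand-rolled reparameterization-gradient fit of a 1-D Gaussian to a known target, not the paper's boosting procedure):

```python
import numpy as np

rng = np.random.default_rng(0)

# Fit q = N(m, s^2) to an unnormalized target log p(x) ∝ -(x - 2)^2 / (2 * 0.5^2)
# by stochastic gradient ascent on the ELBO = E_q[log p(x)] + entropy(q).
def grad_log_p(x):
    return -(x - 2.0) / 0.25

m, log_s = 0.0, 0.0
lr = 0.05
for step in range(2000):
    eps = rng.normal(size=32)
    s = np.exp(log_s)
    x = m + s * eps                         # reparameterized samples from q
    g = grad_log_p(x)
    m += lr * g.mean()                      # dELBO/dm = E[grad log p]
    log_s += lr * ((g * eps * s).mean() + 1.0)  # entropy grad wrt log_s is 1

print(f"fitted mean {m:.2f}, sd {np.exp(log_s):.2f} (target: 2.00, 0.50)")
```

Because the target is itself Gaussian, the best member of the family recovers it exactly; boosting VI addresses the case where the family is too restrictive.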

Coresets for Scalable Bayesian Logistic Regression

2 code implementations NeurIPS 2016 Jonathan H. Huggins, Trevor Campbell, Tamara Broderick

We demonstrate the efficacy of our approach on a number of synthetic and real-world datasets, and find that, in practice, the size of the coreset is independent of the original dataset size.

Bayesian Inference regression +1

Completely random measures for modeling power laws in sparse graphs

no code implementations 22 Mar 2016 Diana Cai, Tamara Broderick

Since individual network datasets continue to grow in size, it is necessary to develop models that accurately represent the real-life scaling properties of networks.

Clustering

Edge-exchangeable graphs and sparsity

no code implementations NeurIPS 2016 Tamara Broderick, Diana Cai

We show that, unlike node exchangeability, edge exchangeability encompasses models that are known to provide a projective sequence of random graphs that circumvent the Aldous-Hoover Theorem and exhibit sparsity, i.e., sub-quadratic growth of the number of edges with the number of nodes.

Clustering

Covariance Matrices and Influence Scores for Mean Field Variational Bayes

no code implementations 26 Feb 2015 Ryan Giordano, Tamara Broderick

We develop a fast, general methodology for exponential families that augments MFVB to deliver accurate uncertainty estimates for model variables -- both for individual variables and coherently across variables.

Covariance Matrices for Mean Field Variational Bayes

no code implementations 24 Oct 2014 Ryan Giordano, Tamara Broderick

We develop a fast, general methodology for exponential families that augments MFVB to deliver accurate uncertainty estimates for model variables -- both for individual variables and coherently across variables.

Variational Bayes for Merging Noisy Databases

no code implementations 17 Oct 2014 Tamara Broderick, Rebecca C. Steorts

Bayesian entity resolution merges together multiple, noisy databases and returns the minimal collection of unique individuals represented, together with their true, latent record values.

Bayesian Inference Entity Resolution

Optimistic Concurrency Control for Distributed Unsupervised Learning

no code implementations NeurIPS 2013 Xinghao Pan, Joseph E. Gonzalez, Stefanie Jegelka, Tamara Broderick, Michael I. Jordan

Research on distributed machine learning algorithms has focused primarily on one of two extremes - algorithms that obey strict concurrency constraints or algorithms that obey few or no such constraints.

BIG-bench Machine Learning Clustering

Streaming Variational Bayes

2 code implementations NeurIPS 2013 Tamara Broderick, Nicholas Boyd, Andre Wibisono, Ashia C. Wilson, Michael I. Jordan

We present SDA-Bayes, a framework for (S)treaming, (D)istributed, (A)synchronous computation of a Bayesian posterior.

Variational Inference
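
The streaming recursion is easiest to see in a conjugate model, where it is exact; here is a minimal Beta-Bernoulli sketch (SDA-Bayes applies the same posterior-becomes-prior idea with approximate posteriors, asynchronously and distributed):

```python
import numpy as np

rng = np.random.default_rng(0)

# Beta-Bernoulli streaming update: after each minibatch, the posterior
# becomes the prior for the next minibatch.
data = rng.binomial(1, 0.7, size=1000)

a, b = 1.0, 1.0                      # Beta(1, 1) prior
for batch in np.array_split(data, 10):
    a += batch.sum()                 # conjugate update per minibatch
    b += len(batch) - batch.sum()

# In the conjugate case the streaming result matches the one-shot
# batch posterior exactly.
print(f"posterior mean {a / (a + b):.3f} from Beta({a:.0f}, {b:.0f})")
```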

Combinatorial clustering and the beta negative binomial process

no code implementations 8 Nov 2011 Tamara Broderick, Lester Mackey, John Paisley, Michael I. Jordan

We show that the NBP is conjugate to the beta process, and we characterize the posterior distribution under the beta-negative binomial process (BNBP) and hierarchical models based on the BNBP (the HBNBP).

Clustering Image Segmentation +2

Faster solutions of the inverse pairwise Ising problem

1 code implementation 14 Dec 2007 Tamara Broderick, Miroslav Dudik, Gasper Tkacik, Robert E. Schapire, William Bialek

Recent work has shown that probabilistic models based on pairwise interactions (in the simplest case, the Ising model) provide surprisingly accurate descriptions of experiments on real biological networks ranging from neurons to genes.
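
For readers unfamiliar with the model class, here is a brute-force sketch of a tiny pairwise Ising model with hypothetical parameters (the paper tackles the much harder inverse problem: learning the fields h and couplings J from observed statistics):

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Pairwise Ising model: P(s) ∝ exp(sum_i h_i s_i + sum_{i<j} J_ij s_i s_j)
# over spin vectors s in {-1, +1}^n, evaluated exactly for n = 5 by
# enumerating all 2^n states.
n = 5
h = rng.normal(scale=0.3, size=n)
J = np.triu(rng.normal(scale=0.3, size=(n, n)), 1)  # keep i < j couplings

states = np.array(list(itertools.product([-1, 1], repeat=n)))
energies = states @ h + np.einsum("ki,ij,kj->k", states, J, states)
probs = np.exp(energies)
probs /= probs.sum()                 # partition function by enumeration

print(f"{len(states)} states; most likely state has probability {probs.max():.3f}")
```

Exact enumeration scales as 2^n, which is why learning and inference for realistic network sizes require the approximations the paper develops.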
