Search Results for author: Daniela Witten

Found 22 papers, 9 papers with code

Revisiting inference after prediction

1 code implementation • 23 Jun 2023 • Keshav Motwani, Daniela Witten

Recent work has focused on the very common practice of prediction-based inference: that is, (i) using a pre-trained machine learning model to predict an unobserved response variable, and then (ii) conducting inference on the association between that predicted response and some covariates.

valid
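The two-step practice described above can be sketched in a few lines. Everything below (the random forest, the synthetic data, the variable names) is a hypothetical illustration, not the paper's implementation:

```python
import numpy as np
from scipy.stats import linregress
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500
X_train = rng.normal(size=(n, 3))
y_train = X_train @ np.array([1.0, 0.5, 0.0]) + rng.normal(size=n)

# (i) a pre-trained machine learning model predicts the unobserved response
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)
X_new = rng.normal(size=(n, 3))
y_hat = model.predict(X_new)

# (ii) naive inference: regress the *predicted* response on a covariate;
# treating y_hat as if it were the true response is the step whose
# validity the paper revisits
res = linregress(X_new[:, 0], y_hat)
print(res.slope, res.pvalue)
```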

Generalized Data Thinning Using Sufficient Statistics

no code implementations • 22 Mar 2023 • Ameer Dharamshi, Anna Neufeld, Keshav Motwani, Lucy L. Gao, Daniela Witten, Jacob Bien

A recent paper showed that for some well-known natural exponential families, $X$ can be "thinned" into independent random variables $X^{(1)}, \ldots, X^{(K)}$, such that $X = \sum_{k=1}^K X^{(k)}$.
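For the Poisson family this thinning is classical and easy to check numerically: conditional on $X$, a multinomial split with equal probabilities yields parts that each behave like Poisson($\lambda/K$) and sum back to $X$. A minimal sketch with synthetic data (not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(1)
lam, K, n = 10.0, 3, 20_000

# X ~ Poisson(lam); split each draw into K parts via a multinomial draw
X = rng.poisson(lam, size=n)
parts = np.array([rng.multinomial(x, [1.0 / K] * K) for x in X])

# The parts sum back to X, and each part behaves like Poisson(lam / K)
print(parts.mean(axis=0))  # each column's mean is close to lam / K
```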

Data thinning for convolution-closed distributions

1 code implementation • 18 Jan 2023 • Anna Neufeld, Ameer Dharamshi, Lucy L. Gao, Daniela Witten

We propose data thinning, an approach for splitting an observation into two or more independent parts that sum to the original observation, and that follow the same distribution as the original observation, up to a (known) scaling of a parameter.

Model Selection

Inferring independent sets of Gaussian variables after thresholding correlations

no code implementations • 2 Nov 2022 • Arkajyoti Saha, Daniela Witten, Jacob Bien

Our proposed test properly accounts for the fact that the set of variables is selected from the data, and thus is not overly conservative.

Selective Inference for Hierarchical Clustering

2 code implementations • 5 Dec 2020 • Lucy L. Gao, Jacob Bien, Daniela Witten

Classical tests for a difference in means control the type I error rate when the groups are defined a priori.

Clustering

Testing for Association in Multi-View Network Data

1 code implementation • 25 Sep 2019 • Lucy L. Gao, Daniela Witten, Jacob Bien

To answer this question, we extend the stochastic block model for a single network view to the two-view setting, and develop a new hypothesis test for the null hypothesis that the latent community memberships in the two data views are independent.

Stochastic Block Model
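The null hypothesis here says the two views' community memberships are drawn independently. A toy simulation of that null setting (all names and parameter values are illustrative, not the paper's test):

```python
import numpy as np

rng = np.random.default_rng(7)
n, K = 60, 2

def sbm(z, p_in=0.5, p_out=0.1):
    """Sample a symmetric adjacency matrix from a stochastic block model."""
    P = np.where(z[:, None] == z[None, :], p_in, p_out)
    upper = np.triu((rng.random((n, n)) < P).astype(int), 1)
    return upper + upper.T

# Null: memberships in the two network views are independent draws
z1 = rng.integers(K, size=n)
z2 = rng.integers(K, size=n)
A1, A2 = sbm(z1), sbm(z2)
```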

Data-Driven Discovery of Functional Cell Types that Improve Models of Neural Activity

no code implementations • NeurIPS Workshop Neuro_AI 2019 • Daniel Zdeblick, Eric Shea-Brown, Daniela Witten, Michael Buice

Computational neuroscience aims to fit reliable models of in vivo neural activity and interpret them as abstract computations.

Modeling microbial abundances and dysbiosis with beta-binomial regression

2 code implementations • 7 Feb 2019 • Bryan D. Martin, Daniela Witten, Amy D. Willis

Using a sample from a population to estimate the proportion of the population with a certain category label is a broadly important problem.

Methodology
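The motivation for beta-binomial modeling is overdispersion: when the success probability itself varies across samples, counts are more variable than a plain binomial allows. A quick simulation showing this (parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n_samples, trials = 50_000, 20
a, b = 2.0, 5.0

# Beta-binomial: draw a success probability per sample, then draw counts;
# the resulting counts are overdispersed relative to a binomial with the
# same mean
p = rng.beta(a, b, size=n_samples)
y = rng.binomial(trials, p)

binom_var = trials * (a / (a + b)) * (b / (a + b))
print(y.mean(), y.var(), binom_var)  # y.var() clearly exceeds binom_var
```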

Are Clusterings of Multiple Data Views Independent?

2 code implementations • 12 Jan 2019 • Lucy L. Gao, Jacob Bien, Daniela Witten

However, clustering the participants based on multiple data views implicitly assumes that a single underlying clustering of the participants is shared across all data views.

Clustering

Robust Sparse Reduced Rank Regression in High Dimensions

no code implementations • 18 Oct 2018 • Kean Ming Tan, Qiang Sun, Daniela Witten

We propose robust sparse reduced rank regression for analyzing large and complex high-dimensional data with heavy-tailed random noise.

regression

Fast Nonconvex Deconvolution of Calcium Imaging Data

1 code implementation • 21 Feb 2018 • Sean Jewell, Toby Dylan Hocking, Paul Fearnhead, Daniela Witten

Calcium imaging data promises to transform the field of neuroscience by making it possible to record from large populations of neurons simultaneously.

Methodology • Neurons and Cognition • Applications

In Defense of the Indefensible: A Very Naive Approach to High-Dimensional Inference

no code implementations • 16 May 2017 • Sen Zhao, Daniela Witten, Ali Shojaie

In this paper, we consider a simple and very naïve two-step procedure for this task, in which we (i) fit a lasso model in order to obtain a subset of the variables, and (ii) fit a least squares model on the lasso-selected set.

regression
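The two-step procedure (i)–(ii) is straightforward to sketch with scikit-learn; the synthetic data and settings below are illustrative:

```python
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

rng = np.random.default_rng(3)
n, p = 200, 50
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]   # three truly relevant variables
y = X @ beta + rng.normal(size=n)

# (i) fit a lasso (tuning parameter chosen by cross-validation) and keep
# the variables with nonzero coefficients
selected = np.flatnonzero(LassoCV(cv=5).fit(X, y).coef_)

# (ii) refit by ordinary least squares on the lasso-selected set
ols = LinearRegression().fit(X[:, selected], y)
print(selected, ols.coef_)
```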

Exact Spike Train Inference Via $\ell_0$ Optimization

1 code implementation • 25 Mar 2017 • Sean Jewell, Daniela Witten

For each neuron, a fluorescence trace is measured; this can be seen as a first-order approximation of the neuron's activity over time.

Applications • Neurons and Cognition
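The first-order model referenced here is commonly written as an AR(1) process: calcium decays geometrically, jumps when a spike occurs, and fluorescence is a noisy observation of the calcium. A toy simulation (the parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
T, gamma = 1000, 0.95   # time steps; per-step calcium decay

# Sparse spikes; calcium decays geometrically and jumps at each spike
spikes = (rng.random(T) < 0.01).astype(float)
calcium = np.zeros(T)
for t in range(1, T):
    calcium[t] = gamma * calcium[t - 1] + spikes[t]

# Observed fluorescence: calcium plus measurement noise
fluorescence = calcium + 0.1 * rng.normal(size=T)
```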

Convex Modeling of Interactions with Strong Heredity

no code implementations • 13 Oct 2014 • Asad Haris, Daniela Witten, Noah Simon

We consider the task of fitting a regression model involving interactions among a potentially large set of covariates, in which we wish to enforce strong heredity.

Fused Lasso Additive Model

no code implementations • 18 Sep 2014 • Ashley Petersen, Daniela Witten, Noah Simon

We consider the problem of predicting an outcome variable using $p$ covariates that are measured on $n$ independent observations, in the setting in which flexible and interpretable fits are desirable.

Sure Screening for Gaussian Graphical Models

2 code implementations • 29 Jul 2014 • Shikai Luo, Rui Song, Daniela Witten

We propose graphical sure screening, or GRASS, a very simple and computationally efficient screening procedure for recovering the structure of a Gaussian graphical model in the high-dimensional setting.
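Marginal-correlation screening of this kind can be sketched in a few lines; the chain-structured data and the threshold below are illustrative, not the paper's recommended tuning:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 400, 10

# Chain-structured data: each variable depends on its predecessor
X = np.zeros((n, p))
X[:, 0] = rng.normal(size=n)
for j in range(1, p):
    X[:, j] = 0.6 * X[:, j - 1] + rng.normal(size=n)

# Screen: retain edge (j, k) when the marginal correlation is large;
# screening is meant to over-select, so extra edges are acceptable
C = np.corrcoef(X, rowvar=False)
edges = {(j, k) for j in range(p) for k in range(j + 1, p)
         if abs(C[j, k]) > 0.3}
print(sorted(edges))  # includes every chain edge (j, j + 1)
```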

Selection Bias Correction and Effect Size Estimation under Dependence

no code implementations • 16 May 2014 • Kean Ming Tan, Noah Simon, Daniela Witten

Many authors have proposed methods to reduce the effects of selection bias under the assumption that the naive estimates of the effect sizes are independent.

Selection bias

Learning Graphical Models With Hubs

no code implementations • 28 Feb 2014 • Kean Ming Tan, Palma London, Karthik Mohan, Su-In Lee, Maryam Fazel, Daniela Witten

We consider the problem of learning a high-dimensional graphical model in which certain hub nodes are highly-connected to many other nodes.

Inference in High Dimensions with the Penalized Score Test

no code implementations • 12 Jan 2014 • Arend Voorman, Ali Shojaie, Daniela Witten

Further, when an $\ell_2$ penalty is used, the test corresponds precisely to a score test in a mixed effects model, in which the effects of all but one feature are assumed to be random.

regression • Variable Selection

The Cluster Graphical Lasso for improved estimation of Gaussian graphical models

no code implementations • 19 Jul 2013 • Kean Ming Tan, Daniela Witten, Ali Shojaie

We begin by introducing a surprising connection between the graphical lasso and hierarchical clustering: the graphical lasso in effect performs a two-step procedure, in which (1) single linkage hierarchical clustering is performed on the variables in order to identify connected components, and then (2) an l1-penalized log likelihood is maximized on the subset of variables within each connected component.

Clustering • Model Selection
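Step (1) of this decomposition is easy to demonstrate: threshold the absolute sample covariances at the regularization level and take connected components. The two-block synthetic data and the threshold below are illustrative:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

rng = np.random.default_rng(6)
n = 300

# Two independent blocks of four correlated variables each
A = rng.normal(size=(n, 1)) + 0.5 * rng.normal(size=(n, 4))
B = rng.normal(size=(n, 1)) + 0.5 * rng.normal(size=(n, 4))
X = np.hstack([A, B])

# Step (1): threshold |covariance| and find connected components;
# step (2) would run the graphical lasso within each component
S = np.abs(np.cov(X, rowvar=False))
np.fill_diagonal(S, 0.0)
n_comp, labels = connected_components(csr_matrix(S > 0.4), directed=False)
print(n_comp, labels)  # two components: one per block
```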

Node-Based Learning of Multiple Gaussian Graphical Models

no code implementations • 21 Mar 2013 • Karthik Mohan, Palma London, Maryam Fazel, Daniela Witten, Su-In Lee

We consider estimation under two distinct assumptions: (1) differences between the K networks are due to individual nodes that are perturbed across conditions, or (2) similarities among the K networks are due to the presence of common hub nodes that are shared across all K networks.

Structured Learning of Gaussian Graphical Models

no code implementations • NeurIPS 2012 • Karthik Mohan, Mike Chung, Seungyeop Han, Daniela Witten, Su-In Lee, Maryam Fazel

We consider estimation of multiple high-dimensional Gaussian graphical models corresponding to a single set of nodes under several distinct conditions.
