Search Results for author: Genevera I. Allen

Found 27 papers, 5 papers with code

Fair MP-BOOST: Fair and Interpretable Minipatch Boosting

no code implementations • 1 Apr 2024 • Camille Olivia Little, Genevera I. Allen

Ensemble methods, particularly boosting, have established themselves as highly effective and widely embraced machine learning techniques for tabular data.

Fairness, Feature Importance

Fair Feature Importance Scores for Interpreting Tree-Based Methods and Surrogates

no code implementations • 6 Oct 2023 • Camille Olivia Little, Debolina Halder Lina, Genevera I. Allen

Specifically, we develop a novel fair feature importance score for trees that can be used to interpret how each feature contributes to fairness or bias in trees, tree-based ensembles, or tree-based surrogates of any complex ML system.

Fairness, Feature Importance (+1 more)

Data Augmentation via Subgroup Mixup for Improving Fairness

no code implementations • 13 Sep 2023 • Madeline Navarro, Camille Little, Genevera I. Allen, Santiago Segarra

Furthermore, our method allows us to use the generalization ability of mixup to improve both fairness and accuracy.

Data Augmentation, Fairness

Interpretable Machine Learning for Discovery: Statistical Challenges & Opportunities

no code implementations • 2 Aug 2023 • Genevera I. Allen, Luqin Gan, Lili Zheng

In this paper, we discuss and review the field of interpretable machine learning, focusing especially on the techniques as they are often employed to generate new knowledge or make discoveries from large data sets.

Interpretable Machine Learning, Model Selection (+1 more)

Nonparanormal Graph Quilting with Applications to Calcium Imaging

no code implementations • 22 May 2023 • Andersen Chang, Lili Zheng, Gautam Dasarthy, Genevera I. Allen

Probabilistic graphical models have become an important unsupervised learning tool for detecting network structures for a variety of problems, including the estimation of functional neuronal connectivity from two-photon calcium imaging data.

Model-Agnostic Confidence Intervals for Feature Importance: A Fast and Powerful Approach Using Minipatch Ensembles

no code implementations • 5 Jun 2022 • Luqin Gan, Lili Zheng, Genevera I. Allen

Our approach is fast as we avoid model refitting by leveraging a form of random observation and feature subsampling called minipatch ensembles; this approach also improves statistical power by avoiding data splitting.

BIG-bench Machine Learning, Ensemble Learning (+2 more)

Fast and Accurate Graph Learning for Huge Data via Minipatch Ensembles

no code implementations • 22 Oct 2021 • Tianyi Yao, Minjie Wang, Genevera I. Allen

Gaussian graphical models provide a powerful framework for uncovering conditional dependence relationships between sets of nodes; they have found applications in a wide variety of fields including sensor and communication networks, physics, finance, and computational biology.

Graph Learning, Model Selection

Fast and Interpretable Consensus Clustering via Minipatch Learning

no code implementations • 5 Oct 2021 • Luqin Gan, Genevera I. Allen

Additionally, we develop adaptive sampling schemes for observations, which result in both improved reliability and computational savings, as well as adaptive sampling schemes for features, which lead to interpretable solutions by quickly learning the most relevant features that differentiate clusters.

Clustering, Computational Efficiency

Thresholded Graphical Lasso Adjusts for Latent Variables: Application to Functional Neural Connectivity

no code implementations • 13 Apr 2021 • Minjie Wang, Genevera I. Allen

In neuroscience, researchers seek to uncover the connectivity of neurons from large-scale neural recordings or imaging; often people employ graphical model selection and estimation techniques for this purpose.

Model Selection

Simultaneous Grouping and Denoising via Sparse Convex Wavelet Clustering

no code implementations • 8 Dec 2020 • Michael Weylandt, T. Mitchell Roddenberry, Genevera I. Allen

In contrast to common practice which denoises then clusters, our method is a unified, convex approach that performs both simultaneously.

Clustering, Data Compression (+1 more)

Interpretable Visualization and Higher-Order Dimension Reduction for ECoG Data

1 code implementation • 15 Nov 2020 • Kelly Geyer, Frederick Campbell, Andersen Chang, John Magnotti, Michael Beauchamp, Genevera I. Allen

After signal processing, this type of data may be organized as a 4-way tensor with dimensions representing trials, electrodes, frequency, and time.

Dimensionality Reduction

MP-Boost: Minipatch Boosting via Adaptive Feature and Observation Sampling

no code implementations • 14 Nov 2020 • Mohammad Taha Toghani, Genevera I. Allen

We achieve this by developing MP-Boost, an algorithm loosely based on AdaBoost that learns by adaptively selecting small subsets of instances and features, or what we term minipatches (MP), at each iteration.

Binary Classification

Feature Selection for Huge Data via Minipatch Learning

no code implementations • 16 Oct 2020 • Tianyi Yao, Genevera I. Allen

While feature selection is a well-studied problem with many widely-used techniques, there are typically two key challenges: i) many existing approaches become computationally intractable in huge-data settings with millions of observations and features; and ii) the statistical accuracy of selected features degrades in high-noise, high-correlation settings, thus hindering reliable model interpretation.

Feature Selection

Supervised Convex Clustering

1 code implementation • 25 May 2020 • Minjie Wang, Tianyi Yao, Genevera I. Allen

Clustering has long been a popular unsupervised learning approach to identify groups of similar objects and discover patterns from unlabeled data in many applications.

Clustering

Integrative Generalized Convex Clustering Optimization and Feature Selection for Mixed Multi-View Data

no code implementations • 11 Dec 2019 • Minjie Wang, Genevera I. Allen

While several techniques for such integrative clustering have been explored, we propose and develop a convex formalization that will inherit the strong statistical, mathematical and empirical properties of increasingly popular convex clustering methods.

Clustering, Feature Selection

Clustered Gaussian Graphical Model via Symmetric Convex Clustering

no code implementations • 30 May 2019 • Tianyi Yao, Genevera I. Allen

Knowledge of functional groupings of neurons can shed light on structures of neural circuits and is valuable in many types of neuroimaging studies.

Clustering

Feature Selection for Data Integration with Mixed Multi-view Data

no code implementations • 27 Mar 2019 • Yulia Baker, Tiffany M. Tang, Genevera I. Allen

B-RAIL serves as a versatile data integration method for sparse regression and graph selection, and we demonstrate the effectiveness of B-RAIL through extensive simulations and a case study to infer the ovarian cancer gene regulatory network.

Data Integration, Feature Selection

Dynamic Visualization and Fast Computation for Convex Clustering via Algorithmic Regularization

1 code implementation • 6 Jan 2019 • Michael Weylandt, John Nagorski, Genevera I. Allen

Convex clustering is a promising new approach to the classical problem of clustering, combining strong performance in empirical studies with rigorous theoretical foundations.

Clustering

A Review of Multivariate Distributions for Count Data Derived from the Poisson Distribution

1 code implementation • 31 Aug 2016 • David I. Inouye, Eunho Yang, Genevera I. Allen, Pradeep Ravikumar

The Poisson distribution has been widely studied and used for modeling univariate count-valued data.

A General Framework for Mixed Graphical Models

no code implementations • 2 Nov 2014 • Eunho Yang, Pradeep Ravikumar, Genevera I. Allen, Yulia Baker, Ying-Wooi Wan, Zhandong Liu

"Mixed Data" comprising a large number of heterogeneous variables (e.g., count, binary, continuous, skewed continuous, among other data types) are prevalent in varied areas such as genomics and proteomics, imaging genetics, national security, social networking, and Internet advertising.

Convex Biclustering

no code implementations • 5 Aug 2014 • Eric C. Chi, Genevera I. Allen, Richard G. Baraniuk

In the biclustering problem, we seek to simultaneously group observations and features.

Collaborative Filtering

On Poisson Graphical Models

no code implementations • NeurIPS 2013 • Eunho Yang, Pradeep K. Ravikumar, Genevera I. Allen, Zhandong Liu

Undirected graphical models, such as Gaussian graphical models, Ising, and multinomial/categorical graphical models, are widely used in a variety of applications for modeling distributions over a large number of variables.

valid

Conditional Random Fields via Univariate Exponential Families

no code implementations • NeurIPS 2013 • Eunho Yang, Pradeep K. Ravikumar, Genevera I. Allen, Zhandong Liu

We thus introduce a “novel subclass of CRFs”, derived by imposing node-wise conditional distributions of response variables conditioned on the rest of the responses and the covariates as arising from univariate exponential families.

Sparse and Functional Principal Components Analysis

1 code implementation • 11 Sep 2013 • Genevera I. Allen, Michael Weylandt

We propose a unified approach to regularized PCA which can induce both sparsity and smoothness in both the row and column principal components.

Dimensionality Reduction, EEG (+1 more)

On Graphical Models via Univariate Exponential Family Distributions

no code implementations • 17 Jan 2013 • Eunho Yang, Pradeep Ravikumar, Genevera I. Allen, Zhandong Liu

Undirected graphical models, or Markov networks, are a popular class of statistical models, used in a wide variety of applications.
