no code implementations • 6 Mar 2024 • Yibo Jiang, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam, Victor Veitch
To that end, we introduce a simple latent variable model to abstract and formalize the concept dynamics of next-token prediction.
no code implementations • 14 Feb 2024 • Goutham Rajendran, Simon Buchholz, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar
In this work, we relate these two approaches and study how to learn human-interpretable concepts from data.
1 code implementation • 9 Feb 2024 • Yuhao Wang, Ming Gao, Wai Ming Tai, Bryon Aragam, Arnab Bhattacharyya
We develop optimal algorithms for learning undirected Gaussian trees and directed Gaussian polytrees from data.
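For context, the classical starting point for learning a Gaussian tree is the Chow-Liu algorithm: a maximum-weight spanning tree over pairwise mutual informations, which for Gaussians reduces to I(Xi; Xj) = -0.5 log(1 - rho^2). Below is a minimal sketch of that baseline (not the paper's sample-optimal algorithm; the function name `chow_liu_gaussian_tree` and the toy chain are illustrative):

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

def chow_liu_gaussian_tree(X):
    """Chow-Liu baseline: max-weight spanning tree on pairwise
    Gaussian mutual information (a sketch, not the paper's method)."""
    corr = np.corrcoef(X, rowvar=False)
    # Gaussian MI: I(Xi; Xj) = -0.5 * log(1 - rho^2)
    mi = -0.5 * np.log1p(-np.clip(corr ** 2, 0.0, 1.0 - 1e-12))
    np.fill_diagonal(mi, 0.0)
    # minimum_spanning_tree minimizes, so negate weights to maximize MI
    mst = minimum_spanning_tree(-mi)
    return [(int(i), int(j)) for i, j in zip(*mst.nonzero())]

rng = np.random.default_rng(0)
x0 = rng.normal(size=1000)            # toy chain X0 -> X1 -> X2
x1 = 0.8 * x0 + rng.normal(size=1000)
x2 = 0.8 * x1 + rng.normal(size=1000)
print(chow_liu_gaussian_tree(np.column_stack([x0, x1, x2])))  # [(0, 1), (1, 2)]
```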
no code implementations • 28 Dec 2023 • Zhao Lyu, Wai Ming Tai, Mladen Kolar, Bryon Aragam
In this paper, we highlight the inherent limitations of cross-validation when employed to discern the structure of a Gaussian graphical model.
1 code implementation • NeurIPS 2023 • Tianyu Chen, Kevin Bello, Bryon Aragam, Pradeep Ravikumar
Structural causal models (SCMs) are widely used in various disciplines to represent causal relationships among variables in complex systems.
no code implementations • NeurIPS 2023 • Simon Buchholz, Goutham Rajendran, Elan Rosenfeld, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar
We study the problem of learning causal representations from unknown, latent interventions in a general setting, where the latent distribution is Gaussian but the mixing function is completely general.
no code implementations • 31 May 2023 • Alex Markham, MingYu Liu, Bryon Aragam, Liam Solus
Factor analysis (FA) is a statistical tool for studying how observed variables with some mutual dependences can be expressed as functions of mutually independent unobserved factors, and it is widely applied throughout the psychological, biological, and physical sciences.
1 code implementation • 26 May 2023 • Chang Deng, Kevin Bello, Bryon Aragam, Pradeep Ravikumar
In this work, we delve into the optimization challenges associated with this class of non-convex programs.
no code implementations • 6 May 2023 • Wai Ming Tai, Bryon Aragam
We study the problem of learning mixtures of Gaussians with censored data.
3 code implementations • 16 Sep 2022 • Kevin Bello, Bryon Aragam, Pradeep Ravikumar
From the optimization side, we drop the typically used augmented Lagrangian scheme and propose DAGMA (DAGs via M-matrices for Acyclicity), a method that resembles the central path for barrier methods.
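For reference, the acyclicity function at the heart of DAGMA is, to my understanding, the log-determinant function below; the full method embeds it in a central-path (barrier-style) solver. A minimal sketch, with `s` chosen so that sI - W∘W is an M-matrix:

```python
import numpy as np

def h_dagma(W, s=1.0):
    """Log-det acyclicity: h(W) = -log det(sI - W*W) + d*log(s).
    Zero iff W is the adjacency matrix of a DAG, provided sI - W*W
    is an M-matrix (s exceeds the spectral radius of W*W)."""
    d = W.shape[0]
    M = s * np.eye(d) - W * W          # W*W is the Hadamard square
    sign, logdet = np.linalg.slogdet(M)
    if sign <= 0:
        raise ValueError("sI - W*W is not an M-matrix; increase s")
    return -logdet + d * np.log(s)

W_dag = np.triu(np.full((4, 4), 0.4), k=1)  # strictly upper triangular -> DAG
print(h_dagma(W_dag))                        # 0.0 (up to rounding)
W_cyc = W_dag.copy()
W_cyc[3, 0] = 0.4                            # closes a directed cycle
print(h_dagma(W_cyc))                        # > 0
```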
no code implementations • 20 Jun 2022 • Bohdan Kivva, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam
We prove identifiability of a broad class of deep latent variable models that (a) have universal approximation capabilities and (b) are the decoders of variational autoencoders that are commonly used in practice.
no code implementations • 12 Jun 2022 • Arash A. Amini, Bryon Aragam, Qing Zhou
We introduce and study the neighbourhood lattice decomposition of a distribution, which is a compact, non-graphical representation of conditional independence that is valid in the absence of a faithful graphical representation.
no code implementations • 28 Mar 2022 • Bryon Aragam, Wai Ming Tai
Combining these bounds, we conclude that the optimal sample complexity of this problem lies strictly between polynomial and exponential, which is uncommon in learning theory.
1 code implementation • 25 Jan 2022 • Ming Gao, Wai Ming Tai, Bryon Aragam
In other words, at least for Gaussian models with equal error variances, learning a directed graphical model is statistically no more difficult than learning an undirected graphical model.
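To illustrate why equal error variances help: under that assumption, a topological ordering can be recovered by repeatedly selecting the variable with the smallest conditional variance given the variables already ordered. A schematic sketch of this standard device (not the paper's estimator; `equal_variance_ordering` is an illustrative name):

```python
import numpy as np

def equal_variance_ordering(X):
    """Greedy ordering recovery under equal error variances:
    the next variable in a topological order is the one with
    the smallest conditional variance given those already chosen."""
    Sigma = np.cov(X, rowvar=False)
    d = Sigma.shape[0]
    order, rest = [], list(range(d))
    while rest:
        best, best_var = None, np.inf
        for j in rest:
            if order:
                S_oo = Sigma[np.ix_(order, order)]
                S_oj = Sigma[np.ix_(order, [j])]
                cond_var = Sigma[j, j] - (S_oj.T @ np.linalg.solve(S_oo, S_oj)).item()
            else:
                cond_var = Sigma[j, j]  # sources have the smallest marginal variance
            if cond_var < best_var:
                best, best_var = j, cond_var
        order.append(best)
        rest.remove(best)
    return order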
no code implementations • 5 Nov 2021 • Haohan Wang, Bryon Aragam, Eric Xing
Motivated by empirical arguments that are well-known from the genome-wide association studies (GWAS) literature, we study the statistical properties of linear mixed models (LMMs) applied to GWAS.
1 code implementation • 1 Nov 2021 • Ben Lengerich, Caleb Ellington, Bryon Aragam, Eric P. Xing, Manolis Kellis
We encode the acyclicity constraint as a smooth regularization loss that is back-propagated to the mixing function; in this way, NOTMAD shares information between context-specific acyclic graphs, enabling the estimation of Bayesian network structures and parameters even at single-sample resolution.
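A schematic reading of this objective (my interpretation of the abstract, not the authors' code): a context network produces mixing weights over K archetype adjacency matrices, and a smooth acyclicity penalty on the mixed per-sample graph is backpropagated through those weights. The names `context_net` and `archetypes` are assumptions:

```python
import torch

def notmad_style_loss(context_net, archetypes, X, C, lam=1.0):
    # Mixing weights from the context network: (n, K)
    weights = torch.softmax(context_net(C), dim=-1)
    # Per-sample adjacency as a convex mix of archetypes: (n, d, d)
    W = torch.einsum('nk,kij->nij', weights, archetypes)
    d = W.shape[-1]
    # NOTEARS-style smooth acyclicity: h(W) = tr(exp(W * W)) - d
    h = torch.matrix_exp(W * W).diagonal(dim1=-2, dim2=-1).sum(-1) - d
    # Linear-SEM reconstruction: x_i ~ sum_j W[j, i] * x_j per sample
    pred = torch.einsum('nji,nj->ni', W, X)
    recon = ((X - pred) ** 2).sum(-1)
    return recon.mean() + lam * h.mean()
```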
1 code implementation • NeurIPS 2021 • Ming Gao, Bryon Aragam
Perhaps surprisingly, we show that for certain graph ensembles, a simple forward greedy search algorithm (i.e., without a backward pruning phase) suffices to learn the Markov boundary of each node.
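A schematic sketch of such a forward-only greedy search, using reduction in residual variance as a stand-in score (the paper's actual score and stopping rule may differ; `tol` is an illustrative threshold):

```python
import numpy as np

def forward_markov_boundary(X, target, tol=1e-3):
    """Forward greedy search with no backward pruning: keep adding
    the predictor that most reduces the residual variance of the
    target; stop when the best gain falls below tol."""
    n, d = X.shape
    y = X[:, target]
    selected = []
    rest = [j for j in range(d) if j != target]
    res_var = y.var()
    while rest:
        gains = []
        for j in rest:
            A = np.column_stack([np.ones(n), X[:, selected + [j]]])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            gains.append((res_var - (y - A @ beta).var(), j))
        gain, j = max(gains)
        if gain < tol:
            break
        selected.append(j)
        rest.remove(j)
        res_var -= gain
    return selected
```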
no code implementations • NeurIPS 2021 • Goutham Rajendran, Bohdan Kivva, Ming Gao, Bryon Aragam
Greedy algorithms have long been a workhorse for learning graphical models, and more broadly for learning statistical models with sparse structure.
no code implementations • 31 Aug 2021 • Bryon Aragam, Ruiyi Yang
We study uniform consistency in nonparametric mixture models as well as closely related mixture of regression (also known as mixed regression) models, where the regression functions are allowed to be nonparametric and the error distributions are assumed to be convolutions of a Gaussian density.
1 code implementation • NeurIPS 2021 • Bohdan Kivva, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam
We study the problem of reconstructing a causal graphical model from data in the presence of latent variables.
no code implementations • NeurIPS 2023 • Han Zhao, Chen Dan, Bryon Aragam, Tommi S. Jaakkola, Geoffrey J. Gordon, Pradeep Ravikumar
A wide range of machine learning applications, including privacy-preserving learning, algorithmic fairness, and domain adaptation/generalization, involve learning invariant representations of the data that aim to achieve two competing goals: (a) maximize information or accuracy with respect to a target response, and (b) maximize invariance or independence with respect to a set of protected features (e.g., for fairness or privacy).
1 code implementation • NeurIPS 2020 • Ming Gao, Yi Ding, Bryon Aragam
We establish finite-sample guarantees for a polynomial-time algorithm for learning a nonlinear, nonparametric directed acyclic graphical (DAG) model from data.
4 code implementations • 2 Feb 2020 • Roxana Pamfil, Nisara Sriwattanaworachai, Shaan Desai, Philip Pilgerstorfer, Paul Beaumont, Konstantinos Georgatzis, Bryon Aragam
Compared to state-of-the-art methods for learning dynamic Bayesian networks, our method is both scalable and accurate on real data.
2 code implementations • 2 Dec 2019 • David I. Inouye, Liu Leqi, Joon Sik Kim, Bryon Aragam, Pradeep Ravikumar
To address these drawbacks, we formalize a method for automating the selection of interesting partial dependence plots (PDPs) and extend PDPs beyond showing single features to showing the model response along arbitrary directions, for example in raw feature space or in a latent space arising from a generative model.
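A minimal sketch of a directional PDP as described: average the fitted model's response over the data as every point is shifted along a unit direction v (assuming a generic scikit-learn-style `model.predict`):

```python
import numpy as np

def directional_pdp(model, X, v, grid):
    """Partial dependence along an arbitrary direction v: for each
    offset t in grid, shift every data point by t * v and average
    the model's predictions."""
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)
    return np.array([model.predict(X + t * v).mean() for t in grid])
```

With v set to a coordinate basis vector, this recovers an ordinary single-feature PDP; with v a latent-space direction pushed through a decoder, it gives the latent-space variant the sentence alludes to.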
no code implementations • NeurIPS 2019 • Bryon Aragam, Arash Amini, Qing Zhou
We prove that $\Omega(s\log p)$ samples suffice to learn a sparse Gaussian directed acyclic graph (DAG) from data, where $s$ is the maximum Markov blanket size.
1 code implementation • NeurIPS 2019 • Benjamin Lengerich, Bryon Aragam, Eric P. Xing
Modern applications of machine learning (ML) deal with increasingly heterogeneous datasets composed of data collected from overlapping latent subpopulations.
2 code implementations • 29 Sep 2019 • Xun Zheng, Chen Dan, Bryon Aragam, Pradeep Ravikumar, Eric P. Xing
We develop a framework for learning sparse nonparametric directed acyclic graphs (DAGs) from data.
no code implementations • 3 Sep 2019 • Arash A. Amini, Bryon Aragam, Qing Zhou
Knowing when a graphical model is perfect with respect to a distribution is essential for relating separation in the graph to conditional independence in the distribution, and this is particularly important when performing inference from data.
no code implementations • NeurIPS 2018 • Chen Dan, Liu Leqi, Bryon Aragam, Pradeep K. Ravikumar, Eric P. Xing
We study the sample complexity of semi-supervised learning (SSL) and introduce new assumptions based on the mismatch between a mixture model learned from unlabeled data and the true mixture model induced by the (unknown) class conditional distributions.
no code implementations • 17 Oct 2018 • Aurick Qiao, Bryon Aragam, Bingjing Zhang, Eric P. Xing
In this paper, we develop a general framework to quantify the effects of calculation errors on iterative-convergent algorithms and use this framework to design new strategies for checkpoint-based fault tolerance.
4 code implementations • NeurIPS 2018 • Xun Zheng, Bryon Aragam, Pradeep Ravikumar, Eric P. Xing
This is achieved by a novel characterization of acyclicity that is not only smooth but also exact.
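The smooth, exact characterization referred to here is, to my understanding, the trace-of-matrix-exponential function h(W) = tr(e^{W∘W}) - d, which vanishes exactly on weighted adjacency matrices of DAGs. A minimal sketch:

```python
import numpy as np
from scipy.linalg import expm

def h_notears(W):
    """Smooth, exact acyclicity measure: h(W) = tr(exp(W * W)) - d,
    zero iff W has no directed cycles; its gradient exp(W*W).T * 2W
    plugs into any smooth optimizer."""
    return np.trace(expm(W * W)) - W.shape[0]

W = np.array([[0.0, 1.0], [0.0, 0.0]])  # edge 0 -> 1: a DAG
print(h_notears(W))                      # 0.0
W[1, 0] = 1.0                            # add 1 -> 0, creating a cycle
print(h_notears(W))                      # > 0
```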
no code implementations • 12 Feb 2018 • Bryon Aragam, Chen Dan, Eric P. Xing, Pradeep Ravikumar
Motivated by problems in data clustering, we establish general conditions under which families of nonparametric mixture models are identifiable, by introducing a novel framework involving clustering overfitted parametric (i.e., misspecified) mixture models.
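A toy sketch of the overfit-then-merge idea (one concrete reading of the abstract; the paper's framework is more general than Gaussian mixtures): deliberately fit more Gaussian components than there are clusters, then merge the components back down to the target number of groups:

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.cluster import AgglomerativeClustering

def overfitted_mixture_clustering(X, k_true, k_over):
    """Fit an overfitted (misspecified) parametric mixture with
    k_over > k_true components, then merge components by clustering
    their means into k_true groups."""
    gmm = GaussianMixture(n_components=k_over, random_state=0).fit(X)
    merge = AgglomerativeClustering(n_clusters=k_true).fit_predict(gmm.means_)
    return merge[gmm.predict(X)]   # map each point's component to its merged group
```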
1 code implementation • 3 Nov 2017 • Arash A. Amini, Bryon Aragam, Qing Zhou
We study the computational complexity of computing these structures and show that under a sparsity assumption, they can be computed in polynomial time, even in the absence of the assumption of perfectness to a graph.
2 code implementations • 11 Mar 2017 • Bryon Aragam, Jiaying Gu, Qing Zhou
To meet this challenge, we have developed a new R package called sparsebn for learning the structure of large, sparse graphical models with a focus on Bayesian networks.
1 code implementation • 29 Nov 2015 • Bryon Aragam, Arash A. Amini, Qing Zhou
We study a family of regularized score-based estimators for learning the structure of a directed acyclic graph (DAG) for a multivariate normal distribution from high-dimensional data with $p\gg n$.
no code implementations • 4 Jan 2014 • Bryon Aragam, Qing Zhou
We develop a penalized likelihood estimation framework to estimate the structure of Gaussian Bayesian networks from observational data.