no code implementations • ICML 2020 • Xinjie Fan, Yuguang Yue, Purnamrita Sarkar, Y. X. Rachel Wang
Tuning hyperparameters for unsupervised learning problems is difficult in general due to the lack of ground truth for validation.
no code implementations • 16 May 2022 • Nhat Ho, Tongzheng Ren, Sujay Sanghavi, Purnamrita Sarkar, Rachel Ward
Therefore, the total computational complexity of the EGD algorithm is optimal and exponentially cheaper than that of the GD for solving parameter estimation in non-regular statistical models while being comparable to that of the GD in regular statistical settings.
no code implementations • NeurIPS 2021 • Robert Lunde, Purnamrita Sarkar, Rachel Ward
We consider the problem of quantifying uncertainty for the estimation error of the leading eigenvector from Oja's algorithm for streaming principal component analysis, where the data are generated IID from some unknown distribution.
no code implementations • 14 Sep 2020 • Qiaohui Lin, Robert Lunde, Purnamrita Sarkar
We propose a new class of multiplier bootstraps for count functionals, ranging from a fast, approximate linear bootstrap tailored to sparse, massive graphs to a quadratic bootstrap procedure that offers refined accuracy for smaller, denser graphs.
no code implementations • ICML 2020 • Qiaohui Lin, Robert Lunde, Purnamrita Sarkar
We study the properties of a leave-node-out jackknife procedure for network data.
no code implementations • 16 Dec 2019 • Prateek R. Srivastava, Purnamrita Sarkar, Grani A. Hanasusanto
Traditional clustering algorithms such as k-means and spectral clustering are known to perform poorly for datasets contaminated with even a small number of outliers.
no code implementations • 17 Oct 2019 • Xinjie Fan, Yuguang Yue, Purnamrita Sarkar, Y. X. Rachel Wang
In this paper, we provide a framework with provable guarantees for selecting hyperparameters in a number of distinct models.
no code implementations • NeurIPS 2018 • Soumendu Sundar Mukherjee, Purnamrita Sarkar, Y. X. Rachel Wang, Bowei Yan
Variational approximation has recently been widely used in large-scale Bayesian inference; its simplest form imposes a mean field assumption to approximate complicated latent structures.
no code implementations • 2 Oct 2018 • Tianxi Li, Lihua Lei, Sharmodeep Bhattacharyya, Koen Van den Berge, Purnamrita Sarkar, Peter J. Bickel, Elizaveta Levina
This can be done with a simple top-down recursive partitioning algorithm, starting with a single community and separating the nodes into two communities by spectral clustering repeatedly, until a stopping rule suggests there are no further communities.
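The recursive procedure described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `recursive_bipartition`, the split by the sign of the second adjacency eigenvector, and the eigenvalue-ratio stopping rule (`ratio`, `min_size`) are all assumptions chosen for brevity, standing in for whatever spectral clustering step and stopping rule the paper actually uses.

```python
import numpy as np


def recursive_bipartition(adj, nodes=None, min_size=4, ratio=0.5):
    """Top-down recursive bi-partitioning of an undirected graph.

    adj      : symmetric adjacency matrix (n x n).
    nodes    : indices of the current candidate community (all nodes at the start).
    min_size : hypothetical parameter; smaller node sets are never split.
    ratio    : hypothetical stopping rule; split only if the second-largest
               eigenvalue exceeds ratio * largest eigenvalue.
    Returns a list of communities, each a list of node indices.
    """
    adj = np.asarray(adj, dtype=float)
    if nodes is None:
        nodes = np.arange(adj.shape[0])
    nodes = np.asarray(nodes)

    # Too small to split further.
    if len(nodes) < max(min_size, 2):
        return [nodes.tolist()]

    # Eigen-decomposition of the induced subgraph (eigenvalues ascending).
    sub = adj[np.ix_(nodes, nodes)]
    eigvals, eigvecs = np.linalg.eigh(sub)
    lam1, lam2 = eigvals[-1], eigvals[-2]

    # Assumed stopping rule: no strong second eigenvalue, so treat as one community.
    if lam1 <= 0 or lam2 <= ratio * lam1:
        return [nodes.tolist()]

    # Split by the sign of the second leading eigenvector.
    v2 = eigvecs[:, -2]
    left, right = nodes[v2 >= 0], nodes[v2 < 0]
    if len(left) == 0 or len(right) == 0:
        return [nodes.tolist()]

    # Recurse on both halves and concatenate the resulting communities.
    return (recursive_bipartition(adj, left, min_size, ratio)
            + recursive_bipartition(adj, right, min_size, ratio))


# Toy example: two 5-node cliques joined by a single edge (0, 5).
A = np.zeros((10, 10))
A[:5, :5] = 1.0
A[5:, 5:] = 1.0
np.fill_diagonal(A, 0.0)
A[0, 5] = A[5, 0] = 1.0

communities = recursive_bipartition(A)
```

On this toy graph the first split separates the two cliques, and the stopping rule then declines to split either clique further. A principled stopping rule would replace the fixed `ratio` threshold.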
no code implementations • NeurIPS 2018 • Xueyu Mao, Purnamrita Sarkar, Deepayan Chakrabarti
People belong to multiple communities, words belong to multiple topics, and books cover multiple genres; overlapping clusters are commonplace.
no code implementations • NeurIPS 2017 • Bowei Yan, Mingzhang Yin, Purnamrita Sarkar
In this paper, we study convergence properties of the gradient variant of the Expectation-Maximization algorithm \cite{lange1995gradient} for Gaussian Mixture Models with an arbitrary number of clusters and mixing coefficients.
no code implementations • 1 Sep 2017 • Xueyu Mao, Purnamrita Sarkar, Deepayan Chakrabarti
We consider the problem of estimating community memberships of nodes in a network, where every node is associated with a vector determining its degree of membership in each community.
no code implementations • 18 Aug 2017 • Soumendu Sundar Mukherjee, Purnamrita Sarkar, Peter J. Bickel
In this article, we advance divide-and-conquer strategies for solving the community detection problem in networks.
2 code implementations • 30 Jul 2017 • Y. X. Rachel Wang, Purnamrita Sarkar, Oana Ursu, Anshul Kundaje, Peter J. Bickel
However, one of the drawbacks of community detection is that most methods take exchangeability of the nodes in the network for granted, whereas the nodes in this case, i.e., the positions on the chromosomes, are not exchangeable.
Applications Genomics
no code implementations • 24 May 2017 • Bowei Yan, Purnamrita Sarkar, Xiuyuan Cheng
Community detection is a fundamental unsupervised learning problem for unlabeled networks which has a broad range of applications.
no code implementations • 23 May 2017 • Bowei Yan, Mingzhang Yin, Purnamrita Sarkar
In this paper, we study convergence properties of the gradient Expectation-Maximization algorithm \cite{lange1995gradient} for Gaussian Mixture Models with a general number of clusters and mixing coefficients.
1 code implementation • 10 Jul 2016 • Bowei Yan, Purnamrita Sarkar
In statistics, an emerging body of work has been focused on combining information from both the edges in the network and the node covariates to infer community memberships.
no code implementations • ICML 2017 • Xueyu Mao, Purnamrita Sarkar, Deepayan Chakrabarti
The problem of finding overlapping communities in networks has gained much attention recently.
1 code implementation • NeurIPS 2017 • Soumendu Sundar Mukherjee, Purnamrita Sarkar, Lizhen Lin
Community detection, which focuses on clustering nodes or detecting communities in (mostly) a single network, is a problem of considerable practical interest and has received a great deal of attention in the research community.
no code implementations • NeurIPS 2016 • Bowei Yan, Purnamrita Sarkar
Clustering is one of the most important unsupervised problems in machine learning and statistics.
no code implementations • NeurIPS 2015 • Purnamrita Sarkar, Deepayan Chakrabarti, Peter J. Bickel
Link prediction and clustering are key problems for network-structured data.
no code implementations • 12 Nov 2013 • Peter J. Bickel, Purnamrita Sarkar
Community detection in networks is a key exploratory tool with applications in a diverse set of areas, ranging from finding communities in social and biological networks to identifying link farms in the World Wide Web.
no code implementations • 5 Oct 2013 • Purnamrita Sarkar, Peter J. Bickel
The quality of spectral clustering is closely tied to the convergence properties of these principal eigenvectors.
no code implementations • 17 Sep 2012 • Barzan Mozafari, Purnamrita Sarkar, Michael J. Franklin, Michael I. Jordan, Samuel Madden
Based on this observation, we present two new active learning algorithms that combine humans and algorithms in a crowd-sourced database.
no code implementations • 27 Jun 2012 • Purnamrita Sarkar, Deepayan Chakrabarti, Michael Jordan
We propose a non-parametric link prediction algorithm for a sequence of graph snapshots over time.
no code implementations • 6 Sep 2011 • Purnamrita Sarkar, Deepayan Chakrabarti, Michael Jordan
We propose a nonparametric approach to link prediction in large-scale dynamic networks.