no code implementations • 30 Apr 2024 • Yin-Jen Chen, Minh Tang
We study the matrix-variate regression problem $Y_i = \sum_{k} \beta_{1k} X_i \beta_{2k}^{\top} + E_i$ for $i = 1, 2, \dots, n$ in the high-dimensional regime wherein the responses $Y_i$ are matrices whose dimensions $p_{1}\times p_{2}$ outgrow both the sample size $n$ and the dimensions $q_{1}\times q_{2}$ of the predictor variables $X_i$, i.e., $q_{1}, q_{2} \ll n \ll p_{1}, p_{2}$.
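As a minimal sketch of the model above (dimensions and coefficient shapes here are illustrative, not the paper's), one can simulate draws of $Y_i = \sum_{k} \beta_{1k} X_i \beta_{2k}^{\top} + E_i$ in a few lines:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy draw from the matrix-variate regression model
# Y_i = sum_k beta_1k @ X_i @ beta_2k.T + E_i.
# Dimensions chosen so that q1, q2 < n < p1, p2, mimicking the regime studied.
n, q1, q2, p1, p2, K = 20, 3, 4, 30, 40, 2
beta1 = rng.standard_normal((K, p1, q1))   # beta_1k maps rows of X_i up to p1
beta2 = rng.standard_normal((K, p2, q2))   # beta_2k maps columns of X_i up to p2

Xs, Ys = [], []
for _ in range(n):
    X = rng.standard_normal((q1, q2))      # predictor matrix
    E = rng.standard_normal((p1, p2))      # matrix-valued noise
    Y = sum(beta1[k] @ X @ beta2[k].T for k in range(K)) + E
    Xs.append(X)
    Ys.append(Y)
```

Each summand $\beta_{1k} X_i \beta_{2k}^{\top}$ has shape $p_1 \times p_2$, so the responses live in the larger ambient dimension while the predictors stay small.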
no code implementations • 20 Aug 2022 • Sheyda Peyman, Minh Tang, Vince Lyzinski
Here, a common suite of methods relies on spectral graph embeddings, which have been shown to provide both good algorithmic performance and flexible settings in which regularization techniques can be implemented to help mitigate the effect of an adversary.
no code implementations • 19 Mar 2022 • Yichi Zhang, Minh Tang
We first derive upper bounds for the $\ell_2$ (spectral norm) and $\ell_{2\to\infty}$ (maximum row-wise $\ell_2$ norm) distances between the approximate singular vectors of $\hat{\mathbf{M}}$ and the true singular vectors of the signal matrix $\mathbf{M}$.
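The two distances named above can be illustrated numerically; the setup below (a low-rank signal plus Gaussian noise, with a Procrustes alignment before comparison) is a hypothetical toy, not the paper's construction:

```python
import numpy as np

rng = np.random.default_rng(0)

# Low-rank signal M plus noise gives the observed M_hat (toy setup).
n, m, r = 200, 150, 3
U = np.linalg.qr(rng.standard_normal((n, r)))[0]
V = np.linalg.qr(rng.standard_normal((m, r)))[0]
M = U @ np.diag([100.0, 80.0, 60.0]) @ V.T
M_hat = M + rng.standard_normal((n, m))

# Leading r left singular vectors of the signal and of the observation.
U_true = np.linalg.svd(M, full_matrices=False)[0][:, :r]
U_hat = np.linalg.svd(M_hat, full_matrices=False)[0][:, :r]

# Singular vectors are only identified up to rotation, so align via
# an orthogonal Procrustes transformation before measuring distances.
W_u, _, W_vt = np.linalg.svd(U_hat.T @ U_true)
D = U_hat @ (W_u @ W_vt) - U_true

dist_2 = np.linalg.norm(D, 2)                   # spectral-norm (l2) distance
dist_2toinf = np.linalg.norm(D, axis=1).max()   # max row-wise l2 norm
```

Since each row is a unit-vector projection of the difference matrix, the $\ell_{2\to\infty}$ distance is always bounded by the $\ell_2$ distance, which is why row-wise bounds are the finer statement.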
no code implementations • 5 Oct 2021 • Yin-Jen Chen, Minh Tang
We study the classification problem for high-dimensional data with $n$ observations on $p$ features where the $p \times p$ covariance matrix $\Sigma$ exhibits a spiked eigenvalue structure and the vector $\zeta$, given by the difference between the whitened mean vectors, is sparse with sparsity at most $s$.
1 code implementation • 9 Sep 2021 • John Koo, Minh Tang, Michael W. Trosset
We connect two random graph models, the Popularity Adjusted Block Model (PABM) and the Generalized Random Dot Product Graph (GRDPG), by demonstrating that the PABM is a special case of the GRDPG in which communities correspond to mutually orthogonal subspaces of latent vectors.
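A minimal sketch of the GRDPG side of this correspondence: latent positions in $\mathbb{R}^{p+q}$, edge probabilities given by the indefinite inner product $x_i^{\top} I_{p,q} x_j$, and an adjacency matrix sampled from those probabilities. The dimensions and the clipping step are assumptions of this toy, not part of the papers' construction:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical GRDPG sketch: n latent positions in R^{p+q}, with edge
# probability x_i^T I_{p,q} x_j under the signature matrix I_{p,q}.
n, p, q = 100, 2, 1
X = rng.uniform(0.2, 0.6, size=(n, p + q))
I_pq = np.diag([1.0] * p + [-1.0] * q)

P = X @ I_pq @ X.T
P = np.clip(P, 0.0, 1.0)     # keep valid probabilities in this toy sketch
np.fill_diagonal(P, 0.0)     # no self-loops

A = rng.uniform(size=(n, n)) < P
A = np.triu(A, 1)
A = (A | A.T).astype(int)    # symmetric, hollow adjacency sample
```

In the PABM-as-GRDPG result, communities correspond to mutually orthogonal subspaces of these latent vectors, which is what makes subspace-based clustering of the embedding natural.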
no code implementations • 23 May 2021 • Xinjie Du, Minh Tang
Special cases of this hypothesis test include testing whether two vertices in a stochastic block model or degree-corrected stochastic block model graph have the same block membership vectors, or testing whether two vertices in a popularity adjusted block model have the same community assignment.
no code implementations • 18 Jan 2021 • Yichi Zhang, Minh Tang
Random-walk based network embedding algorithms like DeepWalk and node2vec are widely used to obtain Euclidean representation of the nodes in a network prior to performing downstream inference tasks.
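The walk-generation step these algorithms share can be sketched as follows; this is a plain uniform walk in the DeepWalk style (node2vec additionally biases the transition probabilities), and the function name and toy graph are illustrative:

```python
import random

def random_walks(adj, num_walks=10, walk_length=5, seed=0):
    """Uniform random walks from every node; adj maps node -> neighbor list."""
    rng = random.Random(seed)
    walks = []
    for _ in range(num_walks):
        for start in adj:
            walk = [start]
            while len(walk) < walk_length:
                nbrs = adj[walk[-1]]
                if not nbrs:
                    break                     # dead end: stop this walk early
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks

# Toy 4-cycle graph
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
walks = random_walks(adj)
```

The resulting walk sequences are then fed to a word2vec-style model, treating nodes as words and walks as sentences, to produce the Euclidean node representations.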
no code implementations • 17 Dec 2020 • Joshua Agterberg, Minh Tang, Carey Priebe
We propose a nonparametric two-sample test statistic for low-rank, conditionally independent edge random graphs whose edge probability matrices have negative eigenvalues and arbitrarily close eigenvalues.
no code implementations • 15 Apr 2020 • Michael W. Trosset, Mingyue Gao, Minh Tang, Carey E. Priebe
We submit that techniques for manifold learning can be used to learn the unknown submanifold well enough to realize benefit from restricted inference.
no code implementations • 31 Mar 2020 • Joshua Agterberg, Minh Tang, Carey E. Priebe
Two separate and distinct sources of nonidentifiability arise naturally in the context of latent position random graph models, though neither is unique to this setting.
no code implementations • 29 Sep 2019 • Keith Levin, Fred Roosta, Minh Tang, Michael W. Mahoney, Carey E. Priebe
In both cases, we prove that when the underlying graph is generated according to a latent space model called the random dot product graph, which includes the popular stochastic block model as a special case, an out-of-sample extension based on a least-squares objective obeys a central limit theorem about the true latent position of the out-of-sample vertex.
no code implementations • 23 Aug 2018 • Carey E. Priebe, Youngser Park, Joshua T. Vogelstein, John M. Conroy, Vince Lyzinski, Minh Tang, Avanti Athreya, Joshua Cape, Eric Bridgeford
Clustering is concerned with coherently grouping observations without any explicit concept of true groupings.
no code implementations • 30 Mar 2018 • Minh Tang
We derive the limiting distribution for the largest eigenvalues of the adjacency matrix for a stochastic blockmodel graph when the number of vertices tends to infinity.
no code implementations • 16 Sep 2017 • Patrick Rubin-Delanchy, Joshua Cape, Minh Tang, Carey E. Priebe
Spectral embedding is a procedure which can be used to obtain vector representations of the nodes of a graph.
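One common form of the procedure, adjacency spectral embedding, scales the leading eigenvectors of the adjacency matrix by the square roots of the corresponding eigenvalue magnitudes. The sketch below applies it to a two-block stochastic blockmodel sample; the block probabilities and sizes are illustrative:

```python
import numpy as np

def adjacency_spectral_embedding(A, d):
    """Embed nodes into R^d using the d largest-|eigenvalue| eigenpairs
    of the symmetric adjacency matrix A."""
    vals, vecs = np.linalg.eigh(A)
    idx = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, idx] * np.sqrt(np.abs(vals[idx]))

# Two-block toy graph sampled from a stochastic blockmodel.
rng = np.random.default_rng(2)
z = np.repeat([0, 1], 50)                     # block memberships
B = np.array([[0.5, 0.1],
              [0.1, 0.5]])                    # block probability matrix
P = B[z][:, z]
A = rng.uniform(size=P.shape) < P
A = np.triu(A, 1)
A = (A | A.T).astype(float)

Xhat = adjacency_spectral_embedding(A, 2)     # one 2-d vector per node
```

For graphs like this one, the embedded points concentrate around one location per block, which is what makes downstream clustering of the rows effective.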
no code implementations • 16 Sep 2017 • Avanti Athreya, Donniell E. Fishkind, Keith Levin, Vince Lyzinski, Youngser Park, Yichen Qin, Daniel L. Sussman, Minh Tang, Joshua T. Vogelstein, Carey E. Priebe
In this survey paper, we describe a comprehensive paradigm for statistical inference on random dot product graphs, a paradigm centered on spectral embeddings of adjacency and Laplacian matrices.
1 code implementation • 5 Sep 2017 • Joshua T. Vogelstein, Eric Bridgeford, Minh Tang, Da Zheng, Christopher Douville, Randal Burns, Mauro Maggioni
To solve key biomedical problems, experimentalists now routinely measure millions or billions of features (dimensions) per sample, with the hope that data science techniques will be able to build accurate data-driven inferences.
1 code implementation • 9 May 2017 • Carey E. Priebe, Youngser Park, Minh Tang, Avanti Athreya, Vince Lyzinski, Joshua T. Vogelstein, Yichen Qin, Ben Cocanougher, Katharina Eichler, Marta Zlatic, Albert Cardona
We present semiparametric spectral modeling of the complete larval Drosophila mushroom body connectome.
no code implementations • 28 Jul 2016 • Minh Tang, Carey E. Priebe
As a corollary, we show that for stochastic blockmodel graphs, the rows of the spectral embedding of the normalized Laplacian converge to multivariate normals and furthermore the mean and the covariance matrix of each row are functions of the associated vertex's block membership.
no code implementations • 7 Mar 2015 • Vince Lyzinski, Minh Tang, Avanti Athreya, Youngser Park, Carey E. Priebe
We propose a robust, scalable, integrated methodology for community detection and community comparison in graphs.
no code implementations • 23 May 2014 • Shakira Suwan, Dominic S. Lee, Runze Tang, Daniel L. Sussman, Minh Tang, Carey E. Priebe
Inference for the stochastic blockmodel is currently of burgeoning interest in the statistical community, as well as in application domains as diverse as social networks, citation networks, and brain connectivity networks (connectomics).
no code implementations • 2 Oct 2013 • Vince Lyzinski, Daniel Sussman, Minh Tang, Avanti Athreya, Carey Priebe
Vertex clustering in a stochastic blockmodel graph has wide applicability and has been the subject of extensive research.
no code implementations • 31 May 2013 • Avanti Athreya, Vince Lyzinski, David J. Marchette, Carey E. Priebe, Daniel L. Sussman, Minh Tang
We prove a central limit theorem for the components of the largest eigenvectors of the adjacency matrix of a finite-dimensional random dot product graph whose true latent positions are unknown.
no code implementations • 21 May 2013 • Minh Tang, Youngser Park, Carey E. Priebe
We show that, under the latent position graph model and for sufficiently large $n$, the mapping of the out-of-sample vertices is close to its true latent position.
no code implementations • 30 Apr 2013 • Cencheng Shen, Ming Sun, Minh Tang, Carey E. Priebe
For multiple multivariate data sets, we derive conditions under which Generalized Canonical Correlation Analysis (GCCA) improves classification performance of the projected data sets, compared to standard Canonical Correlation Analysis (CCA) using only two data sets.
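For the two-data-set baseline, the first canonical correlation can be computed via an SVD of the product of orthonormal bases of the two centered data matrices (a standard numerical route; the simulated data with a shared latent signal is an assumption of this sketch):

```python
import numpy as np

def cca_first_correlation(X, Y):
    """First canonical correlation between data sets X and Y: center each,
    take orthonormal bases of their column spaces, and read off the largest
    singular value of the product of those bases."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(X)
    Qy, _ = np.linalg.qr(Y)
    return np.linalg.svd(Qx.T @ Qy, compute_uv=False)[0]

rng = np.random.default_rng(3)
Z = rng.standard_normal((500, 1))            # shared latent signal
X = np.hstack([Z + 0.1 * rng.standard_normal((500, 1)),
               rng.standard_normal((500, 2))])
Y = np.hstack([Z + 0.1 * rng.standard_normal((500, 1)),
               rng.standard_normal((500, 2))])
rho = cca_first_correlation(X, Y)
```

Because the first columns of `X` and `Y` share the latent signal `Z` with small noise, the leading canonical correlation comes out close to 1; GCCA generalizes this pairwise construction to more than two data sets.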
no code implementations • 5 Dec 2012 • Minh Tang, Daniel L. Sussman, Carey E. Priebe
In this work we show that, using the eigen-decomposition of the adjacency matrix, we can consistently estimate feature maps for latent position graphs with positive definite link function $\kappa$, provided that the latent positions are i.i.d.
no code implementations • 15 Nov 2012 • Carey E. Priebe, Daniel L. Sussman, Minh Tang, Joshua T. Vogelstein
Thus we errorfully observe $G$ when we observe the graph $\widetilde{G} = (V,\widetilde{E})$ as the edges in $\widetilde{E}$ arise from the classifications of the "edge-features", and are expected to be errorful.
no code implementations • 26 May 2012 • Nam H. Lee, Jordan Yoder, Minh Tang, Carey E. Priebe
Each of the message-exchanging actors is modeled as a process in a latent space.