Search Results for author: Carey E. Priebe

Found 82 papers, 28 papers with code

A Statistical Turing Test for Generative Models

no code implementations • 16 Sep 2023 • Hayden Helm, Carey E. Priebe, Weiwei Yang

Implicit in these efforts is an assumption that the generation properties of a human are different from that of the machine.

Gotta match 'em all: Solution diversification in graph matching matched filters

no code implementations • 25 Aug 2023 • Zhirui Li, Ben Johnson, Daniel L. Sussman, Carey E. Priebe, Vince Lyzinski

We present a novel approach for finding multiple noisily embedded template graphs in a very large background graph.

Graph Matching

Comparing Foundation Models using Data Kernels

no code implementations • 9 May 2023 • Brandon Duderstadt, Hayden S. Helm, Carey E. Priebe

Further, we demonstrate how our methodology can be extended to facilitate population level model comparison.

Benchmarking Self-Supervised Learning

Semisupervised regression in latent structure networks on unknown manifolds

no code implementations • 4 May 2023 • Aranyak Acharyya, Joshua Agterberg, Michael W. Trosset, Youngser Park, Carey E. Priebe

We assume that the latent position vectors lie on an unknown one-dimensional curve and are coupled with a response covariate via a regression model.

Graph Embedding regression

Discovering Communication Pattern Shifts in Large-Scale Networks using Encoder Embedding and Vertex Dynamics

1 code implementation • 3 May 2023 • Cencheng Shen, Jonathan Larson, Ha Trinh, Xihan Qin, Youngser Park, Carey E. Priebe

The analysis of large-scale time-series network data, such as social media and email communications, remains a significant challenge for graph analysis methodology.

Time Series

Synergistic Graph Fusion via Encoder Embedding

1 code implementation • 31 Mar 2023 • Cencheng Shen, Carey E. Priebe, Jonathan Larson, Ha Trinh

In this paper, we introduce a novel approach to multi-graph embedding called graph fusion encoder embedding.

Classification Graph Embedding +1

Graph Encoder Ensemble for Simultaneous Vertex Embedding and Community Detection

1 code implementation • 18 Jan 2023 • Cencheng Shen, Youngser Park, Carey E. Priebe

In this paper we propose a novel and computationally efficient method to simultaneously achieve vertex embedding, community detection, and community size determination.

Community Detection

Deep Learning is Provably Robust to Symmetric Label Noise

no code implementations • 26 Oct 2022 • Carey E. Priebe, Ningyuan Huang, Soledad Villar, Cong Mu, Li Chen

We conjecture that for general label noise, mitigation strategies that make use of the noisy data will outperform those that ignore the noisy data.


Dynamic Network Sampling for Community Detection

no code implementations • 29 Aug 2022 • Cong Mu, Youngser Park, Carey E. Priebe

We propose a dynamic network sampling scheme to optimize block recovery for stochastic blockmodel (SBM) in the case where it is prohibitively expensive to observe the entire graph.

Community Detection

The Value of Out-of-Distribution Data

1 code implementation • 23 Aug 2022 • Ashwin De Silva, Rahul Ramesh, Carey E. Priebe, Pratik Chaudhari, Joshua T. Vogelstein

In this work, we show a counter-intuitive phenomenon: the generalization error of a task can be a non-monotonic function of the number of OOD samples.

Data Augmentation Hyperparameter Optimization

Deep Learning with Label Noise: A Hierarchical Approach

no code implementations • 28 May 2022 • Li Chen, Ningyuan Huang, Cong Mu, Hayden S. Helm, Kate Lytvynets, Weiwei Yang, Carey E. Priebe

Our hierarchical approach improves upon regular deep neural networks in learning with label noise.


ART-SS: An Adaptive Rejection Technique for Semi-Supervised restoration for adverse weather-affected images

1 code implementation • 17 Mar 2022 • Rajeev Yasarla, Carey E. Priebe, Vishal Patel

Although various weather degradation synthesis methods exist in the literature, the use of synthetically generated weather degraded images often results in sub-optimal performance on the real weather degraded images due to the domain gap between synthetic and real-world images.

Rain Removal

Mental State Classification Using Multi-graph Features

no code implementations • 25 Feb 2022 • Guodong Chen, Hayden S. Helm, Kate Lytvynets, Weiwei Yang, Carey E. Priebe

We consider the problem of extracting features from passive, multi-channel electroencephalogram (EEG) devices for downstream inference tasks related to high-level mental states such as stress and cognitive load.

Classification EEG +3

Graph Matching via Optimal Transport

1 code implementation • 9 Nov 2021 • Ali Saad-Eldin, Benjamin D. Pedigo, Carey E. Priebe, Joshua T. Vogelstein

The graph matching problem seeks to find an alignment between the nodes of two graphs that minimizes the number of adjacency disagreements.

Graph Matching

Towards a theory of out-of-distribution learning

no code implementations • 29 Sep 2021 • Ali Geisa, Ronak Mehta, Hayden S. Helm, Jayanta Dey, Eric Eaton, Jeffery Dick, Carey E. Priebe, Joshua T. Vogelstein

This assumption renders these theories inadequate for characterizing 21$^{st}$ century real world data problems, which are typically characterized by evaluation distributions that differ from the training data distributions (referred to as out-of-distribution learning).

Learning Theory

One-Hot Graph Encoder Embedding

2 code implementations • 27 Sep 2021 • Cencheng Shen, Qizhe Wang, Carey E. Priebe

In this paper we propose a lightning fast graph embedding method called one-hot graph encoder embedding.

Clustering Graph Embedding +1

Inducing a hierarchy for multi-class classification problems

no code implementations • 20 Feb 2021 • Hayden S. Helm, Weiwei Yang, Sujeeth Bharadwaj, Kate Lytvynets, Oriana Riva, Christopher White, Ali Geisa, Carey E. Priebe

In applications where categorical labels follow a natural hierarchy, classification methods that exploit the label structure often outperform those that do not.

Classification Clustering +2

Subgraph nomination: Query by Example Subgraph Retrieval in Networks

no code implementations • 29 Jan 2021 • Al-Fahad M. Al-Qadhi, Carey E. Priebe, Hayden S. Helm, Vince Lyzinski

This paper introduces the subgraph nomination inference task, in which example subgraphs of interest are used to query a network for similarly interesting subgraphs.

Recommendation Systems Retrieval

A partition-based similarity for classification distributions

no code implementations • 12 Nov 2020 • Hayden S. Helm, Ronak D. Mehta, Brandon Duderstadt, Weiwei Yang, Christopher M. White, Ali Geisa, Joshua T. Vogelstein, Carey E. Priebe

Herein we define a measure of similarity between classification distributions that is both principled from the perspective of statistical pattern recognition and useful from the perspective of machine learning practitioners.

Classification General Classification +2

Multiple Network Embedding for Anomaly Detection in Time Series of Graphs

1 code implementation • 23 Aug 2020 • Guodong Chen, Jesús Arroyo, Avanti Athreya, Joshua Cape, Joshua T. Vogelstein, Youngser Park, Chris White, Jonathan Larson, Weiwei Yang, Carey E. Priebe

We examine two related, complementary inference tasks: the detection of anomalous graphs within a time series, and the detection of temporally anomalous vertices.


On spectral algorithms for community detection in stochastic blockmodel graphs with vertex covariates

1 code implementation • 4 Jul 2020 • Cong Mu, Angelo Mele, Lingxin Hao, Joshua Cape, Avanti Athreya, Carey E. Priebe

In network inference applications, it is often desirable to detect community structure, namely to cluster vertices into groups, or blocks, according to some measure of similarity.

Clustering Community Detection

Vertex Nomination in Richly Attributed Networks

no code implementations • 29 Apr 2020 • Keith Levin, Carey E. Priebe, Vince Lyzinski

In this paper, we explore, both theoretically and practically, the dual roles of content (i.e., edge and vertex attributes) and context (i.e., network topology) in vertex nomination.

Information Retrieval Retrieval

Omnidirectional Transfer for Quasilinear Lifelong Learning

1 code implementation • 27 Apr 2020 • Joshua T. Vogelstein, Jayanta Dey, Hayden S. Helm, Will LeVine, Ronak D. Mehta, Ali Geisa, Haoyin Xu, Gido M. van de Ven, Emily Chang, Chenyu Gao, Weiwei Yang, Bryan Tower, Jonathan Larson, Christopher M. White, Carey E. Priebe

But striving to avoid forgetting sets the goal unnecessarily low: the goal of lifelong learning, whether biological or artificial, should be to improve performance on all tasks (including past and future) with any new data.

Federated Learning Transfer Learning

Learning 1-Dimensional Submanifolds for Subsequent Inference on Random Dot Product Graphs

no code implementations • 15 Apr 2020 • Michael W. Trosset, Mingyue Gao, Minh Tang, Carey E. Priebe

We submit that techniques for manifold learning can be used to learn the unknown submanifold well enough to realize benefit from restricted inference.

On Two Distinct Sources of Nonidentifiability in Latent Position Random Graph Models

no code implementations • 31 Mar 2020 • Joshua Agterberg, Minh Tang, Carey E. Priebe

Two separate and distinct sources of nonidentifiability arise naturally in the context of latent position random graph models, though neither is unique to this setting.

Graph matching between bipartite and unipartite networks: to collapse, or not to collapse, that is the question

1 code implementation • 5 Feb 2020 • Jesús Arroyo, Carey E. Priebe, Vince Lyzinski

Graph matching consists of aligning the vertices of two unlabeled graphs in order to maximize the shared structure across networks; when the graphs are unipartite, this is commonly formulated as minimizing their edge disagreements.

Graph Matching

LqRT: Robust Hypothesis Testing of Location Parameters using Lq-Likelihood-Ratio-Type Test in Python

1 code implementation • 27 Nov 2019 • Anton Alyakin, Yichen Qin, Carey E. Priebe

To the extent that the robustness of the Wilcoxon test (minimum asymptotic relative efficiency (ARE) of the Wilcoxon test vs the t-test is 0.864) suggests that the Wilcoxon test should be the default test of choice (rather than "use Wilcoxon if there is evidence of non-normality", the default position should be "use Wilcoxon unless there is good reason to believe the normality assumption"), the results in this article suggest that the LqRT is potentially the new default go-to test for practitioners.


Nonpar MANOVA via Independence Testing

no code implementations • 20 Oct 2019 • Sambit Panda, Cencheng Shen, Ronan Perry, Jelle Zorn, Antoine Lutz, Carey E. Priebe, Joshua T. Vogelstein

The $k$-sample testing problem tests whether or not $k$ groups of data points are sampled from the same distribution.

Two-sample testing

Limit theorems for out-of-sample extensions of the adjacency and Laplacian spectral embeddings

no code implementations • 29 Sep 2019 • Keith Levin, Fred Roosta, Minh Tang, Michael W. Mahoney, Carey E. Priebe

In both cases, we prove that when the underlying graph is generated according to a latent space model called the random dot product graph, which includes the popular stochastic block model as a special case, an out-of-sample extension based on a least-squares objective obeys a central limit theorem about the true latent position of the out-of-sample vertex.

Dimensionality Reduction Graph Embedding +1

Spectral inference for large Stochastic Blockmodels with nodal covariates

no code implementations • 18 Aug 2019 • Angelo Mele, Lingxin Hao, Joshua Cape, Carey E. Priebe

In many applications of network analysis, it is important to distinguish between observed and unobserved factors affecting network structure.

Graphyti: A Semi-External Memory Graph Library for FlashGraph

no code implementations • 7 Jul 2019 • Disa Mhembere, Da Zheng, Carey E. Priebe, Joshua T. Vogelstein, Randal Burns

Emerging frameworks avoid the network bottleneck of distributed data with Semi-External Memory (SEM) that uses a single multicore node and operates on graphs larger than memory.

Distributed, Parallel, and Cluster Computing Databases

Geodesic Learning via Unsupervised Decision Forests

no code implementations • 5 Jul 2019 • Meghana Madhyastha, Percy Li, James Browne, Veronika Strnadova-Neeley, Carey E. Priebe, Randal Burns, Joshua T. Vogelstein

Empirical results on simulated and real data demonstrate that URerF is robust to high-dimensional noise, whereas other methods, such as Isomap, UMAP, and FLANN, quickly deteriorate in such settings.

Vertex Nomination, Consistent Estimation, and Adversarial Modification

no code implementations • 6 May 2019 • Joshua Agterberg, Youngser Park, Jonathan Larson, Christopher White, Carey E. Priebe, Vince Lyzinski

Given a pair of graphs $G_1$ and $G_2$ and a vertex set of interest in $G_1$, the vertex nomination (VN) problem seeks to find the corresponding vertices of interest in $G_2$ (if they exist) and produce a rank list of the vertices in $G_2$, with the corresponding vertices of interest in $G_2$ concentrating, ideally, at the top of the rank list.

Graph Embedding

Maximum Likelihood Estimation and Graph Matching in Errorfully Observed Networks

no code implementations • 26 Dec 2018 • Jesús Arroyo, Daniel L. Sussman, Carey E. Priebe, Vince Lyzinski

Given a pair of graphs with the same number of vertices, the inexact graph matching problem consists in finding a correspondence between the vertices of these graphs that minimizes the total number of induced edge disagreements.

Graph Matching

Matched Filters for Noisy Induced Subgraph Detection

no code implementations • 6 Mar 2018 • Daniel L. Sussman, Youngser Park, Carey E. Priebe, Vince Lyzinski

To illustrate the possibilities and challenges of such problems, we use an algorithm that can exploit a partially known correspondence and show via varied simulations and applications to {\it Drosophila} and human connectomes that this approach can achieve good performance.

Graph Matching

Out-of-sample extension of graph adjacency spectral embedding

no code implementations • ICML 2018 • Keith Levin, Farbod Roosta-Khorasani, Michael W. Mahoney, Carey E. Priebe

Many popular dimensionality reduction procedures have out-of-sample extensions, which allow a practitioner to apply a learned embedding to observations not seen in the initial training sample.

Dimensionality Reduction

On consistent vertex nomination schemes

no code implementations • 15 Nov 2017 • Vince Lyzinski, Keith Levin, Carey E. Priebe

Given a vertex of interest in a network $G_1$, the vertex nomination problem seeks to find the corresponding vertex of interest (if it exists) in a second network $G_2$.

Information Retrieval Retrieval

From Distance Correlation to Multiscale Graph Correlation

1 code implementation • 26 Oct 2017 • Cencheng Shen, Carey E. Priebe, Joshua T. Vogelstein

Understanding and developing a correlation measure that can detect general dependencies is not only imperative to statistics and machine learning, but also crucial to general scientific discovery in the big data age.

Statistical inference on random dot product graphs: a survey

no code implementations • 16 Sep 2017 • Avanti Athreya, Donniell E. Fishkind, Keith Levin, Vince Lyzinski, Youngser Park, Yichen Qin, Daniel L. Sussman, Minh Tang, Joshua T. Vogelstein, Carey E. Priebe

In this survey paper, we describe a comprehensive paradigm for statistical inference on random dot product graphs, a paradigm centered on spectral embeddings of adjacency and Laplacian matrices.

Community Detection

Joint Embedding of Graphs

2 code implementations • 10 Mar 2017 • Shangsi Wang, Jesús Arroyo, Joshua T. Vogelstein, Carey E. Priebe

Feature extraction and dimension reduction for networks is critical in a wide variety of domains.

Dimensionality Reduction

Statistical inference for network samples using subgraph counts

1 code implementation • 2 Jan 2017 • P-A. G. Maugis, Carey E. Priebe, S. C. Olhede, P. J. Wolfe

Our results yield joint confidence regions for subgraph counts, and therefore methods for testing whether the observations in a network sample are drawn from: a specified distribution, a specified model, or from the same model as another network sample.

Methodology Social and Information Networks 62G05, 05C80, 62G10

Discovering and Deciphering Relationships Across Disparate Data Modalities

4 code implementations • 16 Sep 2016 • Joshua T. Vogelstein, Eric Bridgeford, Qing Wang, Carey E. Priebe, Mauro Maggioni, Cencheng Shen

Understanding the relationships between different properties of data, such as whether a connectome or genome has information about disease status, is becoming increasingly important in modern biological datasets.

Connectome Smoothing via Low-rank Approximations

no code implementations • 6 Sep 2016 • Runze Tang, Michael Ketcha, Alexandra Badea, Evan D. Calabrese, Daniel S. Margulies, Joshua T. Vogelstein, Carey E. Priebe, Daniel L. Sussman

In statistical connectomics, the quantitative study of brain networks, estimating the mean of a population of graphs based on a sample is a core problem.

Limit theorems for eigenvectors of the normalized Laplacian for random graphs

no code implementations • 28 Jul 2016 • Minh Tang, Carey E. Priebe

As a corollary, we show that for stochastic blockmodel graphs, the rows of the spectral embedding of the normalized Laplacian converge to multivariate normals and furthermore the mean and the covariance matrix of each row are functions of the associated vertex's block membership.

On the Consistency of the Likelihood Maximization Vertex Nomination Scheme: Bridging the Gap Between Maximum Likelihood Estimation and Graph Matching

no code implementations • 5 Jul 2016 • Vince Lyzinski, Keith Levin, Donniell E. Fishkind, Carey E. Priebe

Given a graph in which a few vertices are deemed interesting a priori, the vertex nomination task is to order the remaining vertices into a nomination list such that there is a concentration of interesting vertices at the top of the list.

Graph Matching Stochastic Block Model

knor: A NUMA-Optimized In-Memory, Distributed and Semi-External-Memory k-means Library

1 code implementation • 28 Jun 2016 • Disa Mhembere, Da Zheng, Carey E. Priebe, Joshua T. Vogelstein, Randal Burns

The \textit{k-means NUMA Optimized Routine} (\textsf{knor}) library has (i) in-memory (\textsf{knori}), (ii) distributed memory (\textsf{knord}), and (iii) semi-external memory (\textsf{knors}) modules that radically improve the performance of k-means for varying memory and hardware budgets.

Distributed, Parallel, and Cluster Computing

FlashR: R-Programmed Parallel and Scalable Machine Learning using SSDs

2 code implementations • 21 Apr 2016 • Da Zheng, Disa Mhembere, Joshua T. Vogelstein, Carey E. Priebe, Randal Burns

R is one of the most popular programming languages for statistics and machine learning, but the R framework is relatively slow and unable to scale to large datasets.

Distributed, Parallel, and Cluster Computing

Semi-External Memory Sparse Matrix Multiplication for Billion-Node Graphs

2 code implementations • 9 Feb 2016 • Da Zheng, Disa Mhembere, Vince Lyzinski, Joshua Vogelstein, Carey E. Priebe, Randal Burns

In contrast, we scale sparse matrix multiplication beyond memory capacity by implementing sparse matrix dense matrix multiplication (SpMM) in a semi-external memory (SEM) fashion; i.e., we keep the sparse matrix on commodity SSDs and dense matrices in memory.

Distributed, Parallel, and Cluster Computing

Semi-supervised K-means++

no code implementations • 1 Feb 2016 • Jordan Yoder, Carey E. Priebe

Traditionally, practitioners initialize the {\tt k-means} algorithm with centers chosen uniformly at random.


Sparse Projection Oblique Randomer Forests

2 code implementations • 10 Jun 2015 • Tyler M. Tomita, James Browne, Cencheng Shen, Jaewon Chung, Jesse L. Patsolic, Benjamin Falk, Jason Yim, Carey E. Priebe, Randal Burns, Mauro Maggioni, Joshua T. Vogelstein

Unfortunately, these extensions forfeit one or more of the favorable properties of decision forests based on axis-aligned splits, such as robustness to many noise dimensions, interpretability, or computational efficiency.

Fast Embedding for JOFC Using the Raw Stress Criterion

no code implementations • 11 Feb 2015 • Vince Lyzinski, Youngser Park, Carey E. Priebe, Michael W. Trosset

The Joint Optimization of Fidelity and Commensurability (JOFC) manifold matching methodology embeds an omnibus dissimilarity matrix consisting of multiple dissimilarities on the same set of objects.

Sparse Representation Classification Beyond L1 Minimization and the Subspace Assumption

no code implementations • 4 Feb 2015 • Cencheng Shen, Li Chen, Yuexiao Dong, Carey E. Priebe

The results are demonstrated via simulations and real data experiments, where the new algorithm achieves comparable numerical performance and is significantly faster.

Classification Classification Consistency +1

Manifold Matching using Shortest-Path Distance and Joint Neighborhood Selection

1 code implementation • 12 Dec 2014 • Cencheng Shen, Joshua T. Vogelstein, Carey E. Priebe

Then the shortest-path distance within each modality is calculated from the joint neighborhood graph, followed by embedding into and matching in a common low-dimensional Euclidean space.

Empirical Bayes Estimation for the Stochastic Blockmodel

no code implementations • 23 May 2014 • Shakira Suwan, Dominic S. Lee, Runze Tang, Daniel L. Sussman, Minh Tang, Carey E. Priebe

Inference for the stochastic blockmodel is currently of burgeoning interest in the statistical community, as well as in various application domains as diverse as social networks, citation networks, brain connectivity networks (connectomics), etc.

Graph Matching: Relax at Your Own Risk

no code implementations • 13 May 2014 • Vince Lyzinski, Donniell Fishkind, Marcelo Fiori, Joshua T. Vogelstein, Carey E. Priebe, Guillermo Sapiro

Indeed, experimental results illuminate and corroborate these theoretical findings, demonstrating that excellent results are achieved in both benchmark and real data problems by amalgamating the two approaches.

Graph Matching

A central limit theorem for scaled eigenvectors of random dot product graphs

no code implementations • 31 May 2013 • Avanti Athreya, Vince Lyzinski, David J. Marchette, Carey E. Priebe, Daniel L. Sussman, Minh Tang

We prove a central limit theorem for the components of the largest eigenvectors of the adjacency matrix of a finite-dimensional random dot product graph whose true latent positions are unknown.

Out-of-sample Extension for Latent Position Graphs

no code implementations • 21 May 2013 • Minh Tang, Youngser Park, Carey E. Priebe

We show that, under the latent position graph model and for sufficiently large $n$, the mapping of the out-of-sample vertices is close to its true latent position.

General Classification Graph Embedding

Generalized Canonical Correlation Analysis for Classification

no code implementations • 30 Apr 2013 • Cencheng Shen, Ming Sun, Minh Tang, Carey E. Priebe

For multiple multivariate data sets, we derive conditions under which Generalized Canonical Correlation Analysis (GCCA) improves classification performance of the projected datasets, compared to standard Canonical Correlation Analysis (CCA) using only two data sets.

Classification General Classification

On the Incommensurability Phenomenon

no code implementations • 9 Jan 2013 • Donniell E. Fishkind, Cencheng Shen, Youngser Park, Carey E. Priebe

Suppose that two large, multi-dimensional data sets are each noisy measurements of the same underlying random process, and principal components analysis is performed separately on the data sets to reduce their dimensionality.

Universally consistent vertex classification for latent positions graphs

no code implementations • 5 Dec 2012 • Minh Tang, Daniel L. Sussman, Carey E. Priebe

In this work we show that, using the eigen-decomposition of the adjacency matrix, we can consistently estimate feature maps for latent position graphs with positive definite link function $\kappa$, provided that the latent positions are i.i.d.

Classification General Classification

Statistical inference on errorfully observed graphs

no code implementations • 15 Nov 2012 • Carey E. Priebe, Daniel L. Sussman, Minh Tang, Joshua T. Vogelstein

Thus we errorfully observe $G$ when we observe the graph $\widetilde{G} = (V,\widetilde{E})$ as the edges in $\widetilde{E}$ arise from the classifications of the "edge-features", and are expected to be errorful.

Seeded Graph Matching

no code implementations • 3 Sep 2012 • Donniell E. Fishkind, Sancar Adali, Heather G. Patsolic, Lingyao Meng, Digvijay Singh, Vince Lyzinski, Carey E. Priebe

Given two graphs, the graph matching problem is to align the two vertex sets so as to minimize the number of adjacency disagreements between the two graphs.

Graph Matching

On latent position inference from doubly stochastic messaging activities

no code implementations • 26 May 2012 • Nam H. Lee, Jordan Yoder, Minh Tang, Carey E. Priebe

Each of the message-exchanging actors is modeled as a process in a latent space.
