Search Results for author: Cencheng Shen

Found 24 papers, 13 papers with code

Edge-Parallel Graph Encoder Embedding

1 code implementation • 6 Feb 2024 • Ariel Lubonja, Cencheng Shen, Carey Priebe, Randal Burns

New algorithms for embedding graphs have reduced the asymptotic complexity of finding low-dimensional representations.

Learning sources of variability from high-dimensional observational studies

1 code implementation • 26 Jul 2023 • Eric W. Bridgeford, Jaewon Chung, Brian Gilbert, Sambit Panda, Adam Li, Cencheng Shen, Alexandra Badea, Brian Caffo, Joshua T. Vogelstein

Causal inference studies whether the presence of a variable influences an observed outcome.

Causal Inference

Discovering Communication Pattern Shifts in Large-Scale Labeled Networks using Encoder Embedding and Vertex Dynamics

1 code implementation • 3 May 2023 • Cencheng Shen, Jonathan Larson, Ha Trinh, Xihan Qin, Youngser Park, Carey E. Priebe

Analyzing large-scale time-series network data, such as social media and email communications, poses a significant challenge in understanding social dynamics, detecting anomalies, and predicting trends.

Time Series

Synergistic Graph Fusion via Encoder Embedding

1 code implementation • 31 Mar 2023 • Cencheng Shen, Carey E. Priebe, Jonathan Larson, Ha Trinh

In this paper, we introduce a novel approach called graph fusion embedding, designed for multi-graph embedding with shared vertex sets.

Classification Graph Embedding +1

Graph Encoder Ensemble for Simultaneous Vertex Embedding and Community Detection

1 code implementation • 18 Jan 2023 • Cencheng Shen, Youngser Park, Carey E. Priebe

In this paper, we introduce a novel and computationally efficient method for vertex embedding, community detection, and community size determination.

Community Detection

One-Hot Graph Encoder Embedding

3 code implementations • 27 Sep 2021 • Cencheng Shen, Qizhe Wang, Carey E. Priebe

In this paper, we propose a lightning-fast graph embedding method called one-hot graph encoder embedding.

Clustering Graph Embedding +1

A Simple Spectral Failure Mode for Graph Convolutional Networks

no code implementations • 25 Oct 2020 • Carey E. Priebe, Cencheng Shen, Ningyuan Huang, Tianyi Chen

Neural networks have achieved remarkable successes in machine learning tasks.

Graph Embedding Graph Learning

High-Dimensional Independence Testing via Maximum and Average Distance Correlations

no code implementations • 4 Jan 2020 • Cencheng Shen, Yuexiao Dong

This paper introduces and investigates the utilization of maximum and average distance correlations for multivariate independence testing.

The Chi-Square Test of Distance Correlation

1 code implementation • 27 Dec 2019 • Cencheng Shen, Sambit Panda, Joshua T. Vogelstein

One major bottleneck is the testing process: because the null distribution of distance correlation depends on the underlying random variables and metric choice, it typically requires a permutation test to estimate the null and compute the p-value, which is very costly for large amounts of data.
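
The permutation procedure mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the paper's method or any library's API: it assumes 1-D samples, uses the biased sample version of squared distance correlation, and the function names (centered_distances, dcorr_sq, permutation_test) are hypothetical. Each replicate costs O(n^2) work, which suggests why the test becomes expensive as the sample size grows.

```python
import math
import random


def centered_distances(values):
    """Pairwise absolute distances of 1-D data, double-centered."""
    n = len(values)
    d = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in d]
    grand_mean = sum(row_means) / n
    # The distance matrix is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def dcorr_sq(a, b, order=None):
    """Biased squared distance correlation from two centered matrices.

    `order` (assumed helper argument) applies the same permutation to the
    rows and columns of `b`, which is equivalent to shuffling the y sample.
    """
    n = len(a)
    idx = list(range(n)) if order is None else order
    cov = sum(a[i][j] * b[idx[i]][idx[j]] for i in range(n) for j in range(n))
    var_a = sum(v * v for row in a for v in row)
    var_b = sum(v * v for row in b for v in row)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom > 0 else 0.0


def permutation_test(x, y, reps=200, seed=0):
    """Return (statistic, p-value) using `reps` random permutations of y."""
    rng = random.Random(seed)
    a, b = centered_distances(x), centered_distances(y)
    observed = dcorr_sq(a, b)
    idx = list(range(len(x)))
    exceed = 0
    for _ in range(reps):
        rng.shuffle(idx)
        if dcorr_sq(a, b, idx) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (1 + exceed) / (1 + reps)
```

Calling permutation_test(x, y) with two equal-length lists of floats returns the observed statistic and the estimated p-value; a larger reps gives a finer p-value at proportionally higher cost.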

High-dimensional and universally consistent k-sample tests

no code implementations • 20 Oct 2019 • Sambit Panda, Cencheng Shen, Ronan Perry, Jelle Zorn, Antoine Lutz, Carey E. Priebe, Joshua T. Vogelstein

The evaluation included several popular independence statistics and covered a comprehensive set of simulations.

Two-sample testing

Independence Testing for Temporal Data

no code implementations • 18 Aug 2019 • Cencheng Shen, Jaewon Chung, Ronak Mehta, Ting Xu, Joshua T. Vogelstein

While many non-parametric and universally consistent dependence measures have recently been proposed, directly applying them to temporal data can inflate the p-value and result in an invalid test.

Time Series Time Series Analysis +1

hyppo: A Multivariate Hypothesis Testing Python Package

4 code implementations • 3 Jul 2019 • Sambit Panda, Satish Palaniappan, Junhao Xiong, Eric W. Bridgeford, Ronak Mehta, Cencheng Shen, Joshua T. Vogelstein

We introduce hyppo, a unified library for performing multivariate hypothesis testing, including independence, two-sample, and k-sample testing.

Two-sample testing

Random Forests for Adaptive Nearest Neighbor Estimation of Information-Theoretic Quantities

1 code implementation • 30 Jun 2019 • Ronan Perry, Ronak Mehta, Richard Guo, Eva Yezerets, Jesús Arroyo, Mike Powell, Hayden Helm, Cencheng Shen, Joshua T. Vogelstein

Information-theoretic quantities, such as conditional entropy and mutual information, are critical data summaries for quantifying uncertainty.

Sparse Representation Classification via Screening for Graphs

no code implementations • 4 Jun 2019 • Cencheng Shen, Li Chen, Yuexiao Dong, Carey Priebe

The sparse representation classifier (SRC) is shown to work well for image recognition problems that satisfy a subspace assumption.

Classification Classification Consistency +1

Learning Interpretable Characteristic Kernels via Decision Forests

no code implementations • 30 Nov 2018 • Sambit Panda, Cencheng Shen, Joshua T. Vogelstein

Decision forests are widely used for classification and regression tasks.

Feature Importance General Classification

The Exact Equivalence of Distance and Kernel Methods for Hypothesis Testing

no code implementations • 14 Jun 2018 • Cencheng Shen, Joshua T. Vogelstein

Distance-based tests, also called "energy statistics", are leading methods for two-sample and independence tests from the statistics community.

Two-sample testing

From Distance Correlation to Multiscale Graph Correlation

1 code implementation • 26 Oct 2017 • Cencheng Shen, Carey E. Priebe, Joshua T. Vogelstein

Understanding and developing a correlation measure that can detect general dependencies is not only imperative to statistics and machine learning, but also crucial to general scientific discovery in the big data age.

Discovering and Deciphering Relationships Across Disparate Data Modalities

4 code implementations • 16 Sep 2016 • Joshua T. Vogelstein, Eric Bridgeford, Qing Wang, Carey E. Priebe, Mauro Maggioni, Cencheng Shen

Understanding the relationships between different properties of data, such as whether a connectome or genome has information about disease status, is becoming increasingly important in modern biological datasets.

Computational Efficiency

Sparse Projection Oblique Randomer Forests

2 code implementations • 10 Jun 2015 • Tyler M. Tomita, James Browne, Cencheng Shen, Jaewon Chung, Jesse L. Patsolic, Benjamin Falk, Jason Yim, Carey E. Priebe, Randal Burns, Mauro Maggioni, Joshua T. Vogelstein

Unfortunately, these extensions forfeit one or more of the favorable properties of decision forests based on axis-aligned splits, such as robustness to many noise dimensions, interpretability, or computational efficiency.

Computational Efficiency

Sparse Representation Classification Beyond L1 Minimization and the Subspace Assumption

no code implementations • 4 Feb 2015 • Cencheng Shen, Li Chen, Yuexiao Dong, Carey E. Priebe

The results are demonstrated via simulations and real data experiments, where the new algorithm achieves comparable numerical performance and is significantly faster.

Classification Classification Consistency +1

Manifold Matching using Shortest-Path Distance and Joint Neighborhood Selection

1 code implementation • 12 Dec 2014 • Cencheng Shen, Joshua T. Vogelstein, Carey E. Priebe

Then the shortest-path distance within each modality is calculated from the joint neighborhood graph, followed by embedding into and matching in a common low-dimensional Euclidean space.

Robust Vertex Classification

no code implementations • 23 Nov 2013 • Li Chen, Cencheng Shen, Joshua Vogelstein, Carey Priebe

For random graphs distributed according to stochastic blockmodels, a special case of latent position graphs, adjacency spectral embedding followed by appropriate vertex classification is asymptotically Bayes optimal; but this approach requires knowledge of and critically depends on the model dimension.

Classification General Classification +1

Generalized Canonical Correlation Analysis for Classification

no code implementations • 30 Apr 2013 • Cencheng Shen, Ming Sun, Minh Tang, Carey E. Priebe

For multiple multivariate data sets, we derive conditions under which Generalized Canonical Correlation Analysis (GCCA) improves classification performance of the projected datasets, compared to standard Canonical Correlation Analysis (CCA) using only two data sets.

Classification General Classification

On the Incommensurability Phenomenon

no code implementations • 9 Jan 2013 • Donniell E. Fishkind, Cencheng Shen, Youngser Park, Carey E. Priebe

Suppose that two large, multi-dimensional data sets are each noisy measurements of the same underlying random process, and principal components analysis is performed separately on the data sets to reduce their dimensionality.
