Tree-SNE: Hierarchical Clustering and Visualization Using t-SNE

13 Feb 2020  ·  Isaac Robinson, Emma Pierce-Hoffman

t-SNE and hierarchical clustering are popular methods of exploratory data analysis, particularly in biology. Building on recent advances in speeding up t-SNE and obtaining finer-grained structure, we combine the two to create tree-SNE, a hierarchical clustering and visualization algorithm based on stacked one-dimensional t-SNE embeddings. We also introduce alpha-clustering, which recommends the optimal cluster assignment, without foreknowledge of the number of clusters, based on cluster stability across multiple scales. We demonstrate the effectiveness of tree-SNE and alpha-clustering on images of handwritten digits, mass cytometry (CyTOF) data from blood cells, and single-cell RNA-sequencing (scRNA-seq) data from retinal cells. Furthermore, to demonstrate the validity of the visualization, we use alpha-clustering to obtain unsupervised clustering results competitive with the state of the art on several image data sets. Software is available at https://github.com/isaacrob/treesne.
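
The abstract does not spell out how cluster stability is measured. One simple reading, offered here as an assumption rather than as the paper's procedure, is to count the clusters found at each level of the stacked embeddings and recommend the count that persists over the longest run of consecutive levels. The function name `most_stable_cluster_count` and the example counts are hypothetical and serve only as illustration.

```python
from itertools import groupby


def most_stable_cluster_count(counts_per_level):
    """Return (cluster_count, run_length) for the longest run of equal counts.

    counts_per_level: number of clusters found at each level, ordered from
    the coarsest level to the finest. Ties go to the earliest run.
    Assumption: stability is the length of a run of consecutive levels.
    """
    if not counts_per_level:
        raise ValueError("counts_per_level must not be empty")

    best_count, best_length = None, 0
    # groupby yields one group per run of identical consecutive values.
    for count, run in groupby(counts_per_level):
        length = sum(1 for _ in run)
        if length > best_length:
            best_count, best_length = count, length
    return best_count, best_length


# Hypothetical cluster counts for ten levels of a hierarchy.
levels = [1, 2, 2, 4, 4, 4, 4, 7, 9, 15]
print(most_stable_cluster_count(levels))  # (4, 4)
```

A real implementation would probably measure the width of the parameter range covered by each run instead of the number of levels, and would need a rule for the trivial single-cluster level.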

Results from the Paper


Task              Dataset     Model     Metric  Value  Global Rank
Image Clustering  COIL-100    Tree-SNE  NMI     0.926  #7
Image Clustering  COIL-20     Tree-SNE  NMI     0.958  #2
Image Clustering  MNIST-full  Tree-SNE  NMI     0.864  #16
Image Clustering  USPS        Tree-SNE  NMI     0.885  #12
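
The metric in these results, NMI (normalized mutual information), scores the agreement between a predicted clustering and the true labels on a scale from 0 to 1 and does not depend on how the clusters are numbered. The sketch below is a minimal standard-library version. It assumes normalization by the arithmetic mean of the two label entropies, which is one common convention. The page does not say which normalization the reported values use, so numbers from this sketch may not be directly comparable.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two equal-length labelings."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def nmi(truth, predicted):
    """NMI normalized by the arithmetic mean of the two entropies (assumed)."""
    if len(truth) != len(predicted):
        raise ValueError("label sequences must have the same length")
    h_truth, h_pred = entropy(truth), entropy(predicted)
    if h_truth == 0.0 and h_pred == 0.0:
        return 1.0  # both labelings put everything in a single cluster
    return 2.0 * mutual_information(truth, predicted) / (h_truth + h_pred)


# The same partition with swapped cluster ids scores 1 (up to rounding).
print(nmi([0, 0, 1, 1], [1, 1, 0, 0]))
# Statistically independent labelings score 0.
print(nmi([0, 0, 1, 1], [0, 1, 0, 1]))
```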
