Git: Clustering Based on Graph of Intensity Topology

4 Oct 2021 · Zhangyang Gao, Haitao Lin, Cheng Tan, Lirong Wu, Stan Z. Li

Accuracy, Robustness to noise and scale, Interpretability, Speed, and Ease of use (ARISE) are crucial requirements of a good clustering algorithm. However, achieving these goals simultaneously is challenging, and most advanced approaches address only some of them. To consider all of these aspects together, we propose a novel clustering algorithm, GIT (Clustering Based on Graph of Intensity Topology). GIT considers both local and global data structures: it first forms local clusters based on the intensity peaks of samples, and then estimates the global topological graph (topo-graph) between these local clusters. We use the Wasserstein distance between the predicted and prior class proportions to automatically cut noisy edges in the topo-graph, and merge the connected local clusters into final clusters. We compare GIT with seven competing algorithms on five synthetic datasets and nine real-world datasets. With fast local cluster detection, robust topo-graph construction, and accurate edge-cutting, GIT shows attractive ARISE performance and significantly exceeds other non-convex clustering methods. For example, GIT outperforms its counterparts by about 10% (F1-score) on MNIST and FashionMNIST. Code is available at https://github.com/gaozhangyang/GIT.
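The edge-cutting step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the threshold sweep over edge weights, and the choice to compare sorted proportion vectors through their cumulative sums (a 1-D Wasserstein-style distance) are all assumptions made for illustration. The input is assumed to be a list of local cluster sizes plus weighted edges between local clusters.

```python
from itertools import accumulate


def proportion_distance(pred, prior):
    """Assumed distance: L1 gap between cumulative sums of the two
    proportion vectors, each sorted in descending order and zero-padded."""
    n = max(len(pred), len(prior))
    p = sorted(pred, reverse=True) + [0.0] * (n - len(pred))
    q = sorted(prior, reverse=True) + [0.0] * (n - len(prior))
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def merge_components(sizes, edges, threshold):
    """Merge local clusters joined by an edge of weight >= threshold
    (union-find), returning one label per local cluster and merged sizes."""
    parent = list(range(len(sizes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for u, v, w in edges:
        if w >= threshold:
            parent[find(u)] = find(v)

    labels = [find(i) for i in range(len(sizes))]
    totals = {}
    for label, size in zip(labels, sizes):
        totals[label] = totals.get(label, 0) + size
    return labels, list(totals.values())


def cut_edges(sizes, edges, prior):
    """Try every edge weight as a cut threshold (hypothetical search
    strategy) and keep the one whose merged proportions best match prior."""
    n = sum(sizes)
    best = None
    for t in sorted({w for _, _, w in edges} | {float("inf")}):
        labels, totals = merge_components(sizes, edges, t)
        d = proportion_distance([x / n for x in totals], prior)
        if best is None or d < best[0]:
            best = (d, t, labels)
    return best


# Four local clusters; the weak edge (1, 2) is the "noisy" one.
sizes = [40, 10, 30, 20]
edges = [(0, 1, 0.9), (2, 3, 0.8), (1, 2, 0.1)]
distance, threshold, labels = cut_edges(sizes, edges, prior=[0.5, 0.5])
print(distance, threshold, labels)
```

With a prior of two equally sized classes, the sweep in this toy example keeps the two strong edges and drops the weak one, so local clusters 0 and 1 end up in one final cluster and 2 and 3 in the other.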

Task: Clustering Algorithms Evaluation. Each cell gives the metric value followed by the global rank in parentheses.

Dataset: Fashion-MNIST
Model                  F1-score    ARI         NMI
AE+GIT                 65% (#1)    49% (#1)    61% (#1)
GIT                    56% (#2)    32% (#4)    51% (#2)
k-Means++              39% (#6)    35% (#2)    51% (#2)
Spectral Clustering    43% (#4)    34% (#3)    49% (#4)
QuickShiftPP           42% (#5)    16% (#6)    41% (#6)
SpectACI               47% (#3)    29% (#5)    45% (#5)

Dataset: MNIST
Model                  F1-score    ARI         NMI
AE+GIT                 88% (#1)    77% (#1)    81% (#1)
k-Means++              50% (#3)    36% (#3)    45% (#3)
GIT                    59% (#2)    42% (#2)    53% (#2)
Spectral Clustering    41% (#5)    33% (#4)    44% (#5)
QuickShiftPP           45% (#4)    13% (#6)    45% (#3)
SpectACI               40% (#6)    17% (#5)    33% (#6)

Dataset: Olivetti face
Model                  F1-score    ARI         NMI
k-Means++              52% (#3)    38% (#2)    74% (#3)
Spectral Clustering    37% (#4)    19% (#5)    66% (#4)
QuickShiftPP           60% (#2)    38% (#2)    79% (#1)
SpectACI               34% (#5)    21% (#4)    61% (#5)
GIT                    62% (#1)    45% (#1)    78% (#2)
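For readers unfamiliar with the ARI column, the following is a minimal sketch of the standard Adjusted Rand Index computed from a contingency table of true and predicted labels. It is a generic textbook formulation in pure Python, not the evaluation script used to produce the numbers above; the function name and the convention of returning 1.0 in degenerate cases are assumptions.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same samples."""
    n = len(labels_true)
    if n < 2:
        return 1.0  # assumed convention: no pairs to disagree on

    # Pair counts within contingency cells, true classes, and predicted clusters.
    index = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_true * sum_pred / comb(n, 2)  # chance-level agreement
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # assumed convention for degenerate labelings
    return (index - expected) / (max_index - expected)


# ARI ignores label names: a pure relabeling still scores a perfect match.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

Because ARI is corrected for chance, it can be negative when the predicted grouping agrees with the ground truth less often than random assignment would.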
