Nonlinear Markov Clustering by Minimum Curvilinear Sparse Similarity

27 Dec 2019  ·  C. Duran, A. Acevedo, S. Ciucci, A. Muscoloni, CV. Cannistraci ·

The development of algorithms for unsupervised pattern recognition by nonlinear clustering is a notable problem in data science. Markov clustering (MCL) is a renowned algorithm that simulates stochastic flows on a network of sample similarities to detect the structural organization of clusters in the data, but it has never been generalized to deal with data nonlinearity. Minimum Curvilinearity (MC) is a principle that approximates nonlinear sample distances in the high-dimensional feature space by curvilinear distances, which are computed as transversal paths over their minimum spanning tree, and then stored in a kernel. Here we propose MC-MCL, which is the first nonlinear kernel extension of MCL and exploits Minimum Curvilinearity to enhance the performance of MCL in real and synthetic data with underlying nonlinear patterns. MC-MCL is compared with baseline clustering methods, including DBSCAN, K-means and affinity propagation. We find that Minimum Curvilinearity provides a valuable framework to estimate nonlinear distances also when its kernel is applied in combination with MCL. Indeed, MC-MCL overcomes classical MCL and even baseline clustering algorithms in different nonlinear datasets.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here