ESDF: Ensemble Selection using Diversity and Frequency

18 Aug 2015  ·  Shouvick Mondal, Arko Banerjee ·

Recently ensemble selection for consensus clustering has emerged as a research problem in Machine Intelligence. Normally consensus clustering algorithms take into account the entire ensemble of clustering, where there is a tendency of generating a very large size ensemble before computing its consensus. One can avoid considering the entire ensemble and can judiciously select few partitions in the ensemble without compromising on the quality of the consensus. This may result in an efficient consensus computation technique and may save unnecessary computational overheads. The ensemble selection problem addresses this issue of consensus clustering. In this paper, we propose an efficient method of ensemble selection for a large ensemble. We prioritize the partitions in the ensemble based on diversity and frequency. Our method selects top K of the partitions in order of priority, where K is decided by the user. We observe that considering jointly the diversity and frequency helps in identifying few representative partitions whose consensus is qualitatively better than the consensus of the entire ensemble. Experimental analysis on a large number of datasets shows our method gives better results than earlier ensemble selection methods.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here