Powered Dirichlet Process for Controlling the Importance of "Rich-Get-Richer" Prior Assumptions in Bayesian Clustering

26 Apr 2021  ·  Gaël Poux-Médard, Julien Velcin, Sabine Loudcher ·

One of the most used priors in Bayesian clustering is the Dirichlet prior. It can be expressed as a Chinese Restaurant Process. This process allows nonparametric estimation of the number of clusters when partitioning datasets. Its key feature is the "rich-get-richer" property, which assumes a cluster has an a priori probability to get chosen linearly dependent on population. In this paper, we show that such prior is not always the best choice to model data. We derive the Powered Chinese Restaurant process from a modified version of the Dirichlet-Multinomial distribution to answer this problem. We then develop some of its fundamental properties (expected number of clusters, convergence). Unlike state-of-the-art efforts in this direction, this new formulation allows for direct control of the importance of the "rich-get-richer" prior.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here