A Mutual Contamination Analysis of Mixed Membership and Partial Label Models

19 Feb 2016 · Julian Katz-Samuels, Clayton Scott

Many machine learning problems can be characterized by mutual contamination models. In these problems, one observes several random samples from different convex combinations of a set of unknown base distributions. It is of interest to decontaminate mutual contamination models, i.e., to recover the base distributions either exactly or up to a permutation. This paper considers the general setting where the base distributions are defined on arbitrary probability spaces. We examine the decontamination problem in two mutual contamination models that describe popular machine learning tasks: recovering the base distributions up to a permutation in a mixed membership model, and recovering the base distributions exactly in a partial label model for classification. We give necessary and sufficient conditions for identifiability of both mutual contamination models, present algorithms for both problems in the infinite and finite sample cases, and introduce novel proof techniques based on affine geometry.
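To make the observation model concrete, the following minimal sketch simulates samples from contaminated distributions, each a convex combination of base distributions. It only illustrates the data-generating setup described in the abstract and is not the paper's decontamination algorithm. The function name `sample_contaminated`, the two Gaussian base distributions, and the mixing weights are all hypothetical choices made for illustration.

```python
import random

def sample_contaminated(weights, base_samplers, n, rng):
    """Draw n points from the mixture sum_j weights[j] * P_j (illustrative helper)."""
    if abs(sum(weights) - 1.0) > 1e-9 or min(weights) < 0:
        raise ValueError("weights must form a convex combination")
    draws = []
    for _ in range(n):
        # Pick a base distribution index according to the mixing weights,
        # then draw one point from that base distribution.
        j = rng.choices(range(len(base_samplers)), weights=weights, k=1)[0]
        draws.append(base_samplers[j](rng))
    return draws

# Two hypothetical base distributions on the real line (assumed for illustration).
base_samplers = [
    lambda rng: rng.gauss(-2.0, 1.0),
    lambda rng: rng.gauss(3.0, 1.0),
]

# Each row defines one contaminated distribution as a convex combination
# of the base distributions. These weights are made up for the example.
mixing_matrix = [
    [0.8, 0.2],
    [0.3, 0.7],
]

rng = random.Random(0)
samples = [
    sample_contaminated(row, base_samplers, 2000, rng) for row in mixing_matrix
]

# The observer sees only `samples`; the base distributions and the mixing
# weights are unknown and are what decontamination aims to recover.
means = [sum(s) / len(s) for s in samples]
```

In this toy setup the sample means should fall near the weighted averages of the base means (about -1.0 and 1.5), which shows how each observed sample blends the base distributions.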
