Informative Data Projections: A Framework and Two Examples

27 Nov 2015  ·  Tijl De Bie, Jefrey Lijffijt, Raul Santos-Rodriguez, Bo Kang ·

Methods for Projection Pursuit aim to facilitate the visual exploration of high-dimensional data by identifying interesting low-dimensional projections. A major challenge is the design of a suitable quality metric of projections, commonly referred to as the projection index, to be maximized by the Projection Pursuit algorithm. In this paper, we introduce a new information-theoretic strategy for tackling this problem, based on quantifying the amount of information the projection conveys to a user given their prior beliefs about the data. The resulting projection index is a subjective quantity, explicitly dependent on the intended user. As a useful illustration, we developed this idea for two particular kinds of prior beliefs. The first kind leads to PCA (Principal Component Analysis), shining new light on when PCA is (not) appropriate. The second kind leads to a novel projection index, the maximization of which can be regarded as a robust variant of PCA. We show how this projection index, though non-convex, can be effectively maximized using a modified power method as well as using a semidefinite programming relaxation. The usefulness of this new projection index is demonstrated in comparative empirical experiments against PCA and a popular Projection Pursuit method.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods