Multivariate Analysis of Mixed Data: The R Package PCAmixdata

18 Nov 2014  ·  Marie Chavent, Vanessa Kuentz-Simonet, Amaury Labenne, Jérôme Saracco

Mixed data arise when observations are described by a mixture of numerical and categorical variables. The R package PCAmixdata extends standard multivariate analysis methods to this type of data, allowing description, exploration and visualization of the data. The key methods included in the package are principal component analysis for mixed data (PCAmix), varimax-like orthogonal rotation for PCAmix, and multiple factor analysis for mixed multi-table data. This paper proposes a unified mathematical presentation of the different methods with common notation, together with a summarized presentation of the three algorithms, with details that help the user understand the graphical and numerical outputs of the corresponding R functions and thus interpret the results. The three main methods are illustrated on a real dataset composed of four data tables characterizing living conditions in different municipalities in the Gironde region of southwest France.

