Collectively Embedding Multi-Relational Data for Predicting User Preferences

23 Apr 2015 · Nitish Gupta, Sameer Singh

Matrix factorization has found incredible success and widespread application as a collaborative-filtering-based approach to recommendations. Unfortunately, incorporating additional sources of evidence, especially ones that are incomplete and noisy, is quite difficult in such models, even though it is often crucial for obtaining further gains in accuracy. For example, additional information about businesses from reviews, categories, and attributes should be leveraged for predicting user preferences, even though this information is often inaccurate and partially observed. Instead of creating customized methods that are specific to each type of evidence, in this paper we present a generic approach to factorization of relational data that collectively models all the relations in the database. By learning a set of embeddings that are shared across all the relations, the model is able to incorporate observed information from all the relations, while also predicting all the relations of interest. Our evaluation on multiple Amazon and Yelp datasets demonstrates effective utilization of additional information for held-out preference prediction. Further, we present accurate models even for cold-start businesses and products, for which we do not observe any ratings or reviews. We also illustrate the capability of the model in imputing missing information and jointly visualizing words, categories, and attribute factors.
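To make the idea of embeddings shared across relations concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not state the loss functions, optimizer, or hyperparameters, so the squared loss for ratings, the logistic loss for a binary business-category relation, the SGD updates, the toy data, and every name below are assumptions made purely for illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def total_loss(emb, ratings, categories):
    # Assumed losses: half squared error for real-valued ratings,
    # logistic loss for the binary "business has category" relation.
    loss = sum(0.5 * (dot(emb[u], emb[b]) - r) ** 2 for u, b, r in ratings)
    for b, c, y in categories:
        p = min(max(sigmoid(dot(emb[b], emb[c])), 1e-12), 1.0 - 1e-12)
        loss -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return loss


def update_pair(emb, x, y, err, lr, reg):
    # One SGD step on both embeddings of an observed (x, y) cell.
    ex, ey = emb[x][:], emb[y][:]
    emb[x] = [a - lr * (err * b + reg * a) for a, b in zip(ex, ey)]
    emb[y] = [b - lr * (err * a + reg * b) for a, b in zip(ex, ey)]


def sgd_epoch(emb, ratings, categories, lr=0.05, reg=0.01):
    for u, b, r in ratings:
        update_pair(emb, u, b, dot(emb[u], emb[b]) - r, lr, reg)
    for b, c, y in categories:
        update_pair(emb, b, c, sigmoid(dot(emb[b], emb[c])) - y, lr, reg)


# Hypothetical toy data. Business "b3" has no ratings (cold start),
# but it does appear in the category relation.
ratings = [("u1", "b1", 4.0), ("u1", "b2", 1.0), ("u2", "b1", 5.0)]
categories = [
    ("b1", "cafe", 1), ("b2", "cafe", 0), ("b3", "cafe", 1),
    ("b1", "bar", 0), ("b2", "bar", 1), ("b3", "bar", 0),
]

rng = random.Random(0)
entities = ["u1", "u2", "b1", "b2", "b3", "cafe", "bar"]
# A single embedding table: each business vector is used by both relations.
emb = {e: [rng.uniform(-0.1, 0.1) for _ in range(4)] for e in entities}

before = total_loss(emb, ratings, categories)
for _ in range(200):
    sgd_epoch(emb, ratings, categories)
after = total_loss(emb, ratings, categories)

# A rating estimate for the cold-start business, informed only by its categories.
cold_start_score = dot(emb["u1"], emb["b3"])
```

Because the vector for each business is updated by both the rating relation and the category relation, a business with no observed ratings still receives an embedding shaped by its categories. Under the assumptions of this sketch, that is the mechanism by which side information can support cold-start predictions.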
