Deep Learning over Multi-field Categorical Data: A Case Study on User Response Prediction

11 Jan 2016  ·  Wei-Nan Zhang, Tianming Du, Jun Wang ·

Predicting user responses, such as click-through rate and conversion rate, are critical in many web applications including web search, personalised recommendation, and online advertising. Different from continuous raw features that we usually found in the image and audio domains, the input features in web space are always of multi-field and are mostly discrete and categorical while their dependencies are little known. Major user response prediction models have to either limit themselves to linear models or require manually building up high-order combination features. The former loses the ability of exploring feature interactions, while the latter results in a heavy computation in the large feature space. To tackle the issue, we propose two novel models using deep neural networks (DNNs) to automatically learn effective patterns from categorical feature interactions and make predictions of users' ad clicks. To get our DNNs efficiently work, we propose to leverage three feature transformation methods, i.e., factorisation machines (FMs), restricted Boltzmann machines (RBMs) and denoising auto-encoders (DAEs). This paper presents the structure of our models and their efficient training algorithms. The large-scale experiments with real-world data demonstrate that our methods work better than major state-of-the-art models.

PDF Abstract

Datasets


Task Dataset Model Metric Name Metric Value Global Rank Result Benchmark
Click-Through Rate Prediction Company* FNN AUC 0.8683 # 2
Log Loss 0.02629 # 2
Click-Through Rate Prediction Criteo FNN AUC 0.7963 # 36
Log Loss 0.45738 # 21
Click-Through Rate Prediction iPinYou FNN AUC 0.7619 # 6

Methods


No methods listed for this paper. Add relevant methods here