Feature Engineering

249 papers with code • 1 benchmark • 5 datasets

Feature engineering is the process of taking a dataset and constructing explanatory variables — features — that can be used to train a machine learning model for a prediction problem. Often, data is spread across multiple tables and must be gathered into a single table with rows containing the observations and features in the columns.
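As a minimal sketch of gathering several tables into one feature table, the pandas example below aggregates a child table to one row per observation and joins it onto the parent table. The tables, column names, and values (`customers`, `transactions`, `txn_count`, and so on) are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical parent table: one row per observation (customer).
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "signup_year": [2019, 2020, 2021],
})

# Hypothetical child table: zero or more rows per customer.
transactions = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "amount": [10.0, 30.0, 5.0, 5.0, 20.0],
})

# Aggregate the child table down to one row per customer.
agg = (
    transactions.groupby("customer_id")["amount"]
    .agg(txn_count="count", txn_total="sum", txn_mean="mean")
    .reset_index()
)

# Left-join so that customers without transactions are kept.
features = customers.merge(agg, on="customer_id", how="left")

# A customer with no transactions has a count and total of zero;
# the mean is left as NaN because it is undefined in that case.
features[["txn_count", "txn_total"]] = (
    features[["txn_count", "txn_total"]].fillna(0)
)

print(features)
```

The result has one row per customer and one column per feature, which is the layout most model-training libraries expect.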

The traditional approach to feature engineering is to build features one at a time using domain knowledge, a tedious, time-consuming, and error-prone process known as manual feature engineering. The code for manual feature engineering is problem-dependent and must be re-written for each new dataset.

Greatest papers with code

DDGK: Learning Graph Representations for Deep Divergence Graph Kernels

google-research/google-research 21 Apr 2019

Second, for each pair of graphs, we train a cross-graph attention network which uses the node representations of an anchor graph to reconstruct another graph.

Feature Engineering • Graph Attention • +2

Named Entity Recognition with Bidirectional LSTM-CNNs

flairNLP/flair TACL 2016

Named entity recognition is a challenging task that has traditionally required large amounts of knowledge in the form of feature engineering and lexicons to achieve high performance.

Entity Linking • Feature Engineering • +2

A Surprising Thing: The Application of Machine Learning Ensembles and Signal Theory to Predict Earnings Surprises

firmai/industry-machine-learning PhD Thesis 2017

Nonlinear classification models can predict future earnings surprises with high accuracy by using pricing and earnings input data.

Feature Engineering

Product-based Neural Networks for User Response Prediction over Multi-field Categorical Data

shenweichen/DeepCTR 1 Jul 2018

User response prediction is a crucial component of personalized information retrieval and filtering scenarios, such as recommender systems and web search.

Click-Through Rate Prediction • Feature Engineering • +2

DeepFM: An End-to-End Wide & Deep Learning Framework for CTR Prediction

shenweichen/DeepCTR 12 Apr 2018

In this paper, we study two instances of DeepFM in which the "deep" component is a DNN and a PNN, respectively, denoted DeepFM-D and DeepFM-P. Comprehensive experiments demonstrate the effectiveness of DeepFM-D and DeepFM-P over existing models for CTR prediction on both benchmark and commercial data.

Click-Through Rate Prediction • Feature Engineering • +1

Deep & Cross Network for Ad Click Predictions

shenweichen/DeepCTR 17 Aug 2017

Feature engineering has been the key to the success of many prediction models.

Ranked #5 on Click-Through Rate Prediction on Criteo (Log Loss metric)

Click-Through Rate Prediction • Feature Engineering

Learning Piece-wise Linear Models from Large Scale Data for Ad Click Prediction

shenweichen/DeepCTR 18 Apr 2017

CTR prediction in real-world business is a difficult machine learning problem involving large-scale, nonlinear, sparse data.

Click-Through Rate Prediction • Feature Engineering

DeepFM: A Factorization-Machine based Neural Network for CTR Prediction

shenweichen/DeepCTR 13 Mar 2017

Learning sophisticated feature interactions behind user behaviors is critical in maximizing CTR for recommender systems.

Click-Through Rate Prediction • Feature Engineering • +1

Wide & Deep Learning for Recommender Systems

shenweichen/DeepCTR 24 Jun 2016

Memorization of feature interactions through a wide set of cross-product feature transformations is effective and interpretable, while generalization requires more feature engineering effort.

Click-Through Rate Prediction • Feature Engineering • +1
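
To illustrate the cross-product feature transformations mentioned in the Wide & Deep entry above, here is a small sketch that crosses two categorical columns into one binary feature per observed combination. It is one common way to build such features by hand, not the paper's implementation. The column names and values are hypothetical.

```python
import pandas as pd

# Hypothetical categorical data; names and values are illustrative only.
df = pd.DataFrame({
    "gender": ["female", "male", "female"],
    "language": ["en", "en", "es"],
})

# Cross the two columns into a single combined categorical feature.
df["gender_x_language"] = df["gender"] + "_AND_" + df["language"]

# One-hot encode the crossed feature: each resulting column is 1 only
# when both constituent values occur together in that row.
crossed = pd.get_dummies(df["gender_x_language"], dtype=int)

print(crossed)
```

Each crossed column memorizes one specific co-occurrence. Combinations that never appear in the training data get no column, which is one way to see why generalization needs additional feature engineering or a learned representation.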