Feature Engineering

393 papers with code • 1 benchmark • 5 datasets

Feature engineering is the process of taking a dataset and constructing explanatory variables, called features, that can be used to train a machine learning model for a prediction problem. Data is often spread across multiple tables and must be gathered into a single table in which each row is an observation and each column is a feature.
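As a rough illustration of gathering several tables into one feature table, the sketch below uses pandas to aggregate a child table and join it onto a parent table. The tables, column names, and chosen aggregations are all hypothetical and invented for this example; they do not come from any dataset or paper listed on this page.

```python
import pandas as pd

# Hypothetical parent table: one row per customer (the observations).
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "signup_year": [2019, 2021, 2020],
})

# Hypothetical child table: zero or more orders per customer.
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "amount": [20.0, 35.0, 10.0, 15.0, 5.0],
})

# Collapse the child table to one row per customer using named aggregations.
order_features = (
    orders.groupby("customer_id")["amount"]
    .agg(order_count="count", total_amount="sum", mean_amount="mean")
    .reset_index()
)

# A left join keeps customers that have no orders at all.
feature_table = customers.merge(order_features, on="customer_id", how="left")

# Counts and totals have a natural value of zero when there are no orders;
# the mean is left missing because it is undefined in that case.
feature_table["order_count"] = feature_table["order_count"].fillna(0).astype(int)
feature_table["total_amount"] = feature_table["total_amount"].fillna(0.0)

print(feature_table)
```

The result has one row per customer and one column per feature, which is the shape most model-training code expects. How to treat observations with no child rows (zero, missing, or a separate indicator column) is a modeling decision; the choice here is only one option.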

The traditional approach to feature engineering is to build features one at a time using domain knowledge, a tedious, time-consuming, and error-prone process known as manual feature engineering. The code for manual feature engineering is problem-dependent and must be rewritten for each new dataset.
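To make the problem-dependence concrete, here is a minimal sketch of what hand-written feature code tends to look like. The loan-application record, its field names, and the three features are assumptions made up for illustration, not a recommended feature set.

```python
from datetime import date


def build_loan_features(record: dict) -> dict:
    """Hand-written features for one hypothetical loan application record."""
    income = record["annual_income"]
    applied = record["application_date"]
    return {
        # Ratio picked from domain knowledge; guarded against zero income.
        "loan_to_income": record["loan_amount"] / income if income else None,
        # Age of the customer's account at application time, in days.
        "account_age_days": (applied - record["account_opened"]).days,
        # Monday is 0 in date.weekday(), so 5 and 6 are Saturday and Sunday.
        "applied_on_weekend": applied.weekday() >= 5,
    }


# Hypothetical input record.
example = {
    "annual_income": 50_000.0,
    "loan_amount": 10_000.0,
    "account_opened": date(2020, 1, 1),
    "application_date": date(2024, 5, 4),
}

features = build_loan_features(example)
print(features)
```

Every line of the function refers to a field that exists only in this hypothetical schema, so none of it carries over to a dataset with different columns. That is the rewriting cost the paragraph above describes.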

Latest papers with no code

Generic Multi-modal Representation Learning for Network Traffic Analysis

no code yet • 4 May 2024

The result is a flexible Multi-modal Autoencoder (MAE) pipeline that can solve different use cases.

Explainable Automatic Grading with Neural Additive Models

no code yet • 1 May 2024

The use of automatic short answer grading (ASAG) models may help alleviate the time burden of grading while encouraging educators to frequently incorporate open-ended items in their curriculum.

Diagnosis of Parkinson's Disease Using EEG Signals and Machine Learning Techniques: A Comprehensive Study

no code yet • 30 Apr 2024

Our approach incorporates a comprehensive review of EEG signal analysis techniques and machine learning methods.

Enhancing IoT Security: A Novel Feature Engineering Approach for ML-Based Intrusion Detection Systems

no code yet • 29 Apr 2024

The integration of Internet of Things (IoT) applications in our daily lives has led to a surge in data traffic, posing significant security challenges.

LEMDA: A Novel Feature Engineering Method for Intrusion Detection in IoT Systems

no code yet • 20 Apr 2024

Feature engineering can solve these issues; hence, it has become critical for IDS in large-scale IoT systems to reduce the size and dimensionality of data, resulting in less complex models with excellent performance, smaller data storage, and fast detection.

Large Language Models for Networking: Workflow, Advances and Challenges

no code yet • 19 Apr 2024

The networking field is characterized by its high complexity and rapid iteration, requiring extensive expertise to accomplish network tasks ranging from network design and configuration to diagnosis and security.

TIMIT Speaker Profiling: A Comparison of Multi-task learning and Single-task learning Approaches

no code yet • 18 Apr 2024

This study employs deep learning techniques to explore four speaker profiling tasks on the TIMIT dataset, namely gender classification, accent classification, age estimation, and speaker identification, highlighting the potential and challenges of multi-task learning versus single-task models.

PreGSU-A Generalized Traffic Scene Understanding Model for Autonomous Driving based on Pre-trained Graph Attention Network

no code yet • 16 Apr 2024

In this study, we propose PreGSU, a generalized pre-trained scene understanding model based on graph attention network to learn the universal interaction and reasoning of traffic scenes to support various downstream tasks.

Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification

no code yet • 16 Apr 2024

Light curves serve as a valuable source of information on stellar formation and evolution.

Survey on Embedding Models for Knowledge Graph and its Applications

no code yet • 14 Apr 2024

A knowledge graph (KG) is a graph-based data structure for representing facts about the world, in which nodes represent real-world entities or abstract concepts and edges represent relations between the entities.