Tabular Data Generation
29 papers with code • 6 benchmarks • 6 datasets
Generation of tabular data using generative models.
Most implemented papers
Modeling Tabular data using Conditional GAN
Tabular data usually contains a mix of discrete and continuous columns.
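To make the mix of discrete and continuous columns concrete, the sketch below splits a toy table by column type and encodes each kind differently before it would be passed to a generative model. It is only an illustration under stated assumptions: the table, the column names, and the helper functions `split_columns` and `encode` are hypothetical, and plain min-max scaling with one-hot encoding is swapped in for simplicity; it is not the preprocessing used by the paper listed above.

```python
# Hypothetical toy table; column names and values are illustrative only.
rows = [
    {"age": 34.0, "income": 52000.0, "job": "engineer"},
    {"age": 51.0, "income": 61000.0, "job": "teacher"},
    {"age": 27.0, "income": 43000.0, "job": "engineer"},
]


def split_columns(rows):
    """Label a column continuous if every value is numeric, else discrete."""
    continuous, discrete = [], []
    for name in rows[0]:
        values = [r[name] for r in rows]
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
        )
        (continuous if is_numeric else discrete).append(name)
    return continuous, discrete


def encode(rows):
    """Min-max scale continuous columns and one-hot encode discrete ones."""
    continuous, discrete = split_columns(rows)
    ranges = {c: (min(r[c] for r in rows), max(r[c] for r in rows)) for c in continuous}
    categories = {d: sorted({r[d] for r in rows}) for d in discrete}
    encoded = []
    for r in rows:
        vec = []
        for c in continuous:
            lo, hi = ranges[c]
            # Guard against constant columns to avoid division by zero.
            vec.append((r[c] - lo) / (hi - lo) if hi > lo else 0.0)
        for d in discrete:
            vec.extend(1.0 if r[d] == cat else 0.0 for cat in categories[d])
        encoded.append(vec)
    return encoded


continuous_cols, discrete_cols = split_columns(rows)
encoded_rows = encode(rows)
```

Each encoded row holds the scaled continuous values first, followed by one indicator per category of each discrete column. The two column kinds need different treatment because a continuous value has a meaningful magnitude, while a category label does not.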
Generating and Imputing Tabular Data via Diffusion and Flow-based Gradient-Boosted Trees
Through empirical evaluation across the benchmark, we demonstrate that our approach outperforms deep-learning generation methods in data generation tasks and remains competitive in data imputation.
Turning the Tables: Biased, Imbalanced, Dynamic Tabular Datasets for ML Evaluation
The suite was generated by applying state-of-the-art tabular data generation techniques on an anonymized, real-world bank account opening fraud detection dataset.
Reimagining Synthetic Tabular Data Generation through Data-Centric AI: A Comprehensive Benchmark
In an empirical study, we evaluate the performance of five state-of-the-art models for tabular data generation on eleven distinct tabular datasets.
Tabular GANs for uneven distribution
GANs are well known for their success in realistic image generation.
TabFairGAN: Fair Tabular Data Generation with Generative Adversarial Networks
In the unconstrained case, i.e., when the model is trained only in the first phase and is meant only to generate accurate data following the same joint probability distribution as the real data, the results show that the model beats state-of-the-art GANs proposed in the literature for producing synthetic tabular data.
DATGAN: Integrating expert knowledge into deep learning for synthetic tabular data
We show that the best versions of the DATGAN outperform state-of-the-art generative models on multiple case studies.
ConvGeN: Convex space learning improves deep-generative oversampling for tabular imbalanced classification on smaller datasets
Moreover, we discuss how our model can be used for synthetic tabular data generation in general, even outside the scope of data imbalance, and thus improves the overall applicability of convex space learning.
TabSynDex: A Universal Metric for Robust Evaluation of Synthetic Tabular Data
We present several baseline models for comparative analysis of the proposed evaluation metric with existing generative models.
Language Models are Realistic Tabular Data Generators
Tabular data is among the oldest and most ubiquitous forms of data.