Synthetic Data Generation

180 papers with code • 1 benchmark • 5 datasets

The generation of tabular data by any means possible.


Most implemented papers

RLIP: Relational Language-Image Pre-training for Human-Object Interaction Detection

jacobyuan7/rlip 5 Sep 2022

The task of Human-Object Interaction (HOI) detection targets fine-grained visual parsing of humans interacting with their environment, enabling a broad range of applications.

Scrape, Cut, Paste and Learn: Automated Dataset Generation Applied to Parcel Logistics

a-nau/synthetic-dataset-generation 18 Oct 2022

This approach of image scraping and selection relaxes the need for a real-world domain-specific dataset that must be either publicly available or created for this purpose.

Characterization and Greedy Learning of Gaussian Structural Causal Models under Unknown Interventions

juangamella/sempler 27 Nov 2022

We leverage this procedure and evaluate the performance of GnIES on synthetic, real, and semi-synthetic data sets.

Scenic: A Language for Scenario Specification and Scene Generation

BerkeleyLearnVerify/Scenic 25 Sep 2018

We propose a new probabilistic programming language for the design and analysis of perception systems, especially those based on machine learning.

UnrealROX: An eXtremely Photorealistic Virtual Reality Environment for Robotics Simulations and Synthetic Data Generation

3dperceptionlab/unrealrox 16 Oct 2018

Gathering and annotating large amounts of data in the real world is a time-consuming and error-prone task.

Privacy-preserving data sharing via probabilistic modelling

DPBayes/data-sharing-examples 10 Dec 2019

Differential privacy allows quantifying privacy loss resulting from accessing sensitive personal data.

SoftAdapt: Techniques for Adaptive Loss Weighting of Neural Networks with Multi-Part Loss Functions

dr-aheydari/SoftAdapt 27 Dec 2019

Adaptive loss function formulation is an active area of research and has gained a great deal of popularity in recent years, following the success of deep learning.

Exploring Transformer Text Generation for Medical Dataset Augmentation

amin-nejad/mimic-website LREC 2020

Natural Language Processing (NLP) can help unlock the vast troves of unstructured data in clinical text and thus improve healthcare research.

MTSS-GAN: Multivariate Time Series Simulation Generative Adversarial Networks

firmai/mtss-gan The Alan Turing Institute 2020

MTSS-GAN is a new generative adversarial network (GAN) developed to simulate diverse multivariate time series (MTS) data with finance applications in mind.

Scenic: A Language for Scenario Specification and Data Generation

BerkeleyLearnVerify/Scenic 13 Oct 2020

We design a domain-specific language, Scenic, for describing scenarios that are distributions over scenes and the behaviors of their agents over time.