Synthetic Data Generation

180 papers with code • 1 benchmark • 5 datasets

The generation of tabular data by any means possible.

Simulating Task-Oriented Dialogues with State Transition Graphs and Large Language Models

algoprog/syntod 23 Apr 2024

In our experiments, using graph-guided response simulations leads to significant improvements in intent classification, slot filling and response relevance compared to naive single-prompt simulated conversations.

3

Better Synthetic Data by Retrieving and Transforming Existing Datasets

neulab/prompt2model 22 Apr 2024

Recent work has studied prompt-driven synthetic data generation using large language models, but these generated datasets tend to lack complexity and diversity.

1,861

Aligning Actions and Walking to LLM-Generated Textual Descriptions

radu1999/walkandtext 18 Apr 2024

For action recognition, we employ LLMs to generate textual descriptions of actions in the BABEL-60 dataset, facilitating the alignment of motion sequences with linguistic representations.

0

An evaluation framework for synthetic data generation models

novelcore/synthetic_data_evaluation_framework 13 Apr 2024

Two use-case scenarios demonstrate the applicability of the proposed framework for evaluating the ability of synthetic data generation models to generate high-quality data.

1

Towards Algorithmic Fidelity: Mental Health Representation across Demographics in Synthetic vs. Human-generated Data

michigannlp/depression_synthetic_data 25 Mar 2024

Using GPT-3, we develop HEADROOM, a synthetic dataset of 3,120 posts about depression-triggering stressors, by controlling for race, gender, and time frame (before and after COVID-19).

2

SYNCS: Synthetic Data and Contrastive Self-Supervised Training for Central Sulcus Segmentation

vivikar/central-sulcus-analysis 22 Mar 2024

Identifying risk markers early is crucial for understanding disease progression and enabling preventive measures.

0

Joint Selection: Adaptively Incorporating Public Information for Private Synthetic Data

miguel-fuentes/jam_aistats 12 Mar 2024

This technique allows for public data to be included in a graphical-model-based mechanism.

0

Synthetic data generation for system identification: leveraging knowledge transfer from similar systems

dariopi/synthetic_data_generation 8 Mar 2024

This paper addresses overfitting in the learning of dynamical systems by introducing a novel approach for generating synthetic data, aimed at improving model generalization and robustness when data are scarce.

3

IR2: Information Regularization for Information Retrieval

info-regularization/information-regularization 25 Feb 2024

This approach, representing a novel application of regularization techniques in synthetic data creation for IR, is tested on three recent IR tasks characterized by complex queries: DORIS-MAE, ArguAna, and WhatsThatBook.

3

Synthetic location trajectory generation using categorical diffusion models

irmlma/mobility-simulation-cdpm 19 Feb 2024

Diffusion probabilistic models (DPMs) have rapidly become one of the predominant classes of generative models for simulating synthetic data, for instance in computer vision, audio, natural language processing, and biomolecule generation.

2