Dataset Generation

116 papers with code • 0 benchmarks • 0 datasets

The task involves enhancing the training of a target application (e.g., autonomous driving systems) by generating datasets of diverse and critical elements (e.g., traffic scenarios). Traditional methods rely on expensive and limited datasets, which often fail to capture rare but essential situations that can pose risks during testing.

Most implemented papers

Rethinking Table Recognition using Graph Neural Networks

shahrukhqasim/TIES-2.0 31 May 2019

In this paper, we propose an architecture based on graph networks as a better alternative to standard neural networks for table recognition.

Segmenting Unknown 3D Objects from Real Depth Images using Mask R-CNN Trained on Synthetic Data

BerkeleyAutomation/sd-maskrcnn 16 Sep 2018

We train a variant of Mask R-CNN with domain randomization on the generated dataset to perform category-agnostic instance segmentation without any hand-labeled data and we evaluate the trained network, which we refer to as Synthetic Depth (SD) Mask R-CNN, on a set of real, high-resolution depth images of challenging, densely-cluttered bins containing objects with highly-varied geometry.

ZeroGen: Efficient Zero-shot Learning via Dataset Generation

HKUNLP/zerogen 16 Feb 2022

There is a growing interest in dataset generation recently due to the superior generative capacity of large pre-trained language models (PLMs).

Scrape, Cut, Paste and Learn: Automated Dataset Generation Applied to Parcel Logistics

a-nau/synthetic-dataset-generation 18 Oct 2022

This approach of image scraping and selection relaxes the need for a real-world domain-specific dataset that must be either publicly available or created for this purpose.

Learning-based NLOS Detection and Uncertainty Prediction of GNSS Observations with Transformer-Enhanced LSTM Network

rwth-irt/deepnlosdetection 1 Sep 2023

This work proposes a deep-learning-based method to detect NLOS receptions and predict GNSS pseudorange errors by analyzing GNSS observations as a spatio-temporal modeling problem.

Masked Face Dataset Generation and Masked Face Recognition

luisrui/seeing-ai-system 13 Nov 2023

In the post-pandemic era, wearing face masks has posed a great challenge to ordinary face recognition.

Cephalo: Multi-Modal Vision-Language Models for Bio-Inspired Materials Analysis and Design

lamm-mit/Cephalo-Phi-3-MoE 29 May 2024

We present Cephalo, a series of multimodal vision large language models (V-LLMs) designed for materials science applications, integrating visual and linguistic data for enhanced understanding.

Affordance Learning for End-to-End Visuomotor Robot Control

gamleksi/affordancegym 10 Mar 2019

Training end-to-end deep robot policies requires a lot of domain-, task-, and hardware-specific data, which is often costly to provide.

LeagueAI: Improving object detector performance and flexibility through automatically generated training data and domain randomization

Oleffa/LeagueAI 28 May 2019

In an experiment, I compared a model trained on synthetic data to a model trained on hand-labeled data and to a model trained on a combined dataset.

Smart Home Appliances: Chat with Your Fridge

gudovskiy/beta-fridge 19 Dec 2019

Current home appliances are capable of executing a limited number of voice commands such as turning devices on or off, adjusting music volume or light conditions.