CAR contains visual attributes for objects in the Cityscapes dataset. For each object in an image, we have a list of attributes that depend on the category of the object.
2 PAPERS • NO BENCHMARKS YET
The Talk2Car dataset sits at the intersection of several research domains, promoting the development of cross-disciplinary solutions for improving the state of the art in grounding natural language. The Talk2Car dataset was built on top of the nuScenes dataset to include an extensive suite of sensor modalities, i.e. semantic maps, GPS, LIDAR, RADAR and 360-degree RGB images annotated with 3D bounding boxes. This variety of input modalities sets the object referral task on the Talk2Car dataset apart from related challenges, where additional sensor modalities are generally missing.
36 PAPERS • 1 BENCHMARK
EyeCar is a dataset of driving videos of vehicles involved in rear-end collisions paired with eye fixation data captured from human subjects.
We provide manual annotations of 14 semantic keypoints for 100,000 car instances (sedan, SUV, bus, and truck) from 53,000 images captured by 18 moving cameras at multiple intersections in Pittsburgh.
8 PAPERS • 2 BENCHMARKS
FETA benchmark focuses on text-to-image and image-to-text retrieval in public car manuals and sales catalogue brochures. The FETA Car-Manuals dataset consists of a total of 349 PDF documents from 5 car manufacturers, namely Nissan, Toyota, Mazda, Renault, Chevrolet.
1 PAPER • 2 BENCHMARKS
This dataset is a collection of 4,000 images of cars in multiple scenes that are ready to use for optimizing the accuracy of computer vision models.
0 PAPERS • NO BENCHMARKS YET
The MuSe-CAR database is a large, multimodal (video, audio, and text) dataset gathered in-the-wild with the intention of furthering the understanding of multimodal sentiment analysis in-the-wild.
8 PAPERS • NO BENCHMARKS YET
ApolloCar3D is a dataset that contains 5,277 driving images and over 60K car instances, where each car is fitted with an industry-grade 3D CAD model with absolute model size and semantically labelled keypoints.
17 PAPERS • 14 BENCHMARKS
The Oxford RobotCar Dataset contains over 100 repetitions of a consistent route through Oxford, UK, captured over a period of over a year.
38 PAPERS • 3 BENCHMARKS
The Oxford Radar RobotCar Dataset is a radar extension to the Oxford RobotCar Dataset. It adds a millimetre-wave FMCW radar and dual Velodyne HDL-32E LIDARs, with optimised ground-truth radar odometry, for 280 km of driving around Oxford, UK (in addition to all sensors in the original Oxford RobotCar Dataset).
14 PAPERS • 2 BENCHMARKS
Car crash dataset RUSSIA 2022-2023 is a large driving-video dataset that contains over 500 high-resolution videos of various driving scenarios. The videos are annotated with bounding boxes around objects such as different types of cars, pedestrians, and cyclists, as well as traffic signs and traffic lights. Additionally, the dataset includes metadata for each video. Car crash dataset RUSSIA 2022-2023 is one of the few datasets from Russia on this topic.
This dataset consists of odometer and speedometer images of bikes and cars. It can be used to detect or recognize odometer readings of vehicles, and also to classify the make of cars and bikes. Use cases lie in the domains of insurance, repair, and OCR.
1 PAPER • NO BENCHMARKS YET
Car Crash Dataset (CCD) is collected for traffic accident analysis.
10 PAPERS • 1 BENCHMARK
Form Understanding in Noisy Scanned Documents (FUNSD) comprises 199 real, fully annotated, scanned forms. The documents are noisy and vary widely in appearance, making form understanding (FoUn) a challenging task.
146 PAPERS • 3 BENCHMARKS
The Car Parking Lot Dataset (CARPK) contains nearly 90,000 cars from 4 different parking lots, collected by means of a drone (Phantom 3 Professional). The image set is annotated with a bounding box per car; each bounding box is recorded by its top-left and bottom-right points.
62 PAPERS • 1 BENCHMARK
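Since each CARPK box is recorded by its top-left and bottom-right points, converting to the (x, y, width, height) form that many detectors expect is straightforward. A minimal sketch in Python (the function name is illustrative, not part of any CARPK toolkit):

```python
def corners_to_xywh(box):
    """Convert an (x1, y1, x2, y2) corner-point box, as recorded in
    CARPK, to (x, y, w, h) with (x, y) being the top-left corner."""
    x1, y1, x2, y2 = box
    return (x1, y1, x2 - x1, y2 - y1)

# A car annotated from (10, 20) to (110, 70) is 100 px wide and 50 px tall.
print(corners_to_xywh((10, 20, 110, 70)))  # → (10, 20, 100, 50)
```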
CARLA (CAR Learning to Act) is an open simulator for urban driving, developed as an open-source layer over Unreal Engine 4.
1,088 PAPERS • 3 BENCHMARKS
TORCS (The Open Racing Car Simulator) is a driving simulator. TORCS offers a large variety of tracks and cars as free assets. It also provides a number of programmed robot cars with different levels of performance that can be used to benchmark the performance of human players and software driving agents.
91 PAPERS • NO BENCHMARKS YET
DACT contains two subsets of annotated car trajectories data. The dataset contains 50 trajectories which cover about 13 hours of driving data.
3 PAPERS • NO BENCHMARKS YET
NBMOD is a dataset created for researching robotic grasp detection of specific objects in noisy environments. The dataset comprises three subsets: the Simple background Single-object Subset (SSS), the Noisy background Single-object Subset (NSS), and the Multi-Object grasp detection Subset (MOS).
1 PAPER • 1 BENCHMARK
VoiceBank+DEMAND is a noisy speech database for training speech enhancement algorithms and TTS models. The database was designed to train and test speech enhancement methods that operate at 48kHz.
34 PAPERS • 1 BENCHMARK
A social media user sentiment analysis dataset. Each user comment is labeled as positive (1), negative (2), or neutral (0).
11 PAPERS • NO BENCHMARKS YET
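The integer coding described above (0 = neutral, 1 = positive, 2 = negative) can be decoded with a simple lookup. A minimal sketch, with the mapping taken from the description (helper names are hypothetical, not part of the dataset's tooling):

```python
# Label codes as described for the dataset:
# 0 = neutral, 1 = positive, 2 = negative.
LABELS = {0: "neutral", 1: "positive", 2: "negative"}

def decode(codes):
    """Map a sequence of integer labels to their sentiment names."""
    return [LABELS[c] for c in codes]

print(decode([1, 2, 0]))  # → ['positive', 'negative', 'neutral']
```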
TICaM is a Time-of-flight In-car Cabin Monitoring dataset for vehicle interior monitoring using a single wide-angle depth camera. It addresses the deficiencies of other available in-car cabin datasets in terms of the range of labeled classes, recorded scenarios, and provided annotations, all at the same time. In addition to real recordings, it also contains a synthetic set of in-car cabin images with the same multi-modality of images and annotations, providing a unique and extremely beneficial combination of real and synthetic data.
5 PAPERS • NO BENCHMARKS YET
This dataset, based on Flickr30K, is introduced in the accompanying paper. Results are averaged over 5 folds of 1K test images, following that paper's protocol.
…This dataset contains two categories of extracted and enhanced car-following data, HV-following-AV (H-A) and HV-following-HV (H-H), drawn from the open Lyft Level-5 dataset.
A dataset for detecting multi-labeled emotions across 6 emotion categories, namely Love, Joy, Surprise, Anger, Sadness, and Fear.
…It consists of expert policies that are trained to track individual clip snippets and HDF5 files of noisy rollouts collected from each expert, including proprioceptive observations and actions.
DrivAerNet is a large-scale, high-fidelity CFD dataset of 3D industry-standard car shapes designed for data-driven aerodynamic design. It comprises 4,000 high-quality 3D car meshes and their corresponding aerodynamic performance coefficients, alongside full 3D flow field information.
Curated CFD simulations: for ease of access and use, a streamlined version of the CFD simulation data is provided, refined to include key insights and data, reducing the size to $\sim$1 TB.
3D car meshes: a total of 4,000 designs, showcasing a variety of conventional car shapes and emphasizing the impact of minor geometric modifications on aerodynamic efficiency. The 3D meshes and aerodynamic coefficients total $\sim$84 GB.
2D slices: the car's wake in the $x$-direction and the symmetry plane in the $y$-direction, $\sim$12 GB.
The FSDnoisy18k dataset is an open dataset containing 42.5 hours of audio across 20 sound event classes, including a small amount of manually-labeled data and a larger quantity of real-world noisy data. The noisy set of FSDnoisy18k consists of 15,813 audio clips (38.8 h), and the test set consists of 947 audio clips (1.4 h) with correct labels. In-vocabulary (IV) label noise applies when, given an observed label that is incorrect or incomplete, the true or missing label is part of the target class set.
18 PAPERS • NO BENCHMARKS YET
SIDD is an image denoising dataset containing 30,000 noisy images from 10 scenes under different lighting conditions using five representative smartphone cameras. Ground truth images are provided along with the noisy images.
209 PAPERS • 2 BENCHMARKS
…The dataset was generated by Amazon Mechanical Turk workers in the following process (an example is provided in parentheses): a crowd worker observes a source concept from ConceptNet (“River”) and three target concepts, and authors one question per target concept; for each question, another worker chooses one additional distractor from ConceptNet (“pebble”, “stream”, “bank”), and the author adds another distractor (“mountain”, “bottom”, “island”) manually.
353 PAPERS • 1 BENCHMARK
The Comprehensive Cars (CompCars) dataset contains data from two scenarios, including images from web-nature and surveillance-nature. The web-nature data contains 163 car makes with 1,716 car models. There are a total of 136,726 images capturing the entire cars and 27,618 images capturing the car parts. The full car images are labeled with bounding boxes and viewpoints. Each car model is labeled with five attributes, including maximum speed, displacement, number of doors, number of seats, and type of car. The surveillance-nature data contains 50,000 car images captured in the front view. The dataset can be used for the tasks of fine-grained classification, attribute prediction, and car model verification. It can also be used for other tasks such as image ranking and multi-task learning.
66 PAPERS • 1 BENCHMARK
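The five per-model attributes listed for CompCars map naturally onto a small record type. A hypothetical sketch (field names and units are assumptions for illustration, not the dataset's official schema):

```python
from dataclasses import dataclass

@dataclass
class CarModelAttributes:
    """One record per CompCars car model; fields mirror the five
    attributes named in the description (names/units assumed)."""
    max_speed_kmh: float   # maximum speed
    displacement_l: float  # engine displacement
    num_doors: int
    num_seats: int
    car_type: str          # e.g. "sedan", "SUV"

example = CarModelAttributes(250.0, 2.0, 4, 5, "sedan")
print(example.num_doors)  # → 4
```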
The Machine Translation of Noisy Text (MTNT) dataset is a machine translation dataset consisting of noisy comments from Reddit and professionally sourced translations.
51 PAPERS • NO BENCHMARKS YET
…The content images are mostly photorealistic scenes of mountains, lakes, rivers, bridges, and buildings in regions south of the Yangtze River.
Part of the Controlled Noisy Web Labels Dataset.
5 PAPERS • 2 BENCHMARKS
…Each image is paired with a social event category among the following: orig_class_names = ["concert", "graduation", "meeting", "mountain-trip", "picnic", "sea-holiday", "ski-holiday", …]
…Therefore, by aggregating gazes of independent observers, we could record multiple important visual cues in one frame. By averaging the eye movements of independent observers, we were able to effectively wash out those sources of noise (see Fig. 2B). Comparison with In-Car Attention Data: We collected in-lab driver attention data using videos from the DR(eye)VE dataset. This allowed us to compare in-lab and in-car attention maps of each video. It may be that the human observers in our in-lab eye-tracking experiment also looked at objects that were not relevant for driving. We ran a human evaluation experiment to address this concern. This result proves that our dataset can also serve as a substitute for in-car driver attention data, especially in crucial situations where in-car data collection is not practical.
21 PAPERS • NO BENCHMARKS YET
CARS196 is composed of 16,185 car images of 196 classes.
41 PAPERS • 4 BENCHMARKS
RGB-D dataset of synthetic indoor scenes with color, noisy depth map, etc.
23 PAPERS • NO BENCHMARKS YET
The Cars Overhead With Context (COWC) dataset is a large set of annotated cars seen from overhead. It is useful for training a model such as a deep neural network to detect and/or count cars.
UrbanCars facilitates multi-shortcut learning under a controlled setting with two shortcuts: background and co-occurring object. The task is classifying the car body type into two categories: urban car and country car. The dataset contains three splits: training, validation, and testing. In the training set, the two shortcuts spuriously correlate with the car body type. Both the validation and testing sets are balanced, i.e., free of spurious correlations.
18 PAPERS • 1 BENCHMARK
…PointDenoisingBenchmark for outlier removal: contains point clouds with different levels and densities of outliers and the corresponding clean ground truths. PointDenoisingBenchmark for denoising: contains noisy point clouds with different levels of Gaussian noise and the corresponding clean ground truths.
Cantonese In-car Audio-Visual Speech Recognition (CI-AVSR) is a dataset for in-car command recognition in the Cantonese language with both video and audio data. It consists of 4,984 samples (8.3 hours) of 200 in-car commands recorded by 30 native Cantonese speakers. Furthermore, the dataset is augmented using common in-car background noises to simulate real environments, producing a dataset 10 times larger than the collected one.
The CRVD dataset consists of 55 groups of noisy-clean videos with ISO values ranging from 1600 to 25600.
19 PAPERS • 1 BENCHMARK
…These include observations from both static and dynamic sensors, a varying number of moving bodies, and a variety of different 3D motions. The dataset culminates in a complex toy car segment representative of many challenging real-world scenarios.
A dataset of ranked scan-CAD similarity annotations, enabling new, fine-grained evaluation of CAD model retrieval to cluttered, noisy, partial scans.