The ShanghaiTech dataset is a large-scale crowd counting dataset. It consists of 1198 annotated crowd images. The dataset is divided into two parts, Part-A containing 482 images and Part-B containing 716 images. Part-A is split into train and test subsets consisting of 300 and 182 images, respectively. Part-B is split into train and test subsets consisting of 400 and 316 images, respectively. Each person in a crowd image is annotated with one point close to the center of the head. In total, the dataset consists of 330,165 annotated people. Images from Part-A were collected from the Internet, while images from Part-B were collected on the busy streets of Shanghai.
274 PAPERS • 7 BENCHMARKS
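The split sizes quoted for the ShanghaiTech crowd counting dataset can be cross-checked with a few lines of Python. This is only a sanity-check sketch built from the numbers in the description; the variable names are illustrative, and the mean is taken over both parts combined because per-part head counts are not given here.

```python
# Split sizes as stated in the dataset description.
splits = {
    "Part-A": {"train": 300, "test": 182},
    "Part-B": {"train": 400, "test": 316},
}
annotated_people = 330_165

# Images per part and overall.
part_totals = {part: sum(sizes.values()) for part, sizes in splits.items()}
total_images = sum(part_totals.values())

# Average annotated heads per image across the whole dataset.
mean_people_per_image = annotated_people / total_images

print(part_totals)                      # {'Part-A': 482, 'Part-B': 716}
print(total_images)                     # 1198
print(round(mean_people_per_image, 1))  # 275.6
```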
The ShanghaiTech Campus dataset has 13 scenes with complex lighting conditions and camera angles. It contains 130 abnormal events and over 270,000 training frames. Moreover, both the frame-level and pixel-level ground truth of abnormal events are annotated in this dataset.
203 PAPERS • 8 BENCHMARKS
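Frame-level and pixel-level ground truth are related in a simple way: a frame can be treated as abnormal when its pixel mask marks at least one abnormal pixel. The sketch below illustrates that relationship on toy data. It assumes masks are binary 2D grids held as nested lists; the function name and the toy masks are hypothetical, and the dataset's actual storage format is not described here.

```python
def frame_labels_from_masks(masks):
    """Return 1 for each frame whose mask has any non-zero pixel, else 0."""
    return [
        int(any(pixel != 0 for row in mask for pixel in row))
        for mask in masks
    ]


# Hypothetical 2x3 masks for three frames (toy data, not from the dataset).
toy_masks = [
    [[0, 0, 0], [0, 0, 0]],  # no abnormal pixels
    [[0, 1, 0], [0, 0, 0]],  # one abnormal pixel
    [[0, 0, 0], [0, 0, 1]],  # one abnormal pixel
]

toy_labels = frame_labels_from_masks(toy_masks)
print(toy_labels)  # [0, 1, 1]
```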
The UCF-Crime dataset is a large-scale dataset of 128 hours of videos. It consists of 1900 long and untrimmed real-world surveillance videos, with 13 realistic anomalies including Abuse, Arrest, Arson, Assault, Road Accident, Burglary, Explosion, Fighting, Robbery, Shooting, Stealing, Shoplifting, and Vandalism. These anomalies are selected because they have a significant impact on public safety.
138 PAPERS • 3 BENCHMARKS
The UCSD Anomaly Detection Dataset was acquired with a stationary camera mounted at an elevation, overlooking pedestrian walkways. The crowd density in the walkways was variable, ranging from sparse to very crowded. In the normal setting, the video contains only pedestrians. Abnormal events are due to either the circulation of non-pedestrian entities in the walkways or anomalous pedestrian motion patterns. Commonly occurring anomalies include bikers, skaters, small carts, and people walking across a walkway or in the grass that surrounds it. A few instances of people in wheelchairs were also recorded. All abnormalities are naturally occurring, i.e. they were not staged for the purposes of assembling the dataset. The data was split into 2 subsets, each corresponding to a different scene. The video footage recorded from each scene was split into various clips of around 200 frames.
92 PAPERS • 4 BENCHMARKS
XD-Violence is a large-scale audio-visual dataset for violence detection in videos.
56 PAPERS • 1 BENCHMARK
The Avenue dataset contains 16 training and 21 testing video clips. The videos were captured on the CUHK campus avenue, with 30,652 frames in total (15,328 training, 15,324 testing).
48 PAPERS • 3 BENCHMARKS
UBnormal is a new supervised open-set benchmark composed of multiple virtual scenes for video anomaly detection. Unlike existing data sets, the data set introduces abnormal events annotated at the pixel level at training time, for the first time enabling the use of fully-supervised learning methods for abnormal event detection. To preserve the typical open-set formulation, the data set includes disjoint sets of anomaly types in the training and test collections of videos.
44 PAPERS • 4 BENCHMARKS
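The open-set property described for UBnormal (disjoint anomaly types in the training and test collections) can be expressed as a set-disjointness check. The sketch below is illustrative only: the function name and the anomaly type labels are placeholders, not the dataset's real categories.

```python
def is_open_set_split(train_types, test_types):
    """True when no anomaly type appears in both the train and test sets."""
    return set(train_types).isdisjoint(test_types)


# Placeholder labels; the real UBnormal anomaly types are not listed here.
train_anomalies = {"type_a", "type_b"}
test_anomalies = {"type_c", "type_d"}

print(is_open_set_split(train_anomalies, test_anomalies))              # True
print(is_open_set_split(train_anomalies, test_anomalies | {"type_a"}))  # False
```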
Street Scene is a dataset for video anomaly detection. Street Scene consists of 46 training and 35 testing high-resolution 1280×720 video sequences taken from a USB camera overlooking a scene of a two-lane street with bike lanes and pedestrian sidewalks during daytime. The dataset is challenging because of the variety of activity taking place, such as cars driving, turning, stopping and parking; pedestrians walking, jogging and pushing strollers; and bikers riding in bike lanes. In addition, the videos contain changing shadows, moving background such as a flag and trees blowing in the wind, and occlusions caused by trees and large vehicles. There are a total of 56,847 frames for training and 146,410 frames for testing, extracted from the original videos at 15 frames per second. The dataset contains a total of 205 naturally occurring anomalous events ranging from illegal activities such as jaywalking and illegal U-turns to simply those that do not occur in the training set such as pets be
26 PAPERS • 3 BENCHMARKS
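The frame counts and extraction rate quoted for Street Scene give a rough sense of how much footage each split covers. The sketch below simply divides frames by the stated 15 frames per second; it assumes every frame was sampled at exactly that rate, and the variable names are illustrative.

```python
FPS = 15  # extraction rate stated in the description

# Frame counts as stated in the description.
frames = {"train": 56_847, "test": 146_410}

total_frames = sum(frames.values())

# Approximate footage length per split, in minutes.
minutes = {split: count / FPS / 60 for split, count in frames.items()}

print(total_frames)  # 203257
for split, value in minutes.items():
    print(f"{split}: {value:.1f} min")
```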
The human-related version of the ShanghaiTech Campus dataset was first presented by Morais et al. in the paper "Learning Regularity in Skeleton Trajectories for Anomaly Detection in Videos".
14 PAPERS • 1 BENCHMARK
The GoodsAD dataset contains 6124 images covering 6 categories of common supermarket goods. Each category contains multiple goods. All images are acquired at a high resolution of 3000 × 3000. The object locations in the images are not aligned. Most objects are in the center of the image, and each image contains only a single object. Most anomalies occupy only a small fraction of image pixels. Both image-level and pixel-level annotations are provided.
13 PAPERS • 2 BENCHMARKS
Unlike previous datasets that focus on detecting the diversity of defect categories (like MVTec AD and VisA), AeBAD is centered on the diversity of domains within the same data category.
11 PAPERS • 3 BENCHMARKS
The human-related version of the CUHK Avenue dataset was first presented by Morais et al. in the paper "Learning Regularity in Skeleton Trajectories for Anomaly Detection in Videos".
11 PAPERS • 1 BENCHMARK
CHAD (Charlotte Anomaly Dataset) is a high-resolution, multi-camera dataset for surveillance video anomaly detection. It includes bounding box, Re-ID, and pose annotations, as well as frame-level anomaly labels that divide all frames into two groups, anomalous or normal. Full details are given in the paper "CHAD: Charlotte Anomaly Dataset". Please refer to the dataset page for more information.
7 PAPERS • 1 BENCHMARK
The human-related version of UBnormal ("UBnormal: New Benchmark for Supervised Open-Set Video Anomaly Detection," Acsintoae et al.) was introduced by Flaborea et al. in the paper "Contracting Skeletal Kinematics for Human-Related Video Anomaly Detection".
An abnormal activity dataset for research use that contains 483,566 annotated frames.
5 PAPERS • 2 BENCHMARKS
The Hawk Annotation Dataset includes language descriptions specifically for anomaly scenes in seven existing video anomaly datasets. These seven datasets cover a variety of anomalous scenarios such as crime (UCF-Crime), campus (ShanghaiTech and CUHK Avenue), pedestrian walkways (UCSD Ped1 and Ped2), traffic (DoTA), and human behavior (UBnormal). With the support of these visual scenarios, this dataset can be used for comprehensive fine-tuning across various abnormal scenarios, bringing it closer to open-world settings.
1 PAPER • NO BENCHMARKS YET
This dataset focuses only on the robbery category, presenting a new weakly labelled dataset that contains 486 new real-world robbery surveillance videos acquired from public sources.
0 PAPERS • NO BENCHMARKS YET