The ShanghaiTech dataset is a large-scale crowd counting dataset. It consists of 1198 annotated crowd images. The dataset is divided into two parts, Part-A containing 482 images and Part-B containing 716 images. Part-A is split into train and test subsets consisting of 300 and 182 images, respectively. Part-B is split into train and test subsets consisting of 400 and 316 images, respectively. Each person in a crowd image is annotated with one point close to the center of the head. In total, the dataset consists of 330,165 annotated people. Images from Part-A were collected from the Internet, while images from Part-B were collected on the busy streets of Shanghai.
223 PAPERS • 3 BENCHMARKS
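The split sizes quoted in the ShanghaiTech description can be cross-checked with a few lines of Python. This is only a sanity-check sketch: the numbers are copied from the description, while the variable and function names are illustrative assumptions and not part of any official loader.

```python
# Split sizes as stated in the ShanghaiTech description (names are illustrative).
SPLITS = {
    "Part-A": {"train": 300, "test": 182},
    "Part-B": {"train": 400, "test": 316},
}
TOTAL_IMAGES = 1198      # stated total number of annotated images
TOTAL_PEOPLE = 330_165   # stated total number of annotated heads


def part_sizes(splits):
    """Return the number of images per part (train + test)."""
    return {part: sum(subsets.values()) for part, subsets in splits.items()}


sizes = part_sizes(SPLITS)
total = sum(sizes.values())

# Average number of annotated people per image over the whole dataset.
mean_people_per_image = TOTAL_PEOPLE / TOTAL_IMAGES

print(sizes, total, round(mean_people_per_image, 1))
```

The per-part sums match the stated 482 and 716 images. The average is computed over both parts together and says nothing about how crowd density differs between Part-A and Part-B.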
The UCSD Anomaly Detection Dataset was acquired with a stationary camera mounted at an elevation, overlooking pedestrian walkways. The crowd density in the walkways was variable, ranging from sparse to very crowded. In the normal setting, the video contains only pedestrians. Abnormal events are due to either the circulation of non-pedestrian entities in the walkways or anomalous pedestrian motion patterns. Commonly occurring anomalies include bikers, skaters, small carts, and people walking across a walkway or in the grass that surrounds it. A few instances of people in wheelchairs were also recorded. All abnormalities are naturally occurring, i.e. they were not staged for the purposes of assembling the dataset. The data was split into 2 subsets, each corresponding to a different scene. The video footage recorded from each scene was split into various clips of around 200 frames.
69 PAPERS • 3 BENCHMARKS
UBI-Fights - Focused on a specific type of anomaly detection while still providing a wide diversity of fighting scenarios, the UBI-Fights dataset is a new large-scale dataset of 80 hours of video, fully annotated at the frame level. It consists of 1000 videos, of which 216 contain a fight event and 784 show normal daily life situations. All unnecessary video segments (e.g., video introductions, news, etc.) that could disturb the learning process were removed.
7 PAPERS • 2 BENCHMARKS
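The UBI-Fights figures imply a noticeable class imbalance at the video level. The sketch below works that out from the numbers in the description. The variable names are illustrative assumptions, and the average duration is a simple mean over all videos, not a reported statistic.

```python
# Counts as stated in the UBI-Fights description (names are illustrative).
TOTAL_VIDEOS = 1000
FIGHT_VIDEOS = 216
NORMAL_VIDEOS = 784
TOTAL_HOURS = 80

# The two classes should account for every video.
assert FIGHT_VIDEOS + NORMAL_VIDEOS == TOTAL_VIDEOS

# Share of videos that contain a fight event.
fight_fraction = FIGHT_VIDEOS / TOTAL_VIDEOS

# Mean video length in minutes, assuming the 80 hours cover all 1000 videos.
mean_minutes_per_video = TOTAL_HOURS * 60 / TOTAL_VIDEOS

print(f"{fight_fraction:.1%} fight videos, {mean_minutes_per_video:.1f} min per video on average")
```

Because the annotation is at the frame level, the frame-level imbalance may differ from this video-level ratio: a fight video can still consist mostly of normal frames.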
Description: Crying sounds of 201 infants and young children aged 0 to 3 years, with a number of segments recorded from each child. It provides data support for detecting children's crying in smart home projects.
1 PAPER • NO BENCHMARKS YET
Description: Laughing sounds of 20 infants and young children aged 0 to 3 years, with a number of segments recorded from each child. It provides data support for detecting children's laughter in smart home projects.
0 PAPERS • NO BENCHMARKS YET
Description: 8,643 images covering 14 types of abnormal image and video data. The data includes indoor scenes (library, craft store, etc.) and outdoor scenes (road, building, square, railway station, etc.). Data diversity comes from multiple scenes, 14 types of abnormal video and image data, different lighting conditions, and different image resolutions. The data can be used for tasks such as image deblurring and image denoising.