AVA is a project that provides audiovisual annotations of video to improve our understanding of human activity. Each video clip has been exhaustively annotated by human annotators, and together the clips represent a rich variety of scenes, recording conditions, and expressions of human activity.
38 PAPERS • 2 BENCHMARKS
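Exhaustive clip-level annotations like AVA's are commonly distributed as plain CSV. The sketch below parses rows in the layout used by the public AVA actions CSVs (video_id, frame timestamp, a normalized person box, action id, person id); the column order and the sample row are assumptions for illustration, so verify them against the release you download.

```python
import csv
import io

# Assumed column layout (AVA-style actions CSV, no header row):
# video_id, frame_timestamp, x1, y1, x2, y2, action_id, person_id
# Box coordinates are assumed normalized to [0, 1].

def parse_ava_rows(fileobj):
    """Yield one annotation dict per CSV row."""
    for row in csv.reader(fileobj):
        video_id, ts, x1, y1, x2, y2, action_id, person_id = row
        yield {
            "video_id": video_id,
            "timestamp": float(ts),
            "box": (float(x1), float(y1), float(x2), float(y2)),
            "action_id": int(action_id),
            "person_id": int(person_id),
        }

# Hypothetical sample row for illustration only.
sample = "-5KQ66BBWC4,0902,0.077,0.151,0.283,0.811,80,1\n"
rows = list(parse_ava_rows(io.StringIO(sample)))
```

Streaming the file row by row with a generator keeps memory flat even for the multi-million-row annotation files such datasets ship with.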
A dataset of densely labeled speech activity in YouTube videos, created with the goal of providing a shared, publicly available dataset for this task.
6 PAPERS • NO BENCHMARKS YET
A large-scale dataset for human activity recognition. Existing security datasets either focus on activity counts by aggregating public video disseminated because of its content, which typically excludes same-scene background video, or they achieve persistence by observing public areas and thus cannot control the activity content. This dataset comprises over 9,300 hours of untrimmed, continuous video, scripted to include diverse, simultaneous activities alongside spontaneous background activity.
6 PAPERS • NO BENCHMARKS YET
The MLB-YouTube dataset is a new, large-scale dataset consisting of 20 baseball games from the 2017 MLB post-season available on YouTube, with over 42 hours of video footage. The dataset consists of two components: segmented videos for activity recognition and continuous videos for activity detection. It is quite challenging because it is built from TV broadcasts of baseball games, in which many different activities share the same camera angle and the motion and appearance differences between activities are quite small.
5 PAPERS • NO BENCHMARKS YET
Toyota Smarthome Untrimmed (TSU) is a dataset for activity detection in long untrimmed videos. It contains 536 videos with an average duration of 21 minutes. Because it is based on the same footage as the Toyota Smarthome Trimmed version, it shares that dataset's challenges and introduces additional ones. The videos are annotated with 51 activities.
4 PAPERS • 1 BENCHMARK
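Untrimmed datasets like TSU are typically annotated at frame level, and a common preprocessing step for detection is collapsing runs of identical frame labels into (start, end, activity) segments. A minimal sketch of that step, assuming one label per frame with `None` for background (the label names here are illustrative, not TSU's actual classes):

```python
from itertools import groupby

def frames_to_segments(frame_labels):
    """Collapse a per-frame label sequence into (start, end, label) segments.

    `end` is exclusive; frames labeled None (background) are skipped.
    """
    segments = []
    idx = 0
    for label, run in groupby(frame_labels):
        length = len(list(run))
        if label is not None:
            segments.append((idx, idx + length, label))
        idx += length
    return segments

# Illustrative 6-frame sequence with a background gap between activities.
labels = ["drink", "drink", None, "read", "read", "read"]
segs = frames_to_segments(labels)  # -> [(0, 2, 'drink'), (3, 6, 'read')]
```

The same segment representation is what temporal-detection metrics (e.g. segment-level IoU) are computed over, which is why most untrimmed-video pipelines convert to it early.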
A large-scale dataset of daily-living activities performed in a natural manner.
3 PAPERS • NO BENCHMARKS YET
An abnormal-activity dataset for research use that contains 483,566 annotated frames.
2 PAPERS • NO BENCHMARKS YET
Home Action Genome is a large-scale multi-view video database of indoor daily activities. Every activity is captured by synchronized multi-view cameras, including an egocentric view. There are 30 hours of video with 70 classes of daily activities and 453 classes of atomic actions.
1 PAPER • 1 BENCHMARK
40,764 images (11,659 protest images plus hard negatives) annotated with various visual attributes and sentiments.
1 PAPER • NO BENCHMARKS YET
The DAHLIA dataset is devoted to human activity recognition, a major issue for adapting smart-home services such as user assistance. DAHLIA was realized on the Mobile Mii Platform by CEA LIST and was partly supported by the ITEA 3 EmoSpaces project (https://itea3.org/project/emospaces.html).
0 PAPERS • NO BENCHMARKS YET