…The dataset consists of two components: segmented videos for activity recognition and continuous videos for activity classification.
5 PAPERS • NO BENCHMARKS YET
…Each video is split into short action segments (mean duration 3.7s) with specific start and end times and a verb and noun annotation describing the action (e.g., ‘open fridge’).
37 PAPERS • 3 BENCHMARKS
…AVA Speech densely annotates audio-based speech activity in AVA v1.0 videos, and explicitly labels 3 background noise conditions, resulting in ~46K labeled segments spanning 45 hours of data.
98 PAPERS • 7 BENCHMARKS