…Each video is labelled with 3.91 step segments, where each segment lasts 14.91 seconds on average. In total, the dataset contains 476 hours of video, with 46,354 annotated segments.
78 PAPERS • 2 BENCHMARKS
…Sequences are annotated with more than 100K coarse and 1M fine-grained action segments, and 18M 3D hand poses. We benchmark on three action understanding tasks: recognition, anticipation and temporal segmentation. Additionally, we propose a novel task of detecting mistakes.
38 PAPERS • 4 BENCHMARKS
…We benchmark four foundational video understanding tasks: action recognition, action segmentation, object detection and multi-object tracking.
1 PAPER • NO BENCHMARKS YET
…Sample videos of the JIGSAWS tasks can be downloaded from the official webpage. Manual annotations including: gesture (atomic surgical activity segment labels); skill (global rating score using modified…
91 PAPERS • 3 BENCHMARKS