4 dataset results for segmentation AND Temporal Action Localization

…Each video is labelled with 3.91 step segments on average, where each segment lasts 14.91 seconds on average. In total, the dataset contains 476 hours of video, with 46,354 annotated segments.

78 PAPERS • 2 BENCHMARKS

IKEA ASM

A three-million-frame, multi-view furniture assembly video dataset that includes depth, atomic actions, object segmentation, and human pose.

16 PAPERS • NO BENCHMARKS YET

Perception Test

…The videos are densely annotated with six types of labels: object and point tracks, temporal action and sound segments, multiple-choice video question-answers and grounded video question-answers.

4 PAPERS • NO BENCHMARKS YET

EPIC-KITCHENS-100

…EPIC-KITCHENS-55), EPIC-KITCHENS-100 has been annotated using a novel pipeline that allows denser (54% more actions per minute) and more complete annotations of fine-grained actions (128% more action segments…

134 PAPERS • 7 BENCHMARKS
