…The dataset offers high-quality, pixel-level segmentations of hands and the possibility to semantically distinguish between the observer’s hands and someone else’s hands, as well as left and right hands.
30 PAPERS • NO BENCHMARKS YET
…More specifically, the dataset contains 50 hours of annotated videos to localize relevant animal behavior segments in long videos for the video grounding task, 30K video sequences for the fine-grained
15 PAPERS • 2 BENCHMARKS
…Each case comprises kinematic data, a video, a semantic segmentation of each frame, and a workflow annotation.
3 PAPERS • 6 BENCHMARKS
…We benchmark four foundational video understanding tasks: action recognition, action segmentation, object detection, and multi-object tracking.
1 PAPER • NO BENCHMARKS YET