The EPIC-KITCHENS-55 dataset comprises 432 egocentric videos recorded at 60 fps by 32 participants in their own kitchens with a head-mounted camera. There is no guiding script: participants freely perform kitchen activities such as cooking, food preparation, and washing up. Each video is split into short action segments (mean duration 3.7 s) with specific start and end times and a verb-noun annotation describing the action (e.g. 'open fridge'). There are 125 verb classes and 331 noun classes. The dataset is divided into one train split and two test splits.
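The per-segment annotation described above (start/end times plus a verb and a noun, each mapped to a class index) can be sketched as a simple record. This is an illustrative data structure, not the dataset's exact CSV schema; all field names are assumptions:

```python
from dataclasses import dataclass

@dataclass
class ActionSegment:
    """One trimmed action segment from an EPIC-KITCHENS-55 video.

    Field names are illustrative and need not match the official
    annotation files.
    """
    video_id: str      # e.g. "P01_01" (participant + recording)
    start_sec: float   # segment start time in seconds
    stop_sec: float    # segment end time in seconds
    verb: str          # open-vocabulary verb, e.g. "open"
    noun: str          # open-vocabulary noun, e.g. "fridge"
    verb_class: int    # index into the 125 verb classes
    noun_class: int    # index into the 331 noun classes

    @property
    def duration(self) -> float:
        """Segment length in seconds (dataset mean is about 3.7 s)."""
        return self.stop_sec - self.start_sec

# Hypothetical example segment for the action 'open fridge'.
seg = ActionSegment("P01_01", 12.0, 15.7, "open", "fridge", 2, 11)
print(f"{seg.verb} {seg.noun}: {seg.duration:.1f}s")
```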
Source: Egocentric Hand Track and Object-based Human Action Recognition