EGTEA (EGTEA Gaze+)

Introduced by Li et al. in In the Eye of Beholder: Joint Learning of Gaze and Actions in First Person Video

Extended GTEA Gaze+ (EGTEA Gaze+) is a large-scale dataset for FPV actions and gaze. It subsumes GTEA Gaze+ and comes with HD videos (1280x960), audio, gaze tracking data (30 Hz), frame-level action annotations, and pixel-level hand masks at sampled frames. Specifically, EGTEA Gaze+ contains 28 hours of de-identified cooking activities from 86 unique sessions of 32 subjects. Human annotations of actions (human-object interactions) and hand masks are also provided.
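
Because the gaze stream is sampled at 30 Hz, its samples may not line up one-to-one with video frames. The sketch below shows one simple way to pick the nearest gaze sample for a given frame. It assumes that both streams start at the same instant and run at constant rates, which the description above does not confirm. The function name and the `video_fps` parameter are hypothetical; the video frame rate is not stated here and must be supplied by the caller.

```python
def nearest_gaze_index(frame_idx: int, video_fps: float, gaze_hz: float = 30.0) -> int:
    """Return the index of the gaze sample closest in time to a video frame.

    Assumption (not from the dataset description): video and gaze start at
    t = 0 together and both run at constant rates.
    """
    if frame_idx < 0:
        raise ValueError("frame_idx must be non-negative")
    if video_fps <= 0 or gaze_hz <= 0:
        raise ValueError("rates must be positive")
    timestamp_s = frame_idx / video_fps   # time of the frame in seconds
    return round(timestamp_s * gaze_hz)   # nearest gaze sample index
```

If the released gaze files carry their own timestamps, matching on those timestamps would be more reliable than this constant-rate approximation.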

The action annotations include 10,325 instances of fine-grained actions, such as "Cut bell pepper" or "Pour condiment (from) condiment container into salad".

The hand annotations consist of 15,176 hand masks from 13,847 video frames.
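
To put these counts in perspective, the following sketch derives a few averages from the figures quoted above. The variable names are illustrative, and the results are plain arithmetic on the stated totals, not official dataset statistics.

```python
# Totals quoted in the dataset description above.
total_hours = 28
num_sessions = 86
num_action_instances = 10325
num_hand_masks = 15176
num_mask_frames = 13847

# Simple averages derived from those totals.
avg_session_minutes = total_hours * 60 / num_sessions
actions_per_session = num_action_instances / num_sessions
actions_per_hour = num_action_instances / total_hours
masks_per_annotated_frame = num_hand_masks / num_mask_frames

print(f"Average session length: {avg_session_minutes:.1f} min")
print(f"Action instances per session: {actions_per_session:.1f}")
print(f"Action instances per hour: {actions_per_hour:.2f}")
print(f"Hand masks per annotated frame: {masks_per_annotated_frame:.2f}")
```

On average, a session runs a little under 20 minutes and contains about 120 annotated action instances.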

Source: http://cbs.ic.gatech.edu/fpv/

Benchmarks

Task	Dataset Variant	Best Model
Egocentric Activity Recognition	EGTEA	LaViLa
Long-tail Learning	EGTEA	CDB-loss
Action Anticipation	EGTEA	InAViT

Similar Datasets

ObMan-Ego

EPIC-KITCHENS-100

EPIC-KITCHENS-55