12 papers with code • 2 benchmarks • 3 datasets
First-person vision is gaining interest as it offers a unique viewpoint on people's interactions with objects, their attention, and even their intentions.
The experiments show that the proposed architecture is state-of-the-art in the domain of egocentric videos, achieving top performance in the 2019 EPIC-Kitchens egocentric action anticipation challenge.
Our method is ranked first in the public leaderboard of the EPIC-Kitchens egocentric action anticipation challenge 2019.
Ranked #2 on Egocentric Activity Recognition on EPIC-KITCHENS-55
To this end, we propose a solution to the problem of pedestrian action anticipation at the point of crossing.
RED takes multiple history representations as input and learns to anticipate a sequence of future representations.
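The scheme described above, encoding a history of feature representations and decoding a sequence of anticipated future ones, can be illustrated with a toy sketch. The real RED model uses a learned recurrent encoder-decoder; the linear-extrapolation rule, function names, and data below are illustrative assumptions, not the paper's method:

```python
# Toy sketch of anticipating future representations from history ones.
# RED learns this mapping with a recurrent encoder-decoder; here a
# simple linear-extrapolation rule stands in for the learned decoder.

from typing import List

Vector = List[float]

def anticipate(history: List[Vector], horizon: int) -> List[Vector]:
    """Predict `horizon` future representations from the history.

    Illustrative rule (assumption): continue the step between the last
    two observed representations.
    """
    assert len(history) >= 2, "need at least two history representations"
    last, prev = history[-1], history[-2]
    step = [a - b for a, b in zip(last, prev)]
    future = []
    current = last
    for _ in range(horizon):
        current = [c + s for c, s in zip(current, step)]
        future.append(current)
    return future

# Usage: three 2-D history features drifting by (+1.0, +0.5) per step.
hist = [[0.0, 0.0], [1.0, 0.5], [2.0, 1.0]]
print(anticipate(hist, horizon=2))  # → [[3.0, 1.5], [4.0, 2.0]]
```

The point of the interface is that the output lives in the same representation space as the input, so a downstream classifier can be applied to the anticipated representations.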
In contrast to the widely studied problem of recognizing an action given a complete sequence, action anticipation aims to identify the action from only partially available videos.
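The contrast above, recognition from a complete sequence versus anticipation from a partial one, can be made concrete with a toy nearest-prototype classifier; all labels, features, and values below are synthetic illustrations:

```python
# Toy contrast between recognition (all frames available) and
# anticipation (only an observed prefix). Data are synthetic.

def mean_feature(frames):
    """Average the per-frame scalar features."""
    return sum(frames) / len(frames)

def classify(frames, prototypes):
    """Return the label whose prototype is closest to the mean feature."""
    m = mean_feature(frames)
    return min(prototypes, key=lambda label: abs(prototypes[label] - m))

prototypes = {"open_fridge": 1.0, "cut_vegetable": 5.0}
video = [0.8, 1.1, 4.9, 5.2, 5.0, 5.1]  # the action evolves over time

full = classify(video, prototypes)         # recognition: whole sequence
partial = classify(video[:2], prototypes)  # anticipation: 2-frame prefix
print(full, partial)  # → cut_vegetable open_fridge
```

The prefix-only prediction can disagree with the full-sequence one, which is exactly why anticipation is studied as a distinct, harder problem.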
Motivated by this, we adopt intentional hand movement as a future representation and propose a novel deep network that jointly models and predicts the egocentric hand motion, interaction hotspots and future action.
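Jointly modeling several future targets, as in the hand motion, interaction hotspots, and future action above, is typically trained with a weighted multi-task objective. The weights and error terms below are illustrative assumptions, not values from the paper:

```python
# Hedged sketch: a weighted multi-task objective combining per-task
# errors for hand motion, interaction hotspots, and future action.
# The weights below are illustrative, not the paper's settings.

def multitask_loss(hand_err, hotspot_err, action_err,
                   w_hand=1.0, w_hotspot=1.0, w_action=1.0):
    """Combine per-task errors into a single training objective."""
    return (w_hand * hand_err
            + w_hotspot * hotspot_err
            + w_action * action_err)

print(multitask_loss(0.2, 0.5, 1.0))  # → 1.7
```

Sharing one backbone under such a joint loss lets the auxiliary targets (hand motion, hotspots) act as future representations that regularize the action-anticipation head.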