Scaling Egocentric Vision: The EPIC-KITCHENS Dataset

epic-kitchens/annotations ECCV 2018

First-person vision is gaining interest as it offers a unique viewpoint on people's interaction with objects, their attention, and even intention.

Rescaling Egocentric Vision

fpv-iplab/rulstm 23 Jun 2020

This paper introduces the pipeline used to scale EPIC-KITCHENS, the largest dataset in egocentric vision.

Rolling-Unrolling LSTMs for Action Anticipation from First-Person Video

fpv-iplab/rulstm 4 May 2020

The experiments show that the proposed architecture is state-of-the-art in the domain of egocentric videos, achieving top performance in the 2019 EPIC-Kitchens egocentric action anticipation challenge.

Pedestrian Action Anticipation using Contextual Feature Fusion in Stacked RNNs

aras62/SF-GRU 13 May 2020

We propose a solution for the problem of pedestrian action anticipation at the point of crossing.

Video Representation Learning with Visual Tempo Consistency

decisionforce/VTHCL 28 Jun 2020

Visual tempo, which describes how fast an action goes, has shown its potential in supervised action recognition.

RED: Reinforced Encoder-Decoder Networks for Action Anticipation

rajskar/CS763Project 16 Jul 2017

RED takes multiple history representations as input and learns to anticipate a sequence of future representations.

Encouraging LSTMs to Anticipate Actions Very Early

mangalutsav/Multi-Stage-LSTM-for-Action-Anticipation ICCV 2017

In contrast to the widely studied problem of recognizing an action given a complete sequence, action anticipation aims to identify the action from only partially available videos.

Forecasting Human-Object Interaction: Joint Prediction of Motor Attention and Actions in First Person Video

2020aptx4869lm/Forecasting-Human-Object-Interaction-in-FPV ECCV 2020

We adopt intentional hand movement as a future representation and propose a novel deep network that jointly models and predicts the egocentric hand motion, interaction hotspots, and future action.

Higher Order Recurrent Space-Time Transformer

CorcovadoMing/HORST 17 Apr 2021

Endowing visual agents with predictive capability is a key step towards video intelligence at scale.

