Action Anticipation

34 papers with code • 6 benchmarks • 8 datasets

Next action anticipation is defined as observing frames 1, ..., T and predicting the action that begins T_a seconds after the last observed frame. The anticipated action is not visible in the observed frames: it starts only after the T_a-second gap. Here, T_a = 1 second.
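The timing in this definition can be made concrete with a small sketch. The function name `last_observed_frame`, its parameters, and the convention that frame t (1-based) covers the interval [(t-1)/fps, t/fps) are illustrative assumptions, not part of any benchmark's official tooling.

```python
import math


def last_observed_frame(action_start_s: float, fps: float, gap_s: float = 1.0) -> int:
    """Return the 1-based index T of the last frame a model may observe.

    Hypothetical helper: the observed clip must end gap_s seconds (T_a)
    before the annotated start of the action to be anticipated.
    """
    observe_until_s = action_start_s - gap_s
    # Assumed convention: frame t covers [(t-1)/fps, t/fps), so the last
    # frame that ends at or before observe_until_s has index floor(time * fps).
    last_frame = math.floor(observe_until_s * fps)
    if last_frame < 1:
        raise ValueError("action starts too early to leave an anticipation gap")
    return last_frame


if __name__ == "__main__":
    # An action starting at 5.0 s in a 30 fps video, with T_a = 1 s.
    print(last_observed_frame(5.0, fps=30.0))
```

Under these assumptions the model sees frames 1 through T only, and every frame inside the T_a-second gap, as well as the action itself, is withheld.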

Benchmarks

These leaderboards track progress in Action Anticipation.

Dataset	Best Model
EPIC-KITCHENS-100 (test)	InAViT
EPIC-KITCHENS-55 (Seen test set (S1))	Abstract Goal
EPIC-KITCHENS-55 (Unseen test set (S2))	Abstract Goal
EPIC-KITCHENS-100	InAViT
Assembly101	Goal Consistency
EGTEA	InAViT

Latest papers

Action Scene Graphs for Long-Form Understanding of Egocentric Videos

fpv-iplab/easg • 6 Dec 2023

We present Egocentric Action Scene Graphs (EASGs), a new representation for long-form understanding of egocentric videos.

Object-centric Video Representation for Long-term Action Anticipation

brown-palm/ObjectPrompt • 31 Oct 2023

To recognize and predict human-object interactions, we use a Transformer-based neural architecture which allows the "retrieval" of relevant objects for action anticipation at various time scales.

Palm: Predicting Actions through Language Models @ Ego4D Long-Term Action Anticipation Challenge 2023

dandoge/palm • 28 Jun 2023

We present Palm, a solution to the Long-Term Action Anticipation (LTA) task utilizing vision-language and large language models.

Action Anticipation with Goal Consistency

olga-zats/goal_consistency • 26 Jun 2023

In this paper, we address the problem of short-term action anticipation, i.e., we want to predict an upcoming action one second before it happens.

Enhancing Next Active Object-based Egocentric Action Anticipation with Guided Attention

sanketsans/ganov2 • 22 May 2023

To this end, we propose a novel approach that applies a guided attention mechanism between the objects, and the spatiotemporal features extracted from video clips, enhancing the motion and contextual information, and further decoding the object-centric and motion-centric information to address the problem of STA in egocentric videos.

Fine-grained Affordance Annotation for Egocentric Hand-Object Interaction Videos

zch-yu/epic-affordance-annotation • 7 Feb 2023

Object affordance is an important concept in hand-object interaction, providing information on action possibilities based on human motor capacity and objects' physical properties, thus benefiting tasks such as action anticipation and robot imitation learning.

Anticipative Feature Fusion Transformer for Multi-Modal Action Anticipation

zeyun-zhong/afft • 23 Oct 2022

Although human action anticipation is a task which is inherently multi-modal, state-of-the-art methods on well-known action anticipation datasets leverage this data by applying ensemble methods and averaging scores of unimodal anticipation networks.
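The score-averaging ensemble mentioned in this abstract can be sketched as follows. This illustrates only the baseline the abstract describes, not the paper's proposed fusion method; the function names, the modality names (`rgb`, `flow`), and the logit values are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def average_late_fusion(per_modality_logits):
    """Average class probabilities across unimodal networks.

    per_modality_logits: one list of class logits per modality,
    all of the same length. Returns (fused_probabilities, best_class).
    """
    probs = [softmax(logits) for logits in per_modality_logits]
    n_classes = len(probs[0])
    fused = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    best_class = max(range(n_classes), key=fused.__getitem__)
    return fused, best_class


if __name__ == "__main__":
    # Hypothetical logits over three action classes from two unimodal networks.
    rgb = [2.0, 0.0, 0.0]
    flow = [0.0, 1.0, 0.0]
    fused, best = average_late_fusion([rgb, flow])
    print(fused, best)
```

Each modality is scored independently and combined only at the output, so no network can use information from another modality while forming its features.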
