Recently, few-shot learning has received increasing interest.
However, most existing models developed for these tasks are pre-trained on general video action classification tasks.
In this challenge, action recognition is posed as the problem of simultaneously predicting a single `verb' and `noun' class label given an input trimmed video clip.
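This verb–noun formulation can be pictured as a shared clip representation feeding two independent classification heads. The sketch below is a minimal illustration with made-up dimensions and random weights (the feature size and class counts are placeholders, not the challenge's actual vocabulary sizes):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
feat_dim, n_verbs, n_nouns = 128, 10, 20  # illustrative sizes only

# Random stand-ins for a learned backbone and two linear heads.
W_verb = rng.normal(size=(feat_dim, n_verbs))
W_noun = rng.normal(size=(feat_dim, n_nouns))

clip_feature = rng.normal(size=(feat_dim,))  # pooled representation of one trimmed clip

# The two labels are predicted simultaneously from the same feature.
verb_probs = softmax(clip_feature @ W_verb)
noun_probs = softmax(clip_feature @ W_noun)
verb_id, noun_id = int(verb_probs.argmax()), int(noun_probs.argmax())
```

The key point is that a single forward pass yields both a verb and a noun distribution; the predicted action is the (verb, noun) pair.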
Departing from existing alternatives, our W3 module models all three facets of video attention jointly.
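To make "joint modeling of three facets" concrete, here is a minimal sketch assuming the facets are channel (`what'), spatial (`where'), and temporal (`when') attention; the parameter-free pooled gates below are purely illustrative and not the W3 module's actual design:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
T, C, H, W = 8, 16, 7, 7                 # frames, channels, spatial grid (toy sizes)
feats = rng.normal(size=(T, C, H, W))

# One gate per facet, computed from pooled feature statistics.
channel_gate  = sigmoid(feats.mean(axis=(0, 2, 3)))   # (C,)   -- `what'
spatial_gate  = sigmoid(feats.mean(axis=(0, 1)))      # (H, W) -- `where'
temporal_gate = sigmoid(feats.mean(axis=(1, 2, 3)))   # (T,)   -- `when'

# Joint modulation: all three gates rescale the feature volume in one step,
# rather than applying each facet in isolation.
attended = (feats
            * channel_gate[None, :, None, None]
            * spatial_gate[None, None, :, :]
            * temporal_gate[:, None, None, None])
```

The broadcasted product is what makes the attention joint: every feature value is scaled by its channel, spatial, and temporal gates simultaneously.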
In this paper, we introduce the task of retrieving relevant video moments from a large corpus of untrimmed, unsegmented videos given a natural language query.
The guest tasks focused on complementary aspects of the activity recognition problem at large scale and involved three challenging and recently compiled datasets: the Kinetics-600 dataset from Google DeepMind, the AVA dataset from Berkeley and Google, and the Moments in Time dataset from MIT and IBM Research.
Despite the recent progress in video understanding and the continuous rate of improvement in temporal action localization throughout the years, it is still unclear how far (or close?) we are to solving the problem.
Second, we propose an actor-based attention mechanism that enables the localization of the actions from action class labels and actor proposals and is end-to-end trainable.
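The idea of attending over actor proposals can be sketched as dot-product attention between an action-class embedding and per-actor features; all names and dimensions below are hypothetical stand-ins, not the paper's architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
n_actors, feat_dim = 5, 64
actor_feats = rng.normal(size=(n_actors, feat_dim))   # one feature vector per actor proposal
action_query = rng.normal(size=(feat_dim,))           # embedding of the action class label

# Scaled dot-product relevance of each actor proposal to the queried action.
weights = softmax(actor_feats @ action_query / np.sqrt(feat_dim))

# The highest-weight proposal localizes the action among the actors;
# the weighted sum gives an attention-pooled action representation.
best_actor = int(weights.argmax())
action_repr = weights @ actor_feats
```

Because the attention weights are a differentiable function of the features, such a mechanism can be trained end-to-end from class labels alone, which is the property the excerpt highlights.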
This summary of the ActivityNet Large Scale Activity Recognition Challenge 2017 covers the results and the challenge participants' papers.
Despite the recent advances in large-scale video analysis, action detection remains one of the most challenging unsolved problems in computer vision.
Our paper presents a new approach for temporal detection of human actions in long, untrimmed video sequences.
In spite of many dataset efforts for human action recognition, current computer vision algorithms are still severely limited in terms of the variability and complexity of the actions that they can recognize.
One of the cornerstone principles of deep models is their abstraction capacity, i.e., their ability to learn abstract concepts from `simpler' ones.