Activity Recognition
254 papers with code • 4 benchmarks • 29 datasets
Human Activity Recognition is the problem of identifying events performed by humans given a video input. It is formulated as a binary (or multiclass) classification problem of outputting activity class labels. Activity Recognition is an important problem with many societal applications including smart surveillance, video search/retrieval, intelligent robots, and other monitoring systems.
Source: Learning Latent Sub-events in Activity Videos Using Temporal Attention Filters
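The classification formulation above can be illustrated with a minimal sketch. It assumes each video has already been reduced to per-frame feature vectors, averages them over time, and applies a linear classifier followed by a softmax. The function names, the two-dimensional features, the weights, and the three labels are all hypothetical values chosen for illustration; they are not taken from the cited paper or from any model listed on this page.

```python
import math

def mean_pool(frames):
    """Average per-frame feature vectors over time into one clip-level vector."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify_clip(frames, weights, biases, labels):
    """Return the most probable activity label and the full distribution."""
    clip = mean_pool(frames)
    # One linear score per class: dot(weight_row, clip) + bias.
    scores = [
        sum(w * x for w, x in zip(row, clip)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs

# Hypothetical toy inputs: three frames, each with a 2-D feature vector.
frames = [[0.9, 0.1], [0.8, 0.2], [1.0, 0.0]]
labels = ["walking", "running", "sitting"]
weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # hand-picked, not learned
biases = [0.0, 0.0, 0.0]

label, probs = classify_clip(frames, weights, biases, labels)
print(label, [round(p, 3) for p in probs])
```

With two labels the same code covers the binary case. In practice the hand-picked weights would be replaced by a trained video model, and the mean pooling by whatever temporal aggregation that model uses.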
Most implemented papers
Kernel Cross-Correlator
Cross-correlator plays a significant role in many visual perception tasks, such as object detection and tracking.
Fine-grained Activity Recognition in Baseball Videos
In this paper, we introduce a challenging new dataset, MLB-YouTube, designed for fine-grained activity detection.
Eidetic 3D LSTM: A Model for Video Prediction and Beyond
We first evaluate the E3D-LSTM network on widely used future video prediction datasets and achieve state-of-the-art performance.
Large-scale weakly-supervised pre-training for video action recognition
Frame-based models perform quite well on action recognition; is pre-training for good image features sufficient, or is pre-training for spatio-temporal features valuable for optimal transfer learning?
NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding
Research on depth-based human activity analysis achieved outstanding performance and demonstrated the effectiveness of 3D representation for action recognition.
Towards Fairness in Visual Recognition: Effective Strategies for Bias Mitigation
We design a simple but surprisingly effective visual recognition benchmark for studying bias mitigation.
SeCo: Exploring Sequence Supervision for Unsupervised Representation Learning
In this paper, we compose a trilogy of exploring the basic and generic supervision in the sequence from spatial, spatiotemporal and sequential perspectives.
A Probabilistic Logic Programming Event Calculus
The input of our system is a set of time-stamped short-term activities (STA) detected on video frames.
Zero-Shot Activity Recognition with Verb Attribute Induction
In this paper, we investigate large-scale zero-shot activity recognition by modeling the visual and linguistic attributes of action verbs.
PoseTrack: A Benchmark for Human Pose Estimation and Tracking
In this work, we aim to further advance the state of the art by establishing "PoseTrack", a new large-scale benchmark for video-based human pose estimation and articulated tracking, and bringing together the community of researchers working on visual human analysis.