Spatio-temporal Co-Occurrence Characterizations for Human Action Classification

Human action classification is a widely researched topic and remains an open problem. Many state-of-the-art approaches use a bag-of-video-words over spatio-temporal local features to construct characterizations of human actions. To improve on this standard approach, we investigate co-occurrences between local features and propose using co-occurrence information to characterize human actions. A trade-off factor is used to define an optimal balance between vocabulary size and classification rate. Next, a spatio-temporal co-occurrence technique is applied to extract co-occurrence information between labeled local features. Novel characterizations of human actions are then constructed: a vector-quantized correlogram-elements vector, a highly discriminative PCA (Principal Component Analysis) co-occurrence vector, and a Haralick texture vector. A multi-channel kernel SVM (support vector machine) is used for classification. For evaluation, we use the well-known KTH dataset and the challenging UCF-Sports action dataset. We obtain state-of-the-art classification performance and demonstrate that co-occurrence information can be fully utilized to improve on the standard bag-of-video-words approach.
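The co-occurrence step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each local feature has an (x, y, t) position and a visual-word label, treats two features as co-occurring when they lie within a fixed Euclidean radius, and computes four Haralick-style statistics (energy, contrast, homogeneity, entropy) from the normalized matrix. The function names, the radius criterion, and the choice of statistics are all hypothetical.

```python
import numpy as np

def cooccurrence_matrix(points, labels, vocab_size, radius):
    """Count ordered label pairs whose (x, y, t) positions lie within `radius`.

    Assumption: a plain Euclidean neighbourhood; the paper may define
    spatio-temporal proximity differently.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances in (x, y, t) space.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    near = dist <= radius
    np.fill_diagonal(near, False)  # a feature does not co-occur with itself
    i, j = np.nonzero(near)
    C = np.zeros((vocab_size, vocab_size))
    np.add.at(C, (labels[i], labels[j]), 1)  # accumulate repeated pairs
    return C

def texture_stats(C):
    """Return [energy, contrast, homogeneity, entropy] of a co-occurrence matrix."""
    total = C.sum()
    if total == 0:
        return np.zeros(4)
    P = C / total  # normalize counts to a joint distribution
    a, b = np.indices(P.shape)
    energy = (P ** 2).sum()
    contrast = ((a - b) ** 2 * P).sum()
    homogeneity = (P / (1.0 + (a - b) ** 2)).sum()
    nz = P[P > 0]
    entropy = -(nz * np.log(nz)).sum()
    return np.array([energy, contrast, homogeneity, entropy])

# Toy input: two nearby features with different labels, one far away.
points = [[0, 0, 0], [1, 0, 0], [10, 10, 10]]
labels = [0, 1, 1]
C = cooccurrence_matrix(points, labels, vocab_size=2, radius=2.0)
stats = texture_stats(C)
```

Note that contrast and homogeneity depend on the numeric ordering of the labels, which is arbitrary for a visual vocabulary, so a real pipeline would need to address that; the sketch does not.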
