HVU (Holistic Video Understanding)

Introduced by Diba et al. in Large Scale Holistic Video Understanding

HVU is organized hierarchically in a semantic taxonomy that treats multi-label, multi-task video understanding as a comprehensive problem encompassing the recognition of multiple semantic aspects of a dynamic scene. HVU contains approximately 572k videos in total, with 9 million annotations across the training, validation, and test sets, spanning 3142 labels. The semantic aspects are defined over categories of scenes, objects, actions, events, attributes, and concepts, which naturally capture real-world scenarios.
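To make the multi-label, multi-task framing concrete, the following is a minimal sketch of how one video's annotations might be turned into one multi-hot target vector per semantic category. The six category names come from the description above; the label vocabulary, the `encode` helper, and the example video are hypothetical and are not taken from the dataset's actual files or tooling.

```python
from typing import Dict, List

# The six semantic categories named in the HVU description.
CATEGORIES = ["scene", "object", "action", "event", "attribute", "concept"]

# Hypothetical toy vocabulary for illustration only;
# the real dataset spans 3142 labels in total.
VOCAB: Dict[str, List[str]] = {
    "scene": ["beach", "kitchen"],
    "object": ["ball", "knife"],
    "action": ["playing volleyball", "chopping"],
    "event": ["tournament"],
    "attribute": ["sunny"],
    "concept": ["sport"],
}


def encode(annotations: Dict[str, List[str]]) -> Dict[str, List[int]]:
    """Return one multi-hot vector per category (one task per category)."""
    targets: Dict[str, List[int]] = {}
    for category in CATEGORIES:
        present = set(annotations.get(category, []))
        unknown = present - set(VOCAB[category])
        if unknown:
            raise ValueError(f"unknown {category} labels: {sorted(unknown)}")
        # 1 where the label applies to the video, 0 otherwise.
        targets[category] = [int(label in present) for label in VOCAB[category]]
    return targets


# A single video can carry several labels across several categories.
video_labels = {
    "scene": ["beach"],
    "object": ["ball"],
    "action": ["playing volleyball"],
    "attribute": ["sunny"],
    "concept": ["sport"],
}
targets = encode(video_labels)
```

Keeping a separate vector per category mirrors the multi-task view, where each category can be treated as its own multi-label problem. A category with no annotation for a given video, such as `event` here, simply yields an all-zero vector.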

Source: Large Scale Holistic Video Understanding



