Human-Animal-Cartoon (HAC) dataset consists of seven actions (‘sleeping’, ‘watching tv’, ‘eating’, ‘drinking’, ‘swimming’, ‘running’, and ‘opening door’) performed by humans, animals, and cartoon figures, forming three different domains. 3381 video clips are collected from the internet with around 1000 for each domain and three modalities are provided in the dataset: video, audio, and optical flow.
1 PAPER • NO BENCHMARKS YET