Animal Kingdom is a large and diverse dataset that provides multiple annotated tasks to enable a more thorough understanding of natural animal behaviors. Its footage of wild animals was recorded at different times of day in a wide range of environments, with variation in background, viewpoint, illumination, and weather. Specifically, the dataset contains 50 hours of annotated video for the video grounding task (localizing relevant animal behavior segments in long videos), 30K video sequences for the fine-grained multi-label action recognition task, and 33K frames for the pose estimation task, covering 850 species across 6 major animal classes.
15 PAPERS • 2 BENCHMARKS
COCO-MLT is created from MS COCO-2017 and contains 1,909 images from 80 classes. The number of training images per class ranges from a maximum of 1,128 to a minimum of 6. We evaluate on the COCO2017 test set, which has 5,000 images. The ratio of head, medium, and tail classes in COCO-MLT is 22:33:25.
12 PAPERS • 2 BENCHMARKS
We construct the long-tailed version of VOC from its 2012 train-val set. It contains 1,142 images from 20 classes, with a maximum of 775 images per class and a minimum of 4 images per class. The ratio of head, medium, and tail classes after splitting is 6:6:8. We evaluate performance on the VOC2007 test set, which has 4,952 images.
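
As a rough illustration of how a head, medium, and tail split like the one described above could be computed from per-class image counts, here is a minimal Python sketch. The function name `split_by_frequency`, the thresholds (more than 100 images for head, fewer than 20 for tail), and the toy counts are all assumptions for illustration; the description does not state the cutoffs actually used.

```python
def split_by_frequency(class_counts, head_min=100, tail_max=20):
    """Group class names into head/medium/tail by training-image count.

    head_min and tail_max are hypothetical thresholds, not values
    taken from the dataset description.
    """
    groups = {"head": [], "medium": [], "tail": []}
    for name, count in class_counts.items():
        if count > head_min:
            groups["head"].append(name)
        elif count < tail_max:
            groups["tail"].append(name)
        else:
            groups["medium"].append(name)
    return groups


# Toy per-class counts, invented for this example (not real dataset statistics).
toy_counts = {"person": 775, "dog": 150, "chair": 60, "sofa": 20, "bus": 12, "cow": 4}

groups = split_by_frequency(toy_counts)
# Number of classes in each group, in head:medium:tail order.
ratio = tuple(len(groups[k]) for k in ("head", "medium", "tail"))
print(ratio)
```

Applied to real per-class counts, the resulting tuple can be compared against a reported head:medium:tail ratio, keeping in mind that the result depends entirely on the chosen thresholds.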