AP-10K is the first large-scale benchmark for general animal pose estimation. It consists of 10,015 images collected and filtered from 23 animal families and 60 species following the taxonomic rank, with high-quality keypoint annotations that were labeled and checked manually.
31 PAPERS • 1 BENCHMARK
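The AP-10K release distributes its keypoint labels as COCO-style JSON. Assuming that format, the minimal sketch below loads one annotation file and inspects the keypoint layout; the file path and split name are illustrative and may not match the actual release layout.

```python
import json

# Hypothetical path; the actual file name depends on how the release is unpacked.
ANNOTATION_FILE = "ap10k/annotations/train_split1.json"

with open(ANNOTATION_FILE) as f:
    coco = json.load(f)  # COCO-style dict with "images", "annotations", "categories"

print(f"{len(coco['images'])} images, {len(coco['annotations'])} annotated instances")

# In the COCO keypoint convention, each category lists its keypoint names.
for cat in coco["categories"][:3]:
    print(cat["id"], cat["name"], cat.get("keypoints"))

# Per-instance keypoints are stored as flat [x1, y1, v1, x2, y2, v2, ...] triplets.
kpts = coco["annotations"][0]["keypoints"]
print(list(zip(kpts[0::3], kpts[1::3], kpts[2::3])))
```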
Animal Kingdom is a large and diverse dataset that provides multiple annotated tasks to enable a more thorough understanding of natural animal behaviors. The wild animal footage used in the dataset records different times of the day in an extensive range of environments containing variations in backgrounds, viewpoints, illumination and weather conditions. More specifically, the dataset contains 50 hours of annotated videos to localize relevant animal behavior segments in long videos for the video grounding task, 30K video sequences for the fine-grained multi-label action recognition task, and 33K frames for the pose estimation task, which correspond to a diverse range of animals with 850 species across 6 major animal classes.
21 PAPERS • 2 BENCHMARKS
Animal-Pose Dataset is an animal pose dataset built to facilitate training and evaluation. It provides pose annotations for five categories (dog, cat, cow, horse, and sheep), with more than 6,000 instances in over 4,000 images. In addition, the dataset contains bounding box annotations for seven other animal categories.
18 PAPERS • 1 BENCHMARK
An 'in the wild' dataset of 20,580 dog images for which 2D joint and silhouette annotations were collected.
12 PAPERS • 1 BENCHMARK
The ATRW Dataset contains over 8,000 video clips from 92 Amur tigers, with bounding box, pose keypoint, and tiger identity annotations.
8 PAPERS • NO BENCHMARKS YET
Accurately estimating 3D pose and shape is an essential step towards understanding animal behavior, and it can potentially benefit many downstream applications, such as wildlife conservation. However, research in this area has been held back by the lack of a comprehensive and diverse dataset with high-quality 3D pose and shape annotations. Animal3D is the first comprehensive dataset for mammalian 3D pose and shape estimation. It consists of 3,379 images collected from 40 mammal species, high-quality annotations of 26 keypoints, and, importantly, the pose and shape parameters of the SMAL model. All annotations were labeled and checked manually in a multi-stage process to ensure the highest-quality results. Based on the Animal3D dataset, the authors benchmark representative shape and pose estimation models in three settings: (1) supervised learning from only the Animal3D data, (2) synthetic-to-real transfer from synthetically generated images, and (3) fine-tuning human pose and shape estimation models.
8 PAPERS • 1 BENCHMARK
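The Animal3D entry above describes per-image annotations of 26 keypoints plus SMAL pose and shape parameters. The sketch below shows one way such a record might be represented, together with a simple 2D keypoint error over visible points; the field names, array shapes, and metric are illustrative assumptions, not the dataset's actual schema or official evaluation protocol.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Animal3DSample:
    """Illustrative per-image record; field names are assumptions, not the release schema."""
    image_path: str
    keypoints_2d: np.ndarray  # (26, 3): x, y, visibility flag
    smal_pose: np.ndarray     # per-joint axis-angle parameters of the SMAL model
    smal_betas: np.ndarray    # low-dimensional SMAL shape coefficients


def mean_keypoint_error(pred_2d: np.ndarray, gt: np.ndarray) -> float:
    """Mean Euclidean 2D error over ground-truth keypoints marked visible."""
    visible = gt[:, 2] > 0
    if not visible.any():
        return float("nan")
    diff = pred_2d[visible] - gt[visible, :2]
    return float(np.linalg.norm(diff, axis=1).mean())
```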
Horse-10 is an animal pose estimation dataset. It comprises 30 diverse Thoroughbred horses, for which 22 body parts were labeled by an expert in 8,114 frames. The horses have various coat colors, and the "in-the-wild" nature of the data, collected at various Thoroughbred yearling sales and farms, adds additional complexity. The authors also introduce Horse-C to contrast the domain shift inherent in the Horse-10 dataset with domain shift induced by common image corruptions.
6 PAPERS • 1 BENCHMARK
AwA Pose is a large-scale animal keypoint dataset with ground-truth annotations for keypoint detection of quadruped animals in images.
4 PAPERS • NO BENCHMARKS YET
Three wild-type (C57BL/6J) male mice ran on a paper spool while following odor trails (Mathis et al., 2018). These experiments were carried out in the laboratory of Venkatesh N. Murthy at Harvard University. Data were recorded at 30 Hz at a resolution of 640 x 480 pixels with a Point Grey Firefly FMVU-03MTM-CS camera. One human annotator was instructed to localize 12 keypoints (snout, left ear, right ear, shoulder, four spine points, tail base, and three tail points). All surgical and experimental procedures for the mice were in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals and were approved by the Harvard Institutional Animal Care and Use Committee. 161 frames were labeled, making this a real-world-sized laboratory dataset.
3 PAPERS • 1 BENCHMARK
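The mouse dataset above names its 12 keypoints explicitly. The snippet below simply fixes those names in an assumed canonical order and pairs them into a plausible skeleton for drawing pose overlays; both the ordering and the edge list are illustrative assumptions, not metadata shipped with the dataset.

```python
# The 12 keypoints named in the description, in an assumed canonical order.
MOUSE_KEYPOINTS = [
    "snout", "left_ear", "right_ear", "shoulder",
    "spine1", "spine2", "spine3", "spine4",
    "tail_base", "tail1", "tail2", "tail3",
]

# Hypothetical skeleton edges linking neighbouring points along the body axis,
# useful for visualizing predicted poses; not part of the dataset itself.
MOUSE_SKELETON = [
    ("left_ear", "snout"), ("right_ear", "snout"), ("snout", "shoulder"),
    ("shoulder", "spine1"), ("spine1", "spine2"), ("spine2", "spine3"),
    ("spine3", "spine4"), ("spine4", "tail_base"),
    ("tail_base", "tail1"), ("tail1", "tail2"), ("tail2", "tail3"),
]

assert len(MOUSE_KEYPOINTS) == 12 and len(MOUSE_SKELETON) == 11
```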
The dataset is designed specifically to address a range of computer vision problems (2D and 3D tracking, posture estimation) faced by biologists when designing behavioral studies with animals.
2 PAPERS • NO BENCHMARKS YET
Schools of inland silversides (Menidia beryllina, n = 14 individuals per school) were recorded in the Lauder Lab at Harvard University while swimming at 15 speeds (0.5 to 8 body lengths per second, BL/s, in 0.5 BL/s intervals) in a flow tank with a total working section of 28 x 28 x 40 cm, as described in previous work, at a constant temperature (18±1°C) and salinity (33 ppt) and a Reynolds number of approximately 10,000 (based on BL). Dorsal views of steady swimming across these speeds were recorded by high-speed video cameras (FASTCAM Mini AX50, Photron USA, San Diego, CA, USA) at 60-125 frames per second (feeding videos at 60 fps, swimming alone at 125 fps). The dorsal view was recorded from above the swim tunnel, and a floating Plexiglas panel at the water surface prevented surface ripples from interfering with the videos. Five keypoints were labeled (tip, gill, peduncle, dorsal fin tip, caudal tip). 100 frames were labeled, making this a real-world-sized laboratory dataset.
2 PAPERS • 1 BENCHMARK
Dataset page: https://github.com/mosamdabhi/MBW-Data
All animal procedures were overseen by veterinary staff of the MIT and Broad Institute Department of Comparative Medicine, in compliance with the NIH Guide for the Care and Use of Laboratory Animals, and were approved by the MIT and Broad Institute animal care and use committees. Video of common marmosets (Callithrix jacchus) was collected in the laboratory of Guoping Feng at MIT. Marmosets were recorded using Kinect V2 cameras (Microsoft) at 1080p resolution and a frame rate of 30 Hz. After acquisition, images used for training the network were manually cropped to 1000 x 1000 pixels or smaller. The dataset comprises 7,600 labeled frames from 40 different marmosets collected from 3 different colonies (in different facilities). Each cage contained a pair of marmosets, one of which had light blue dye applied to its tufts. One human annotator labeled 15 marker points on each animal present in a frame (frames contained either one or two animals).
Desert Locust is an animal pose estimation dataset for desert locusts.
1 PAPER • 1 BENCHMARK
MacaquePose is an animal pose estimation dataset containing images of macaque monkeys with manually labeled keypoint annotations.
Vinegar Fly is a pose estimation dataset for fruit flies.
Grévy’s Zebra is an animal pose estimation dataset for zebras.
0 PAPERS • NO BENCHMARKS YET