The Human3.6M dataset is one of the largest motion capture datasets, which consists of 3.6 million human poses and corresponding images captured by a high-speed motion capture system. There are 4 high-resolution progressive scan cameras to acquire video data at 50 Hz. The dataset contains activities by 11 professional actors in 17 scenarios: discussion, smoking, taking photo, talking on the phone, etc., as well as provides accurate 3D joint positions and high-resolution videos.
719 PAPERS • 16 BENCHMARKS
DeepFashion is a dataset containing around 800K diverse fashion images with their rich annotations (46 categories, 1,000 descriptive attributes, bounding boxes and landmark information) ranging from well-posed product images to real-world-like consumer photos.
364 PAPERS • 6 BENCHMARKS
Thai-Chi-HD is a high resolution dataset which can be used as reference benchmark for evaluating frameworks for image animation and video generation. It consists of cropped videos of full human bodies performing Tai Chi actions.
9 PAPERS • 2 BENCHMARKS