VOCASET is a 4D face dataset containing about 29 minutes of 4D scans captured at 60 fps with synchronized audio. It covers 12 subjects and 480 sequences of roughly 3-4 seconds each, with sentences chosen from an array of standard protocols that maximize phonetic diversity.
42 PAPERS • 1 BENCHMARK
A synthetic dataset of videos of human action sequences with the corresponding optical flow.
3 PAPERS • NO BENCHMARKS YET