…The segments are of varying length, between 3 and 10 seconds long, and in each clip the only visible face in the video and audible sound in the soundtrack belong to a single speaking person. In total, the dataset contains roughly 4700 hours of video segments with approximately 150,000 distinct speakers, spanning a wide variety of people, languages and face poses.
35 PAPERS • NO BENCHMARKS YET
A high-resolution version of VGGFace2 intended for academic face-editing work. The project uses GFPGAN for image restoration and insightface for data preprocessing (cropping and alignment).
1 PAPER • NO BENCHMARKS YET
…(1) wikiann · Datasets at Hugging Face. https://huggingface.co/datasets/wikiann. (2) wikiann | TensorFlow Datasets. https://tensorflow.google.cn/datasets/catalog/wikiann. (3) wikiann · Datasets at Hugging Face. https://huggingface.co/datasets/wikiann/viewer/en. (4) WikiAnn Dataset | Papers With Code. https://paperswithcode.com/dataset/wikiann-1.
58 PAPERS • 3 BENCHMARKS
…The real images of complex scenes comprise 8 forward-facing scenes captured with a cellphone at a resolution of 1008×756 pixels.
2,669 PAPERS • 1 BENCHMARK
…Sparsity: Polls are sparsely distributed on Weibo; fewer than 0.1% of the randomly gathered posts contained polls.
3 PAPERS • 3 BENCHMARKS