DHF1K is a video saliency dataset. It provides two kinds of ground-truth maps: a binary map of pixel-wise gaze fixation points, and a continuous map obtained by blurring those fixation points with a Gaussian filter. DHF1K contains 1,000 videos in total. 700 of the videos are annotated, of which 600 are used for training and 100 for validation. The remaining 300 form the test set, which is evaluated on a public server.
19 PAPERS • 1 BENCHMARK
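The relation between the two DHF1K ground-truth maps (binary fixations and their Gaussian-blurred version) can be sketched as follows. This is a minimal illustration, not the dataset authors' code: the function names, the `sigma` value, the toy frame size, and the peak normalization are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel_1d(sigma, truncate=3.0):
    """Normalized 1-D Gaussian kernel, cut off at `truncate` standard deviations."""
    radius = int(truncate * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()

def blur_fixations(fixation_map, sigma):
    """Turn a binary fixation map into a continuous map by Gaussian filtering.

    The 2-D Gaussian is separable, so we convolve rows and then columns.
    Assumes the kernel is shorter than both image dimensions.
    """
    kernel = gaussian_kernel_1d(sigma)
    m = fixation_map.astype(np.float64)
    m = np.apply_along_axis(lambda row: np.convolve(row, kernel, mode="same"), 1, m)
    m = np.apply_along_axis(lambda col: np.convolve(col, kernel, mode="same"), 0, m)
    peak = m.max()
    # Scaling the peak to 1 is an assumption; other normalizations are possible.
    return m / peak if peak > 0 else m

# Toy frame with two fixation points (hypothetical data, not from DHF1K).
fix = np.zeros((64, 96), dtype=np.uint8)
fix[20, 30] = 1
fix[40, 70] = 1
sal = blur_fixations(fix, sigma=5.0)
```

The separable form keeps the example dependency-free; a library Gaussian filter would serve the same purpose.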
The dataset provides an open set of high-resolution test clips with different types of content: movie fragments, sports streams, and live caption clips. The clips have a resolution of 1920×1080 and a duration of 13 to 38 seconds. Data were collected from 50 observers (19–24 years old) using a 500 Hz SMI iViewX™ Hi-Speed 1250 eye tracker. A cross-fade was also used to ensure that the fixations recorded for different clips are independent. The final ground-truth saliency map was estimated as a Gaussian mixture with centers at the fixation points. The standard deviation of the Gaussians was set to 120 (this value corresponds to 8 angular degrees, which is known to be the sector of sharp vision).
14 PAPERS • 1 BENCHMARK
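The Gaussian-mixture ground truth described for this dataset can be written as an explicit sum of Gaussians, one per fixation point. The sketch below is illustrative only: it assumes the standard deviation of 120 is given in pixels, that all fixations carry equal weight, and that the map is normalized to sum to 1. The function name and the sample fixation coordinates are hypothetical.

```python
import numpy as np

def saliency_from_fixations(fixations, width=1920, height=1080, sigma=120.0):
    """Sum of isotropic Gaussians centered at (x, y) fixation points.

    Assumptions: sigma is in pixels, every fixation has equal weight,
    and the result is normalized so that it sums to 1.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    sal = np.zeros((height, width), dtype=np.float64)
    for fx, fy in fixations:
        sal += np.exp(-((xs - fx) ** 2 + (ys - fy) ** 2) / (2.0 * sigma ** 2))
    total = sal.sum()
    return sal / total if total > 0 else sal

# Hypothetical fixation points in (x, y) pixel coordinates.
points = [(960, 540), (400, 300)]
sal_map = saliency_from_fixations(points)
```

With many fixations per frame, blurring a fixation-count image with a Gaussian filter yields the same map more cheaply than looping over points.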