Scanpath prediction

11 papers with code • 3 benchmarks • 2 datasets

Learning to predict sequences of human fixations.

Most implemented papers

SaltiNet: Scan-path Prediction on 360 Degree Images using Saliency Volumes

massens/saliency-360salient-2017 11 Jul 2017

The first part of the network consists of a model trained to generate saliency volumes, whose parameters are fit by back-propagation computed from a binary cross entropy (BCE) loss over downsampled versions of the saliency volumes.
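The loss described in this excerpt can be illustrated with a small sketch. It assumes a saliency volume stored as a `(T, H, W)` array of values in [0, 1], average pooling as the downsampling step, and plain NumPy in place of a deep learning framework. The function names `downsample_volume` and `bce_loss` are hypothetical, and none of these choices are taken from the SaltiNet code.

```python
import numpy as np

def downsample_volume(volume, factor):
    """Average-pool a (T, H, W) volume by `factor` along both spatial axes.

    Assumption: average pooling; the excerpt does not say how volumes are downsampled.
    """
    t, h, w = volume.shape
    if h % factor or w % factor:
        raise ValueError("H and W must be divisible by factor")
    blocks = volume.reshape(t, h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(2, 4))

def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross entropy between predicted and target volumes in [0, 1]."""
    pred = np.clip(pred, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred)))

# Toy data: 4 temporal slices of an 8x16 saliency map (illustrative sizes only).
rng = np.random.default_rng(0)
target_full = rng.random((4, 8, 16))
target_small = downsample_volume(target_full, factor=2)

# A constant prediction of 0.5 stands in for the network output.
pred_small = np.full_like(target_small, 0.5)
loss = bce_loss(pred_small, target_small)
print(target_small.shape, loss)
```

In a training loop, the gradient of this loss with respect to the network parameters is what back-propagation would compute.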

Variational Laws of Visual Attention for Dynamic Scenes

dariozanca/eymol NeurIPS 2017

We devise variational laws of the eye-movement that rely on a generalized view of the Least Action Principle in physics.

PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks

imatge-upc/pathgan 3 Sep 2018

We introduce PathGAN, a deep neural network for visual scanpath prediction trained on adversarial examples.

Gravitational Laws of Focus of Attention

dariozanca/G-Eymol IEEE Transactions on Pattern Analysis and Machine Intelligence 2019

The understanding of the mechanisms behind focus of attention in a visual scene is a problem of great interest in visual perception and computer vision.

Predicting Human Scanpaths in Visual Question Answering

chenxy99/Scanpaths CVPR 2021

Conditioned on a task guidance map, the proposed model learns question-specific attention patterns to generate scanpaths.

ScanDMM: A Deep Markov Model of Scanpath Prediction for 360° Images

xiangjiesui/scandmm CVPR 2023

Scanpath prediction for 360° images aims to produce dynamic gaze behaviors based on the human visual perception mechanism.

Unifying Top-down and Bottom-up Scanpath Prediction Using Transformers

cvlab-stonybrook/hat CVPR 2024

Most models of visual attention aim at predicting either top-down or bottom-up control, as studied using different visual search and free-viewing tasks.

Gazeformer: Scalable, Effective and Fast Prediction of Goal-Directed Human Attention

cvlab-stonybrook/gazeformer CVPR 2023

We pose a new task called ZeroGaze, a variant of zero-shot learning in which gaze is predicted for never-before-searched objects, and we develop a novel model, Gazeformer, to solve the ZeroGaze problem.

Pathformer3D: A 3D Scanpath Transformer for 360° Images

lsztzp/pathformer3d 15 Jul 2024

Then, the contextual feature representation and historical fixation information are input into a Transformer decoder to output the current time step's fixation embedding, where the self-attention module is used to imitate the visual working memory mechanism of the human visual system and directly model the time dependencies among the fixations.
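The idea of self-attention over the fixation history can be sketched with single-head scaled dot-product attention. This is a minimal illustration, not the Pathformer3D implementation: the causal mask (each fixation attends only to itself and earlier fixations), the embedding size, the random projection matrices, and the function name `causal_self_attention` are all assumptions.

```python
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head self-attention over a fixation sequence x of shape (n, d).

    Assumption: a causal mask, so step t only sees fixations 0..t.
    Returns the attended embeddings and the (n, n) attention weights.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    n = x.shape[0]
    future = np.triu(np.ones((n, n), dtype=bool), k=1)  # True above the diagonal
    scores = np.where(future, -np.inf, scores)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy input: 5 past fixation embeddings of dimension 8 (illustrative sizes only).
rng = np.random.default_rng(0)
d = 8
fixations = rng.normal(size=(5, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))

out, weights = causal_self_attention(fixations, w_q, w_k, w_v)
print(out.shape)
```

The last row of `out` mixes information from the whole fixation history, which is the role the excerpt assigns to the working-memory-like self-attention module.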