Saliency Prediction
85 papers with code • 3 benchmarks • 7 datasets
A saliency map predicts where human eyes fixate in a visual scene. Saliency prediction is informed by the human visual attention mechanism and estimates the probability that a viewer's gaze will land at each location in the scene.
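As a concrete illustration of the ground truth such models are trained against, recorded eye fixations are commonly turned into a continuous saliency map by smoothing the discrete fixation points with a Gaussian. The sketch below is a minimal, self-contained version of that step; the function name, the default `sigma`, and the max-normalisation are illustrative choices, not a specific paper's method.

```python
import numpy as np

def fixations_to_saliency(fixations, height, width, sigma=15.0):
    """Turn discrete (row, col) eye-fixation points into a continuous
    saliency map by placing a truncated 2D Gaussian at each fixation.
    Values are normalised to [0, 1]."""
    radius = int(3 * sigma)  # truncate the kernel at 3 sigma
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))

    saliency = np.zeros((height, width))
    for r, c in fixations:
        # Clip the kernel to the image bounds and accumulate it.
        r0, r1 = max(0, r - radius), min(height, r + radius + 1)
        c0, c1 = max(0, c - radius), min(width, c + radius + 1)
        kr0, kc0 = r0 - (r - radius), c0 - (c - radius)
        saliency[r0:r1, c0:c1] += kernel[kr0:kr0 + (r1 - r0),
                                         kc0:kc0 + (c1 - c0)]

    if saliency.max() > 0:
        saliency /= saliency.max()  # rescale so the peak is 1.0
    return saliency
```

A predicted map can then be scored against such a ground-truth map with standard saliency metrics (e.g. AUC, NSS, or correlation coefficient).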
Latest papers with no code
DiffGaze: A Diffusion Model for Continuous Gaze Sequence Generation on 360° Images
We present DiffGaze, a novel method for generating realistic and diverse continuous human gaze sequences on 360° images based on a conditional score-based denoising diffusion model.
Learning User Embeddings from Human Gaze for Personalised Saliency Prediction
At the core of our method is a Siamese convolutional neural encoder that learns the user embeddings by contrasting the image and personal saliency map pairs of different users.
A Modified Word Saliency-Based Adversarial Attack on Text Classification Models
This paper introduces a novel adversarial attack method targeting text classification models, termed the Modified Word Saliency-based Adversarial Attack (MWSAA).
DiffSal: Joint Audio and Video Learning for Diffusion Saliency Prediction
Audio-visual saliency prediction can draw support from diverse modality complements, but further performance enhancement is still challenged by customized architectures as well as task-specific loss functions.
Transformer-based Video Saliency Prediction with High Temporal Dimension Decoding
In recent years, finding an effective and efficient strategy for exploiting spatial and temporal information has been a hot research topic in video saliency prediction (VSP).
Learning Saliency From Fixations
We present a novel approach for saliency prediction in images, leveraging parallel decoding in transformers to learn saliency solely from fixation maps.
Audio-visual Saliency for Omnidirectional Videos
Visual saliency prediction for omnidirectional videos (ODVs) is of great significance for ODV coding, transmission, rendering, and related applications.
XAI-CLASS: Explanation-Enhanced Text Classification with Extremely Weak Supervision
However, these methods ignore the importance of incorporating the explanations of the generated pseudo-labels, or saliency of individual words, as additional guidance during the text classification training process.
UniST: Towards Unifying Saliency Transformer for Video Saliency Prediction and Detection
While many approaches have crafted task-specific training paradigms for either video saliency prediction or video salient object detection, little attention has been devoted to devising a generalized saliency modeling framework that seamlessly bridges these two distinct tasks.
Attention for Robot Touch: Tactile Saliency Prediction for Robust Sim-to-Real Tactile Control
To improve the robustness of tactile robot control in unstructured environments, we propose and study a new concept: tactile saliency for robot touch, inspired by the human touch attention mechanism from neuroscience and the visual saliency prediction problem from computer vision.