1 code implementation • 1 Jun 2023 • Hossein Adeli, Seoyoung Ahn, Nikolaus Kriegeskorte, Gregory Zelinsky
We found that our models of affinity spread that were built on feature maps from the self-supervised Transformers showed significant improvement over baseline and CNN-based models on predicting reaction time patterns of humans, despite not being trained on the task or with any other object labels.
1 code implementation • CVPR 2023 • Sounak Mondal, Zhibo Yang, Seoyoung Ahn, Dimitris Samaras, Gregory Zelinsky, Minh Hoai
In response, we pose a new task called ZeroGaze, a new variant of zero-shot learning where gaze is predicted for never-before-searched objects, and we develop a novel model, Gazeformer, to solve the ZeroGaze problem.
1 code implementation • 16 Mar 2023 • Zhibo Yang, Sounak Mondal, Seoyoung Ahn, Ruoyu Xue, Gregory Zelinsky, Minh Hoai, Dimitris Samaras
Most models of visual attention aim at predicting either top-down or bottom-up control, as studied using different visual search and free-viewing tasks.
1 code implementation • 4 Jul 2022 • Zhibo Yang, Sounak Mondal, Seoyoung Ahn, Gregory Zelinsky, Minh Hoai, Dimitris Samaras
In this paper, we propose the first data-driven computational model that addresses the search-termination problem and predicts the scanpath of search fixations made by people searching for targets that do not appear in images.
1 code implementation • 11 Oct 2021 • Hossein Adeli, Seoyoung Ahn, Gregory Zelinsky
The visual system processes a scene using a sequence of selective glimpses, each driven by spatial and object-based attention.
no code implementations • 14 Sep 2020 • Raji Annadi, Yupei Chen, Viresh Ranjan, Dimitris Samaras, Gregory Zelinsky, Minh Hoai
Analyzing the collected gaze behavior of ten human participants on thirty crowd images, we observe some common approaches for visual counting.
2 code implementations • CVPR 2020 • Zhibo Yang, Lihan Huang, Yupei Chen, Zijun Wei, Seoyoung Ahn, Gregory Zelinsky, Dimitris Samaras, Minh Hoai
These maps were learned by inverse reinforcement learning (IRL) and then used to predict behavioral scanpaths for multiple target categories.
no code implementations • 25 Sep 2019 • Seoyoung Ahn, Gregory Zelinsky, Gary Lupyan
We investigated the changes in visual representations learnt by CNNs when using different linguistic labels (e.g., trained with basic-level labels only, superordinate-level only, or both at the same time) and how they compare to the behavior of humans asked to select which of three images is most different.
no code implementations • 23 Nov 2018 • Hossein Adeli, Gregory Zelinsky
Here we extend this work by building a more brain-inspired deep network model of the primate ATTention Network (ATTNet) that learns to shift its attention so as to maximize the reward.
no code implementations • 10 Dec 2016 • Hieu Le, Chen-Ping Yu, Gregory Zelinsky, Dimitris Samaras
Co-localization is the problem of localizing objects of the same class using only the set of images that contain them.
Ranked #1 on Object Localization on PASCAL VOC 2012
no code implementations • ICCV 2015 • Chen-Ping Yu, Hieu Le, Gregory Zelinsky, Dimitris Samaras
Video segmentation is the task of grouping similar pixels in the spatio-temporal domain, and has become an important preprocessing step for subsequent video analysis.