no code implementations • RepL4NLP (ACL) 2022 • Adnen Abdessaied, Ekta Sood, Andreas Bulling
We propose the Video Language Co-Attention Network (VLCN) – a novel memory-enhanced model for Video Question Answering (VideoQA).
no code implementations • 25 Oct 2023 • Adnen Abdessaied, Lei Shi, Andreas Bulling
We propose $\mathbb{VD}$-$\mathbb{GR}$ – a novel visual dialog model that combines pre-trained language models (LMs) with graph neural networks (GNNs).
no code implementations • 16 Aug 2023 • Philipp Müller, Michal Balazia, Tobias Baur, Michael Dietz, Alexander Heimerl, Dominik Schiller, Mohammed Guermal, Dominike Thomas, François Brémond, Jan Alexandersson, Elisabeth André, Andreas Bulling
This paper describes the MultiMediate'23 challenge and presents novel sets of annotations for both tasks.
no code implementations • 20 Jun 2023 • Anna Penzkofer, Simon Schaefer, Florian Strohm, Mihai Bâce, Stefan Leutenegger, Andreas Bulling
We show that intentions of human players, i.e., the precursor of goal-oriented decisions, can be robustly predicted from eye gaze even for the long-horizon, sparse-rewards task of Montezuma's Revenge – one of the most challenging RL tasks in the Atari 2600 game suite.
1 code implementation • COLING 2022 • Adnen Abdessaied, Mihai Bâce, Andreas Bulling
We propose Neuro-Symbolic Visual Dialog (NSVD) – the first method to combine deep learning and symbolic program execution for multi-round visually-grounded reasoning.
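The symbolic half of such a neuro-symbolic pipeline can be illustrated with a toy program executor: a (hypothetical) neural parser would map a question to a program such as `[("filter", "color", "red"), ("count",)]`, which is then run step by step over a structured scene representation. The scene format and operation names below are illustrative assumptions for this sketch, not NSVD's actual API.

```python
# Toy symbolic program executor: runs a sequence of (op, *args) steps
# over a list of object dicts. Operation names and the scene schema are
# illustrative assumptions, not taken from the NSVD paper.

def execute(program, scene):
    """Execute a symbolic program over a scene (a list of object dicts)."""
    state = scene
    for op, *args in program:
        if op == "filter":
            # Keep only objects whose attribute matches the given value.
            attr, value = args
            state = [obj for obj in state if obj.get(attr) == value]
        elif op == "count":
            state = len(state)
        elif op == "exist":
            state = len(state) > 0
        else:
            raise ValueError(f"unknown operation: {op}")
    return state

scene = [
    {"shape": "cube", "color": "red"},
    {"shape": "sphere", "color": "red"},
    {"shape": "cube", "color": "blue"},
]
print(execute([("filter", "color", "red"), ("count",)], scene))  # 2
```

Because every intermediate `state` is an explicit list of objects, each reasoning step is inspectable, which is the usual argument for symbolic execution over end-to-end neural reasoning.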
no code implementations • 30 Apr 2022 • Ahmed Abdou, Ekta Sood, Philipp Müller, Andreas Bulling
Emotional expressions are inherently multimodal -- integrating facial behavior, speech, and gaze -- but their automatic recognition is often limited to a single modality, e.g., speech during a phone call.
no code implementations • 4 Dec 2021 • Yao Wang, Mihai Bâce, Andreas Bulling
We propose the Unified Model of Saliency and Scanpaths (UMSS) -- a model that learns to predict visual saliency and scanpaths (i.e., sequences of eye fixations) on information visualisations.
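The relationship between the two prediction targets can be sketched directly: a scanpath is an ordered list of fixation points, and an empirical saliency map can be built by accumulating Gaussian blobs around those fixations. The grid size and blob width below are arbitrary choices for this sketch, not UMSS settings.

```python
# Sketch of the scanpath/saliency relationship: accumulate a Gaussian
# blob per fixation into a density grid. Grid dimensions and sigma are
# illustrative assumptions, not parameters from the UMSS model.
import math

def saliency_from_scanpath(fixations, width=8, height=8, sigma=1.0):
    """Return a height x width grid of fixation density values."""
    grid = [[0.0] * width for _ in range(height)]
    for fx, fy in fixations:
        for y in range(height):
            for x in range(width):
                d2 = (x - fx) ** 2 + (y - fy) ** 2
                grid[y][x] += math.exp(-d2 / (2 * sigma ** 2))
    return grid

scanpath = [(2, 2), (5, 5)]  # one scanpath: an ordered fixation sequence
smap = saliency_from_scanpath(scanpath)
peak = max(max(row) for row in smap)
```

Note the direction of information loss: the saliency map can always be derived from the scanpath as above, but not vice versa, since summing over fixations discards their order.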
no code implementations • 27 Sep 2021 • Ekta Sood, Fabian Kögel, Philipp Müller, Dominike Thomas, Mihai Bâce, Andreas Bulling
We present the Multimodal Human-like Attention Network (MULAN) – the first method for multimodal integration of human-like attention on image and text during training of VQA models.
no code implementations • CoNLL (EMNLP) 2021 • Ekta Sood, Fabian Kögel, Florian Strohm, Prajit Dhar, Andreas Bulling
We present VQA-MHUG - a novel 49-participant dataset of multimodal human gaze on both images and questions during visual question answering (VQA) collected using a high-speed eye tracker.
no code implementations • ICCV 2021 • Florian Strohm, Ekta Sood, Sven Mayer, Philipp Müller, Mihai Bâce, Andreas Bulling
The encoder extracts image features and predicts a neural activation map for each face looked at by a human observer.
no code implementations • NeurIPS 2020 • Ekta Sood, Simon Tannert, Philipp Mueller, Andreas Bulling
A lack of corpora has so far limited advances in integrating human gaze data as a supervisory signal in neural attention mechanisms for natural language processing (NLP).
no code implementations • CoNLL 2020 • Ekta Sood, Simon Tannert, Diego Frassinelli, Andreas Bulling, Ngoc Thang Vu
We compare state-of-the-art networks based on long short-term memory (LSTM), convolutional neural network (CNN), and XLNet Transformer architectures.
no code implementations • 25 Jul 2019 • Mihai Bâce, Sander Staal, Andreas Bulling
With an ever-increasing number of mobile devices competing for our attention, quantifying when, how often, or for how long users visually attend to their devices has emerged as a core challenge in mobile human-computer interaction.
no code implementations • 25 Jul 2019 • Mihai Bâce, Sander Staal, Andreas Bulling
Moreover, we discuss how our method enables the calculation of additional attention metrics that, for the first time, enable researchers from different domains to study and quantify attention allocation during mobile interactions in the wild.
2 code implementations • 12 May 2018 • Seonwook Park, Xucong Zhang, Andreas Bulling, Otmar Hilliges
Conventional feature-based and model-based gaze estimation methods have proven to perform well in settings with controlled illumination and specialized cameras.
1 code implementation • LREC 2018 • Arif Khan, Ingmar Steiner, Yusuke Sugano, Andreas Bulling, Ross Macdonald
Phonetic segmentation is the process of splitting speech into distinct phonetic units.
6 code implementations • 24 Nov 2017 • Xucong Zhang, Yusuke Sugano, Mario Fritz, Andreas Bulling
Second, we present an extensive evaluation of state-of-the-art gaze estimation methods on three current datasets, including MPIIGaze.
no code implementations • 19 Jun 2017 • Hosnieh Sattar, Mario Fritz, Andreas Bulling
Such visual decoding is challenging for two reasons: 1) the search target only resides in the user's mind as a subjective visual pattern, and often cannot even be described verbally by the person, and 2) it is, as yet, unclear whether gaze fixations contain sufficient information for this task at all.
no code implementations • 27 Apr 2017 • Erroll Wood, Tadas Baltrusaitis, Louis-Philippe Morency, Peter Robinson, Andreas Bulling
We present GazeDirector, a new approach for eye gaze redirection that uses model-fitting.
no code implementations • CVPR 2017 • Nour Karessli, Zeynep Akata, Bernt Schiele, Andreas Bulling
Zero-shot image classification using auxiliary information, such as attributes describing discriminative object properties, requires time-consuming annotation by domain experts.
4 code implementations • 27 Nov 2016 • Xucong Zhang, Yusuke Sugano, Mario Fritz, Andreas Bulling
Eye gaze is an important non-verbal cue for human affect analysis.
no code implementations • 27 Nov 2016 • Hosnieh Sattar, Andreas Bulling, Mario Fritz
Predicting the target of visual search from eye fixation (gaze) data is a challenging problem with many applications in human-computer interaction.
no code implementations • 8 Sep 2016 • Sabrina Hoppe, Andreas Bulling
Common computational methods for automated eye movement detection – i.e., the task of detecting different types of eye movement in a continuous stream of gaze data – are limited in that they either involve thresholding on hand-crafted signal features, require a separate detector for each movement type, or require pre-segmented data.
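The thresholding approach mentioned above can be made concrete with a minimal velocity-threshold (I-VT style) sketch: compute sample-to-sample gaze velocity and label everything above a single threshold as a saccade. The sampling rate and threshold values are illustrative assumptions, not parameters from the paper.

```python
# Minimal velocity-threshold eye movement detector (I-VT style).
# Assumptions (illustrative, not from the paper): gaze positions in
# degrees of visual angle, a fixed sampling rate, one velocity threshold.

def detect_movements(gaze, sampling_rate_hz=60.0, saccade_threshold_dps=30.0):
    """Label each gaze sample as 'fixation' or 'saccade'.

    gaze: list of (x, y) positions in degrees of visual angle.
    Velocity is estimated between consecutive samples; the first sample
    reuses the first computed label.
    """
    dt = 1.0 / sampling_rate_hz
    labels = []
    for i in range(1, len(gaze)):
        dx = gaze[i][0] - gaze[i - 1][0]
        dy = gaze[i][1] - gaze[i - 1][1]
        velocity = (dx ** 2 + dy ** 2) ** 0.5 / dt  # degrees per second
        labels.append("saccade" if velocity > saccade_threshold_dps else "fixation")
    return [labels[0]] + labels if labels else []

# Slow drift (fixation) followed by one rapid jump (saccade):
samples = [(0.0, 0.0), (0.05, 0.0), (0.1, 0.0), (5.0, 5.0)]
print(detect_movements(samples))
```

The limitations the abstract lists are visible even in this sketch: the threshold is hand-tuned, only two movement classes are distinguished, and smooth pursuit or noise would require extra detectors or preprocessing.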
no code implementations • 18 Aug 2016 • Yusuke Sugano, Andreas Bulling
Gaze reflects how humans process visual scenes and is therefore increasingly used in computer vision systems.
no code implementations • 16 Feb 2016 • Sreyasi Nag Chowdhury, Mateusz Malinowski, Andreas Bulling, Mario Fritz
We show that our retrieval system can cope with this variability using personalisation through an online learning-based retrieval formulation.
no code implementations • 11 Jan 2016 • Mohsen Mansouryar, Julian Steil, Yusuke Sugano, Andreas Bulling
3D gaze information is important for scene-centric attention analysis but accurate estimation and analysis of 3D gaze in real-world environments remains challenging.
no code implementations • 18 Nov 2015 • Marc Tonsen, Xucong Zhang, Yusuke Sugano, Andreas Bulling
We further study the influence of image resolution, vision aids, as well as recording location (indoor, outdoor) on pupil detection performance.
no code implementations • 21 May 2015 • Iaroslav Shcherbatyi, Andreas Bulling, Mario Fritz
An increasing number of works explore collaborative human-computer systems in which human gaze is used to enhance computer vision systems.
no code implementations • ICCV 2015 • Erroll Wood, Tadas Baltrusaitis, Xucong Zhang, Yusuke Sugano, Peter Robinson, Andreas Bulling
Images of the eye are key in several computer vision problems, such as shape registration and gaze estimation.
6 code implementations • CVPR 2015 • Xucong Zhang, Yusuke Sugano, Mario Fritz, Andreas Bulling
Appearance-based gaze estimation is believed to work well in real-world settings, but existing datasets have been collected under controlled laboratory conditions and methods have not been evaluated across multiple datasets.
no code implementations • CVPR 2015 • Hosnieh Sattar, Sabine Müller, Mario Fritz, Andreas Bulling
Previous work on predicting the target of visual search from human fixations only considered closed-world settings in which training labels are available and predictions are performed for a known set of potential targets.
1 code implementation • 30 Apr 2014 • Moritz Kassner, William Patera, Andreas Bulling
Commercial head-mounted eye trackers provide useful features to customers in industry and research but are expensive and rely on closed source hardware and software.
no code implementations • 6 Mar 2014 • Mark Simkin, Dominique Schroeder, Andreas Bulling, Mario Fritz
We describe Ubic, a framework that allows users to bridge the gap between digital cryptography and the physical world.