no code implementations • RepL4NLP (ACL) 2022 • Adnen Abdessaied, Ekta Sood, Andreas Bulling
We propose the Video Language Co-Attention Network (VLCN) – a novel memory-enhanced model for Video Question Answering (VideoQA).
no code implementations • 9 Mar 2025 • Lei Shi, Andreas Bulling
By evaluating ablated versions of our method, we further show that the proposed integration of the action and observation representations learnt in the VAE latent space is key to these performance improvements.
no code implementations • 9 Dec 2024 • Florian Strohm, Mihai Bâce, Andreas Bulling
We evaluate HAIFAI and HAIFAI-X in a 12-participant user study and show that HAIFAI outperforms the previous state of the art regarding reconstruction quality, usability, perceived workload, and reconstruction speed.
no code implementations • 21 Oct 2024 • Zhiming Hu, Guanhua Zhang, Zheming Yin, Daniel Haeufle, Syn Schmitt, Andreas Bulling
Human hand and head movements are the most pervasive input modalities in extended reality (XR) and are significant for a wide range of applications.
no code implementations • 23 Aug 2024 • Daniel Habermann, Marvin Schmitt, Lars Kühmichel, Andreas Bulling, Stefan T. Radev, Paul-Christian Bürkner
Multilevel models (MLMs) are a central building block of the Bayesian workflow.
no code implementations • 9 Jul 2024 • Matteo Bortoletto, Constantin Ruhdorfer, Lei Shi, Andreas Bulling
We propose MToMnet - a Theory of Mind (ToM) neural network for predicting beliefs and their dynamics during human social interactions from multimodal input.
no code implementations • 2 Jul 2024 • Adnen Abdessaied, Lei Shi, Andreas Bulling
Then, it predicts the missing underlying structure of the selected constituents of each modality by learning local latent graphs using a novel multi-modal graph structure learning method.
no code implementations • 2 Jul 2024 • Zhiming Hu, Zheming Yin, Daniel Haeufle, Syn Schmitt, Andreas Bulling
We present HOIMotion - a novel approach for human motion forecasting during human-object interactions that integrates information about past body poses and egocentric 3D object bounding boxes.
no code implementations • 25 Jun 2024 • Constantin Ruhdorfer, Matteo Bortoletto, Anna Penzkofer, Andreas Bulling
We introduce the Overcooked Generalisation Challenge (OGC) - the first benchmark to study agents' zero-shot cooperation abilities when faced with novel partners and levels in the Overcooked-AI environment.
no code implementations • 25 Jun 2024 • Matteo Bortoletto, Constantin Ruhdorfer, Lei Shi, Andreas Bulling
We are the first to study how prompt variations impact probing performance on theory of mind tasks.
no code implementations • 21 May 2024 • Matteo Bortoletto, Constantin Ruhdorfer, Adnen Abdessaied, Lei Shi, Andreas Bulling
This finding calls for a deeper understanding of the role of ToM in CPA and beyond, as well as new methods for modelling and evaluating mental states in computational collaborative agents.
no code implementations • 6 May 2024 • Anna Penzkofer, Lei Shi, Andreas Bulling
Our method is based on the Semantic Pointer Architecture (SPA) to encode objects in a hyperdimensional vector space.
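Object encoding in the Semantic Pointer Architecture is typically realised through circular convolution for binding and vector addition for bundling. A minimal sketch of this idea (not the authors' implementation; the vector names and the 512-dimensional space are illustrative):

```python
import numpy as np

def circular_convolve(a, b):
    # Binding in the SPA is commonly implemented as circular
    # convolution, computed efficiently via the FFT.
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def random_pointer(dim, rng):
    # Unit-length random vector serving as a semantic pointer.
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
dim = 512  # hyperdimensional spaces typically use hundreds of dimensions
obj, colour, red = (random_pointer(dim, rng) for _ in range(3))

# Bind a filler to a role, then bundle with the object identity vector.
encoded = circular_convolve(colour, red) + obj

# Unbinding with the approximate inverse (time reversal) of `colour`
# recovers a noisy version of `red`.
inv_colour = np.concatenate(([colour[0]], colour[1:][::-1]))
decoded = circular_convolve(encoded, inv_colour)
similarity = float(decoded @ red)  # high value -> successful recovery
```

Because binding is invertible only approximately, the recovered vector is noisy but remains far closer to `red` than to unrelated pointers, which is what makes such encodings queryable.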
no code implementations • 26 Mar 2024 • Chuhan Jiao, Yao Wang, Guanhua Zhang, Mihai Bâce, Zhiming Hu, Andreas Bulling
We present DiffGaze, a novel method for generating realistic and diverse continuous human gaze sequences on 360° images based on a conditional score-based denoising diffusion model.
no code implementations • 20 Mar 2024 • Florian Strohm, Mihai Bâce, Andreas Bulling
We present User-predictable Face Editing (UP-FacE) -- a novel method for predictable face shape editing.
no code implementations • 20 Mar 2024 • Florian Strohm, Mihai Bâce, Andreas Bulling
At the core of our method is a Siamese convolutional neural encoder that learns the user embeddings by contrasting the image and personal saliency map pairs of different users.
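The contrastive scheme described above can be illustrated with a toy Siamese setup; the weights, feature dimensions, and margin-based loss below are assumptions for illustration, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((8, 32)) * 0.1  # shared projection weights

def encode(image_feat, saliency_feat):
    # The same weights W process every (image, personal saliency map)
    # pair; sharing weights across branches is what makes it Siamese.
    x = np.concatenate([image_feat, saliency_feat])
    return np.tanh(W @ x)

def contrastive_loss(z_a, z_b, same_user, margin=1.0):
    # Same-user pairs are pulled together; different-user pairs are
    # pushed at least `margin` apart.
    d = np.linalg.norm(z_a - z_b)
    return d ** 2 if same_user else max(0.0, margin - d) ** 2

img, sal = rng.standard_normal(16), rng.standard_normal(16)
z1, z2 = encode(img, sal), encode(img, sal)
loss_same = contrastive_loss(z1, z2, same_user=True)  # identical pair

img2, sal2 = rng.standard_normal(16), rng.standard_normal(16)
loss_diff = contrastive_loss(z1, encode(img2, sal2), same_user=False)
```

Training on such pairs drives embeddings of the same user's data together while separating different users, yielding user embeddings usable for personalisation.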
no code implementations • 14 Mar 2024 • Zhiming Hu, Syn Schmitt, Daniel Haeufle, Andreas Bulling
We present GazeMotion, a novel method for human motion forecasting that combines information on past human poses with human eye gaze.
no code implementations • 13 Mar 2024 • Lei Shi, Paul Bürkner, Andreas Bulling
We show that by adding action embeddings into the noise mask, the diffusion model can better learn action temporal dependencies and improve performance on procedure planning.
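A toy forward-diffusion step in the spirit of this idea, with action embeddings mixed into the applied noise; the noise schedule, dimensions, and embedding table are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
T, horizon, dim = 10, 4, 16          # diffusion steps, plan length, feature dim
betas = np.linspace(1e-4, 0.02, T)   # standard linear noise schedule
alphas_bar = np.cumprod(1.0 - betas)

# Toy embedding table for 5 discrete actions.
action_table = rng.standard_normal((5, dim)) * 0.1

def noised_plan(x0, actions, t):
    # Forward-diffusion step in which the Gaussian noise applied to the
    # plan is biased by each step's action embedding, exposing action
    # identity to the model at every noise level.
    noise = rng.standard_normal(x0.shape) + action_table[actions]
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * noise

x0 = rng.standard_normal((horizon, dim))  # clean plan representation
actions = np.array([0, 2, 2, 4])          # one action id per plan step
x_t = noised_plan(x0, actions, t=5)
```

A denoiser trained on such samples sees action identity correlated with the corruption itself, which is one plausible route to the temporal dependencies mentioned above.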
no code implementations • 29 Feb 2024 • Mayar Elfares, Pascal Reisert, Zhiming Hu, Wenwu Tang, Ralf Küsters, Andreas Bulling
Latest gaze estimation methods require large-scale training data but their collection and exchange pose significant privacy risks.
no code implementations • 20 Feb 2024 • Adnen Abdessaied, Manuel von Hochmeister, Andreas Bulling
OLViT addresses these challenges by maintaining a global dialog state based on the output of an Object State Tracker (OST) and a Language State Tracker (LST): while the OST attends to the most important objects within the video, the LST keeps track of the most important linguistic co-references to previous dialog turns.
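The interplay of the two trackers can be caricatured as a pair of per-turn state updates; the class, slot names, and update rules below are illustrative, not OLViT's actual interface:

```python
class DialogState:
    """Toy sketch of a dual-state dialog tracker: one slot for the
    currently attended video objects (OST-like) and one for linguistic
    co-references to earlier turns (LST-like)."""

    def __init__(self, k=3):
        self.k = k
        self.object_state = []    # top-k most attended object ids
        self.language_state = {}  # pronoun -> resolved referent

    def update(self, object_scores, coreferences):
        # Keep only the k most attended objects for the next turn.
        ranked = sorted(object_scores, key=object_scores.get, reverse=True)
        self.object_state = ranked[: self.k]
        # Fold newly resolved co-references into the language state.
        self.language_state.update(coreferences)

state = DialogState(k=2)
state.update({"cat": 0.9, "sofa": 0.4, "lamp": 0.1}, {"it": "cat"})
```

Carrying both states across turns is what lets later questions like "what is it doing?" be grounded without re-scanning the whole dialog history.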
no code implementations • 19 Dec 2023 • Susanne Hindennach, Lei Shi, Filip Miletić, Andreas Bulling
When users perceive AI systems as mindful, independent agents, they hold them responsible instead of the AI experts who created and designed these systems.
no code implementations • 19 Dec 2023 • Zhiming Hu, Jiahui Xu, Syn Schmitt, Andreas Bulling
We compare our method with state-of-the-art methods that predict eye gaze only from head movements and show that Pose2Gaze outperforms these baselines with an average improvement of 24.0% on MoGaze, 10.1% on ADT, 21.3% on GIMO, and 28.6% on EgoBody in mean angular error.
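Mean angular error, the metric reported above, measures the angle between predicted and ground-truth 3D gaze directions, averaged over a sequence; a minimal sketch with made-up example vectors:

```python
import numpy as np

def mean_angular_error(pred, gt):
    # Normalise both sets of 3D gaze directions, then average the
    # angle (in degrees) between each predicted/ground-truth pair.
    pred = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    gt = gt / np.linalg.norm(gt, axis=1, keepdims=True)
    cos = np.clip(np.sum(pred * gt, axis=1), -1.0, 1.0)
    return np.degrees(np.arccos(cos)).mean()

pred = np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
gt = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
err = mean_angular_error(pred, gt)  # (0 deg + 45 deg) / 2 = 22.5 deg
```

Clipping the cosine guards against floating-point values marginally outside [-1, 1], which would otherwise make `arccos` return NaN.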
no code implementations • 19 Dec 2023 • Haodong Yan, Zhiming Hu, Syn Schmitt, Andreas Bulling
Human motion prediction is important for many virtual and augmented reality (VR/AR) applications such as collision avoidance and realistic avatar generation.
no code implementations • 12 Dec 2023 • Matteo Bortoletto, Lei Shi, Andreas Bulling
We propose the Intuitive Reasoning Network (IRENE) - a novel neural model for intuitive psychological reasoning about agents' goals, preferences, and actions that can generalise previous experiences to new situations.
no code implementations • 25 Oct 2023 • Adnen Abdessaied, Lei Shi, Andreas Bulling
We propose VD-GR - a novel visual dialog model that combines pre-trained language models (LMs) with graph neural networks (GNNs).
no code implementations • 16 Aug 2023 • Philipp Müller, Michal Balazia, Tobias Baur, Michael Dietz, Alexander Heimerl, Dominik Schiller, Mohammed Guermal, Dominike Thomas, François Brémond, Jan Alexandersson, Elisabeth André, Andreas Bulling
This paper describes the MultiMediate'23 challenge and presents novel sets of annotations for both tasks.
no code implementations • 20 Jun 2023 • Anna Penzkofer, Simon Schaefer, Florian Strohm, Mihai Bâce, Stefan Leutenegger, Andreas Bulling
We show that intentions of human players, i.e. the precursor of goal-oriented decisions, can be robustly predicted from eye gaze even for the long-horizon sparse rewards task of Montezuma's Revenge - one of the most challenging RL tasks in the Atari 2600 game suite.
1 code implementation • COLING 2022 • Adnen Abdessaied, Mihai Bâce, Andreas Bulling
We propose Neuro-Symbolic Visual Dialog (NSVD) - the first method to combine deep learning and symbolic program execution for multi-round visually-grounded reasoning.
no code implementations • 30 Apr 2022 • Ahmed Abdou, Ekta Sood, Philipp Müller, Andreas Bulling
Emotional expressions are inherently multimodal -- integrating facial behavior, speech, and gaze -- but their automatic recognition is often limited to a single modality, e.g. speech during a phone call.
no code implementations • 4 Dec 2021 • Yao Wang, Mihai Bâce, Andreas Bulling
We propose Unified Model of Saliency and Scanpaths (UMSS) -- a model that learns to predict visual saliency and scanpaths (i.e. sequences of eye fixations) on information visualisations.
no code implementations • 27 Sep 2021 • Ekta Sood, Fabian Kögel, Philipp Müller, Dominike Thomas, Mihai Bâce, Andreas Bulling
We present the Multimodal Human-like Attention Network (MULAN) - the first method for multimodal integration of human-like attention on image and text during training of VQA models.
no code implementations • CoNLL (EMNLP) 2021 • Ekta Sood, Fabian Kögel, Florian Strohm, Prajit Dhar, Andreas Bulling
We present VQA-MHUG - a novel 49-participant dataset of multimodal human gaze on both images and questions during visual question answering (VQA) collected using a high-speed eye tracker.
no code implementations • ICCV 2021 • Florian Strohm, Ekta Sood, Sven Mayer, Philipp Müller, Mihai Bâce, Andreas Bulling
The encoder extracts image features and predicts a neural activation map for each face looked at by a human observer.
no code implementations • NeurIPS 2020 • Ekta Sood, Simon Tannert, Philipp Müller, Andreas Bulling
A lack of corpora has so far limited advances in integrating human gaze data as a supervisory signal in neural attention mechanisms for natural language processing (NLP).
no code implementations • CoNLL 2020 • Ekta Sood, Simon Tannert, Diego Frassinelli, Andreas Bulling, Ngoc Thang Vu
We compare state-of-the-art networks based on long short-term memory (LSTM), convolutional neural models (CNN), and XLNet Transformer architectures.
no code implementations • 25 Jul 2019 • Mihai Bâce, Sander Staal, Andreas Bulling
Moreover, we discuss how our method enables the calculation of additional attention metrics that, for the first time, enable researchers from different domains to study and quantify attention allocation during mobile interactions in the wild.
no code implementations • 25 Jul 2019 • Mihai Bâce, Sander Staal, Andreas Bulling
With an ever-increasing number of mobile devices competing for our attention, quantifying when, how often, or for how long users visually attend to their devices has emerged as a core challenge in mobile human-computer interaction.
2 code implementations • 12 May 2018 • Seonwook Park, Xucong Zhang, Andreas Bulling, Otmar Hilliges
Conventional feature-based and model-based gaze estimation methods have proven to perform well in settings with controlled illumination and specialized cameras.
1 code implementation • LREC 2018 • Arif Khan, Ingmar Steiner, Yusuke Sugano, Andreas Bulling, Ross Macdonald
Phonetic segmentation is the process of splitting speech into distinct phonetic units.
6 code implementations • 24 Nov 2017 • Xucong Zhang, Yusuke Sugano, Mario Fritz, Andreas Bulling
Second, we present an extensive evaluation of state-of-the-art gaze estimation methods on three current datasets, including MPIIGaze.
no code implementations • 19 Jun 2017 • Hosnieh Sattar, Mario Fritz, Andreas Bulling
Such visual decoding is challenging for two reasons: 1) the search target only resides in the user's mind as a subjective visual pattern, and can most often not even be described verbally by the person, and 2) it is, as of yet, unclear if gaze fixations contain sufficient information for this task at all.
no code implementations • 27 Apr 2017 • Erroll Wood, Tadas Baltrusaitis, Louis-Philippe Morency, Peter Robinson, Andreas Bulling
We present GazeDirector, a new approach for eye gaze redirection that uses model-fitting.
no code implementations • CVPR 2017 • Nour Karessli, Zeynep Akata, Bernt Schiele, Andreas Bulling
Zero-shot image classification using auxiliary information, such as attributes describing discriminative object properties, requires time-consuming annotation by domain experts.
no code implementations • 27 Nov 2016 • Hosnieh Sattar, Andreas Bulling, Mario Fritz
Predicting the target of visual search from eye fixation (gaze) data is a challenging problem with many applications in human-computer interaction.
4 code implementations • 27 Nov 2016 • Xucong Zhang, Yusuke Sugano, Mario Fritz, Andreas Bulling
Eye gaze is an important non-verbal cue for human affect analysis.
no code implementations • 8 Sep 2016 • Sabrina Hoppe, Andreas Bulling
Common computational methods for automated eye movement detection - i.e. the task of detecting different types of eye movement in a continuous stream of gaze data - are limited in that they either involve thresholding on hand-crafted signal features, require individual detectors each only detecting a single movement, or require pre-segmented data.
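The thresholding approaches referred to here are typified by the velocity-threshold algorithm (I-VT); a minimal sketch, with the sampling rate, threshold, and units chosen purely for illustration:

```python
import numpy as np

def ivt_classify(gaze, threshold=50.0, fs=60.0):
    # Minimal I-VT: label each inter-sample transition 'saccade' if the
    # point-to-point gaze velocity (units/s) exceeds the threshold,
    # otherwise 'fixation'.
    velocity = np.linalg.norm(np.diff(gaze, axis=0), axis=1) * fs
    return np.where(velocity > threshold, "saccade", "fixation")

# A still eye followed by a rapid jump:
gaze = np.array([[0.0, 0.0], [0.01, 0.0], [0.02, 0.0], [5.0, 0.0]])
labels = ivt_classify(gaze)
```

The hand-crafted threshold is exactly the limitation the abstract points at: a single fixed cutoff cannot separate fixations, saccades, and smooth pursuit in one pass.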
no code implementations • 18 Aug 2016 • Yusuke Sugano, Andreas Bulling
Gaze reflects how humans process visual scenes and is therefore increasingly used in computer vision systems.
no code implementations • 16 Feb 2016 • Sreyasi Nag Chowdhury, Mateusz Malinowski, Andreas Bulling, Mario Fritz
We show that our retrieval system can cope with this variability using personalisation through an online learning-based retrieval formulation.
no code implementations • 11 Jan 2016 • Mohsen Mansouryar, Julian Steil, Yusuke Sugano, Andreas Bulling
3D gaze information is important for scene-centric attention analysis but accurate estimation and analysis of 3D gaze in real-world environments remains challenging.
no code implementations • 18 Nov 2015 • Marc Tonsen, Xucong Zhang, Yusuke Sugano, Andreas Bulling
We further study the influence of image resolution, vision aids, as well as recording location (indoor, outdoor) on pupil detection performance.
no code implementations • ICCV 2015 • Erroll Wood, Tadas Baltrusaitis, Xucong Zhang, Yusuke Sugano, Peter Robinson, Andreas Bulling
Images of the eye are key in several computer vision problems, such as shape registration and gaze estimation.
no code implementations • 21 May 2015 • Iaroslav Shcherbatyi, Andreas Bulling, Mario Fritz
An increasing number of works explore collaborative human-computer systems in which human gaze is used to enhance computer vision systems.
6 code implementations • CVPR 2015 • Xucong Zhang, Yusuke Sugano, Mario Fritz, Andreas Bulling
Appearance-based gaze estimation is believed to work well in real-world settings, but existing datasets have been collected under controlled laboratory conditions and methods have not been evaluated across multiple datasets.
no code implementations • CVPR 2015 • Hosnieh Sattar, Sabine Müller, Mario Fritz, Andreas Bulling
Previous work on predicting the target of visual search from human fixations only considered closed-world settings in which training labels are available and predictions are performed for a known set of potential targets.
1 code implementation • 30 Apr 2014 • Moritz Kassner, William Patera, Andreas Bulling
Commercial head-mounted eye trackers provide useful features to customers in industry and research but are expensive and rely on closed source hardware and software.
no code implementations • 6 Mar 2014 • Mark Simkin, Dominique Schroeder, Andreas Bulling, Mario Fritz
We describe Ubic, a framework that allows users to bridge the gap between digital cryptography and the physical world.