Search Results for author: Adnen Abdessaied

Found 7 papers, 1 paper with code

V$^2$Dial: Unification of Video and Visual Dialog via Multimodal Experts

no code implementations · 3 Mar 2025 · Adnen Abdessaied, Anna Rohrbach, Marcus Rohrbach, Andreas Bulling

We present V$^2$Dial - a novel expert-based model specifically geared towards simultaneously handling image and video input data for multimodal conversational tasks.

Contrastive Learning, Text Retrieval, +3

Multi-Modal Video Dialog State Tracking in the Wild

no code implementations · 2 Jul 2024 · Adnen Abdessaied, Lei Shi, Andreas Bulling

Then, it predicts the missing underlying structure of the selected constituents of each modality by learning local latent graphs using a novel multi-modal graph structure learning method.

dialog state tracking, Graph structure learning, +2

Limits of Theory of Mind Modelling in Dialogue-Based Collaborative Plan Acquisition

no code implementations · 21 May 2024 · Matteo Bortoletto, Constantin Ruhdorfer, Adnen Abdessaied, Lei Shi, Andreas Bulling

This finding calls for a deeper understanding of the role of ToM in CPA and beyond, as well as new methods for modelling and evaluating mental states in computational collaborative agents.

Collaborative Plan Acquisition

OLViT: Multi-Modal State Tracking via Attention-Based Embeddings for Video-Grounded Dialog

no code implementations · 20 Feb 2024 · Adnen Abdessaied, Manuel von Hochmeister, Andreas Bulling

OLViT addresses these challenges by maintaining a global dialog state based on the output of an Object State Tracker (OST) and a Language State Tracker (LST): while the OST attends to the most important objects within the video, the LST keeps track of the most important linguistic co-references to previous dialog turns.

Object, Object Tracking, +2

$\mathbb{VD}$-$\mathbb{GR}$: Boosting $\mathbb{V}$isual $\mathbb{D}$ialog with Cascaded Spatial-Temporal Multi-Modal $\mathbb{GR}$aphs

no code implementations · 25 Oct 2023 · Adnen Abdessaied, Lei Shi, Andreas Bulling

We propose $\mathbb{VD}$-$\mathbb{GR}$ - a novel visual dialog model that combines pre-trained language models (LMs) with graph neural networks (GNNs).

Visual Dialog

Neuro-Symbolic Visual Dialog

1 code implementation · COLING 2022 · Adnen Abdessaied, Mihai Bâce, Andreas Bulling

We propose Neuro-Symbolic Visual Dialog (NSVD), the first method to combine deep learning and symbolic program execution for multi-round visually-grounded reasoning.

Question Answering
