1 code implementation • 26 Sep 2023 • Thomas Hummel, Otniel-Bogdan Mercea, A. Sophia Koepke, Zeynep Akata
We propose a framework for video-to-adverb retrieval (and vice versa) that aligns video embeddings with their matching compositional adverb-action text embedding in a joint embedding space.
1 code implementation • 7 Sep 2023 • Otniel-Bogdan Mercea, Thomas Hummel, A. Sophia Koepke, Zeynep Akata
Training deep learning models for video classification from audio-visual data commonly requires immense amounts of labeled training data collected via a costly process.
no code implementations • 6 Sep 2022 • Stephan Alaniz, Thomas Hummel, Zeynep Akata
Semantic image synthesis enables control over unconditional image generation by allowing guidance on what is being generated.
2 code implementations • 20 Jul 2022 • Otniel-Bogdan Mercea, Thomas Hummel, A. Sophia Koepke, Zeynep Akata
We show that our proposed framework, which ingests temporal features, yields state-of-the-art performance on the UCF, VGGSound, and ActivityNet benchmarks for (generalised) zero-shot learning.
Ranked #2 on GZSL Video Classification on UCF-GZSL(main)
no code implementations • 4 May 2021 • Yanbei Chen, Thomas Hummel, A. Sophia Koepke, Zeynep Akata
Recent advances in XAI provide explanations for models trained on still images.
Explainable Artificial Intelligence (XAI)
1 code implementation • 24 Jun 2020 • Stefan Heinrich, Yuan Yao, Tobias Hinz, Zhiyuan Liu, Thomas Hummel, Matthias Kerzel, Cornelius Weber, Stefan Wermter
From a neuroscientific perspective, natural language is embodied, grounded in most, if not all, sensory and sensorimotor modalities, and acquired by means of crossmodal integration.