Search Results for author: Leonard Salewski

Found 7 papers, 6 papers with code

Zero-shot audio captioning with audio-language model guidance and audio context keywords

1 code implementation • 14 Nov 2023 • Leonard Salewski, Stefan Fauth, A. Sophia Koepke, Zeynep Akata

In particular, our framework exploits a pre-trained large language model (LLM) for generating the text which is guided by a pre-trained audio-language model to produce captions that describe the audio content.

Ranked #1 on Zero-shot Audio Captioning on Clotho

Descriptive • Image Captioning • +5

Zero-shot Translation of Attention Patterns in VQA Models to Natural Language

1 code implementation • 8 Nov 2023 • Leonard Salewski, A. Sophia Koepke, Hendrik P. A. Lensch, Zeynep Akata

Converting a model's internals to text can yield human-understandable insights about the model.

Image Captioning • Language Modelling • +3

In-Context Impersonation Reveals Large Language Models' Strengths and Biases

1 code implementation • NeurIPS 2023 • Leonard Salewski, Stephan Alaniz, Isabel Rio-Torto, Eric Schulz, Zeynep Akata

These findings demonstrate that LLMs are capable of taking on diverse roles and that this in-context impersonation can be used to uncover their hidden strengths and biases.

Diverse Video Captioning by Adaptive Spatio-temporal Attention

1 code implementation • 19 Aug 2022 • Zohreh Ghaderi, Leonard Salewski, Hendrik P. A. Lensch

To generate proper captions for videos, the inference needs to identify relevant concepts and pay attention to the spatial relationships between them as well as to the temporal development in the clip.

Ranked #7 on Video Captioning on VATEX

Text Generation • Video Captioning

CLEVR-X: A Visual Reasoning Dataset for Natural Language Explanations

1 code implementation • 5 Apr 2022 • Leonard Salewski, A. Sophia Koepke, Hendrik P. A. Lensch, Zeynep Akata

We present baseline results for generating natural language explanations in the context of VQA using two state-of-the-art frameworks on the CLEVR-X dataset.

Ranked #1 on Explanation Generation on CLEVR-X

Explanation Generation • Question Answering • +3

e-ViL: A Dataset and Benchmark for Natural Language Explanations in Vision-Language Tasks

2 code implementations • ICCV 2021 • Maxime Kayser, Oana-Maria Camburu, Leonard Salewski, Cornelius Emde, Virginie Do, Zeynep Akata, Thomas Lukasiewicz

e-ViL is a benchmark for explainable vision-language tasks that establishes a unified evaluation framework and provides the first comprehensive comparison of existing approaches that generate NLEs for VL tasks.

Language Modelling • Text Generation

Relational Generalized Few-Shot Learning

no code implementations • 22 Jul 2019 • Xiahan Shi, Leonard Salewski, Martin Schiegg, Zeynep Akata, Max Welling

Instead, we consider the extended setup of generalized few-shot learning (GFSL), where the model is required to perform classification on the joint label space consisting of both previously seen and novel classes.

Few-Shot Learning • Generalized Few-Shot Learning
