Search Results for author: Simon Ging

Found 3 papers, 3 papers with code

Open-ended VQA benchmarking of Vision-Language models by exploiting Classification datasets and their semantic hierarchy

1 code implementation • 11 Feb 2024 • Simon Ging, María A. Bravo, Thomas Brox

The evaluation of text-generative vision-language models is a challenging yet crucial endeavor.

Open-vocabulary Attribute Detection

1 code implementation • CVPR 2023 • María A. Bravo, Sudhanshu Mittal, Simon Ging, Thomas Brox

The objective of the novel task and benchmark is to probe object-level attribute information learned by vision-language models.

Ranked #2 on Open Vocabulary Attribute Detection on OVAD benchmark

Attribute, Language Modelling, +2

COOT: Cooperative Hierarchical Transformer for Video-Text Representation Learning

1 code implementation • NeurIPS 2020 • Simon Ging, Mohammadreza Zolfaghari, Hamed Pirsiavash, Thomas Brox

Many real-world video-text tasks involve different levels of granularity, such as frames and words, clips and sentences, or videos and paragraphs, each with distinct semantics.

Ranked #4 on Video Captioning on ActivityNet Captions

Cross-Modal Retrieval, Representation Learning, +2

