no code implementations • 22 Nov 2024 • Karsten Roth, Zeynep Akata, Dima Damen, Ivana Balažević, Olivier J. Hénaff
Large-scale multimodal representation learning successfully optimizes for zero-shot transfer at test time.
no code implementations • 27 Oct 2024 • Shuchen Wu, Mirko Thalmann, Peter Dayan, Zeynep Akata, Eric Schulz
In contrast, large language models (LLMs) struggle to transfer abstract variables as effectively as humans.
no code implementations • 25 Oct 2024 • Leander Girrbach, Yiran Huang, Stephan Alaniz, Trevor Darrell, Zeynep Akata
Pre-trained large language models (LLMs) have been reliably integrated with visual input for multimodal tasks.
no code implementations • 23 Oct 2024 • Shyamgopal Karthik, Huseyin Coskun, Zeynep Akata, Sergey Tulyakov, Jian Ren, Anil Kag
In this work, we investigate a scalable approach for collecting large-scale and fully synthetic datasets for DPO training.
1 code implementation • 26 Aug 2024 • Karsten Roth, Vishaal Udandarao, Sebastian Dziadzio, Ameya Prabhu, Mehdi Cherti, Oriol Vinyals, Olivier Hénaff, Samuel Albanie, Matthias Bethge, Zeynep Akata
In this work, we complement current perspectives on continual pretraining through a research test bed as well as provide comprehensive guidance for effective continual model updates in such scenarios.
no code implementations • 25 Jul 2024 • Anders Christensen, Nooshin Mojab, Khushman Patel, Karan Ahuja, Zeynep Akata, Ole Winther, Mar Gonzalez-Franco, Andrea Colaco
Spherical or omni-directional images offer an immersive visual format appealing to a wide range of computer vision applications.
1 code implementation • 23 Jul 2024 • Thomas Hummel, Shyamgopal Karthik, Mariana-Iuliana Georgescu, Zeynep Akata
In this work, we introduce EgoCVR, a new evaluation benchmark for fine-grained Composed Video Retrieval using large-scale egocentric video datasets.
1 code implementation • 15 Jul 2024 • Jae Myung Kim, Jessica Bader, Stephan Alaniz, Cordelia Schmid, Zeynep Akata
While text-to-image diffusion models have been shown to achieve state-of-the-art results in image synthesis, they have yet to prove their effectiveness in downstream applications.
no code implementations • 10 Jul 2024 • Théo Uscidda, Luca Eyring, Karsten Roth, Fabian Theis, Zeynep Akata, Marco Cuturi
However, matching the prior while preserving geometric features is challenging, as a mapping that fully preserves these features while aligning the data distribution with the prior does not exist in general.
no code implementations • 3 Jul 2024 • Meghal Dani, Muthu Jeyanthi Prakash, Zeynep Akata, Stefanie Liebe
In summary, our work provides the first extensive benchmark comparing current SOTA LLMs in the medical domain of epilepsy and highlights their ability to leverage unstructured texts from patients' medical history to aid diagnostic processes in health care.
no code implementations • 13 Jun 2024 • Lukas Thede, Karsten Roth, Olivier J. Hénaff, Matthias Bethge, Zeynep Akata
(2) Indeed, we show that, most often, P-RFCL techniques can be matched by a simple and lightweight PEFT baseline.
1 code implementation • 6 Jun 2024 • Luca Eyring, Shyamgopal Karthik, Karsten Roth, Alexey Dosovitskiy, Zeynep Akata
Moreover, given the same computational resources, a ReNO-optimized one-step model outperforms widely-used open-source models such as SDXL and PixArt-$\alpha$, highlighting the efficiency and effectiveness of ReNO in enhancing T2I model performance at inference time.
1 code implementation • 30 May 2024 • Massimo Bini, Karsten Roth, Zeynep Akata, Anna Khoreva
Parameter-efficient finetuning (PEFT) has become ubiquitous to adapt foundation models to downstream task requirements while retaining their generalization ability.
1 code implementation • 2 May 2024 • Nishad Singhi, Jae Myung Kim, Karsten Roth, Zeynep Akata
In this paper, we find that this is noticeably driven by an independent treatment of concepts during intervention, wherein a change of one concept does not influence the use of other ones in the model's final decision.
1 code implementation • 9 Apr 2024 • David Kurzendörfer, Otniel-Bogdan Mercea, A. Sophia Koepke, Zeynep Akata
However, existing benchmarks predate the popularization of large multi-modal models, such as CLIP and CLAP.
no code implementations • 21 Feb 2024 • Adrian Höhl, Ivica Obadic, Miguel Ángel Fernández Torres, Hiba Najjar, Dario Oliveira, Zeynep Akata, Andreas Dengel, Xiao Xiang Zhu
In recent years, black-box machine learning approaches have become a dominant modeling paradigm for knowledge extraction in remote sensing.
no code implementations • 5 Dec 2023 • Marcel Binz, Stephan Alaniz, Adina Roskies, Balazs Aczel, Carl T. Bergstrom, Colin Allen, Daniel Schad, Dirk Wulff, Jevin D. West, Qiong Zhang, Richard M. Shiffrin, Samuel J. Gershman, Ven Popov, Emily M. Bender, Marco Marelli, Matthew M. Botvinick, Zeynep Akata, Eric Schulz
For this opinion piece, we have invited four diverse groups of scientists to reflect on this query, sharing their perspectives and engaging in debate.
1 code implementation • 25 Nov 2023 • Luca Eyring, Dominik Klein, Théo Uscidda, Giovanni Palla, Niki Kilbertus, Zeynep Akata, Fabian Theis
We hence establish UOT-FM as a principled method for unpaired image translation.
1 code implementation • 14 Nov 2023 • Leonard Salewski, Stefan Fauth, A. Sophia Koepke, Zeynep Akata
In particular, our framework exploits a pre-trained large language model (LLM) to generate text, guided by a pre-trained audio-language model, so that the resulting captions describe the audio content (a minimal sketch of this guided decoding follows below).
Ranked #1 on Zero-shot Audio Captioning on Clotho
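The guided decoding above can be illustrated with a minimal sketch, assuming a hypothetical `propose_captions` helper standing in for the pre-trained LLM and a hypothetical `audio_text_similarity` scorer standing in for the audio-language model (e.g. a CLAP-style model); the actual framework guides generation token by token rather than simply reranking finished candidates.

```python
# Hedged sketch: rerank LLM-proposed captions by audio-text similarity.
# `propose_captions` and `audio_text_similarity` are hypothetical stand-ins
# for a pre-trained LLM and a pre-trained audio-language model.
from typing import Callable, List


def caption_audio(
    audio_path: str,
    propose_captions: Callable[[str], List[str]],
    audio_text_similarity: Callable[[str, str], float],
) -> str:
    """Pick the LLM candidate caption that best matches the audio content."""
    candidates = propose_captions("Describe the sound:")            # text-only proposals
    scored = [(audio_text_similarity(audio_path, c), c) for c in candidates]
    return max(scored)[1]                                           # highest audio-text score wins
```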
1 code implementation • 8 Nov 2023 • Leonard Salewski, A. Sophia Koepke, Hendrik P. A. Lensch, Zeynep Akata
Converting a model's internals to text can yield human-understandable insights about the model.
1 code implementation • 26 Oct 2023 • Karsten Roth, Lukas Thede, Almut Sophia Koepke, Oriol Vinyals, Olivier Hénaff, Zeynep Akata
Training deep networks requires various design decisions regarding, for instance, their architecture, data augmentation, or optimization.
1 code implementation • 13 Oct 2023 • Shyamgopal Karthik, Karsten Roth, Massimiliano Mancini, Zeynep Akata
Finally, we show that CIReVL makes CIR human-understandable by composing image and text in a modular fashion in the language domain, thereby making it intervenable and allowing failure cases to be re-aligned post hoc.
Ranked #6 on Zero-Shot Composed Image Retrieval (ZS-CIR) on CIRCO
1 code implementation • 26 Sep 2023 • Thomas Hummel, Otniel-Bogdan Mercea, A. Sophia Koepke, Zeynep Akata
We propose a framework for video-to-adverb retrieval (and vice versa) that aligns video embeddings with their matching compositional adverb-action text embedding in a joint embedding space.
1 code implementation • 7 Sep 2023 • Otniel-Bogdan Mercea, Thomas Hummel, A. Sophia Koepke, Zeynep Akata
Training deep learning models for video classification from audio-visual data commonly requires immense amounts of labeled training data collected via a costly process.
1 code implementation • ICCV 2023 • Robert van der Klis, Stephan Alaniz, Massimiliano Mancini, Cassio F. Dantas, Dino Ienco, Zeynep Akata, Diego Marcos
Fine-grained classification often requires recognizing specific object parts, such as beak shape and wing patterns for birds.
1 code implementation • ICCV 2023 • Stephan Alaniz, Massimiliano Mancini, Zeynep Akata
We propose a framework, ISCO, to recompose an object using 3D superquadrics as semantic parts directly from 2D views without training a model that uses 3D supervision.
1 code implementation • 4 Sep 2023 • Meghal Dani, Isabel Rio-Torto, Stephan Alaniz, Zeynep Akata
We demonstrate that DeViL generates textual descriptions relevant to the image content on CC3M, surpassing previous lightweight captioning models, as well as attribution maps that uncover the learned concepts of the vision backbone.
1 code implementation • ICCV 2023 • Anders Christensen, Massimiliano Mancini, A. Sophia Koepke, Ole Winther, Zeynep Akata
We achieve this with our proposed Image-free Classifier Injection with Semantics (ICIS) that injects classifiers for new, unseen classes into pre-trained classification models in a post-hoc fashion without relying on image data.
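A minimal sketch of such image-free classifier injection, assuming attribute vectors as the class semantics; the weight-generator architecture and dimensions below are illustrative placeholders rather than the paper's exact design.

```python
# Hedged sketch: predict classifier weights for unseen classes from their
# semantic descriptors and append them to a frozen, pre-trained classifier.
import torch
import torch.nn as nn

SEM_DIM, FEAT_DIM = 85, 512                           # illustrative dimensions

weight_generator = nn.Sequential(                     # class semantics -> classifier weights
    nn.Linear(SEM_DIM, 1024), nn.ReLU(), nn.Linear(1024, FEAT_DIM)
)

def inject_classifiers(seen_weights: torch.Tensor, unseen_semantics: torch.Tensor) -> torch.Tensor:
    """Return an extended weight matrix covering seen + unseen classes."""
    unseen_weights = weight_generator(unseen_semantics)           # (num_unseen, FEAT_DIM)
    return torch.cat([seen_weights, unseen_weights], dim=0)       # post-hoc injection

extended = inject_classifiers(torch.randn(100, FEAT_DIM), torch.randn(10, SEM_DIM))
print(extended.shape)                                             # 110 classes in total
```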
1 code implementation • 20 Jul 2023 • Leander Girrbach, Anders Christensen, Ole Winther, Zeynep Akata, A. Sophia Koepke
Whilst this captures useful information for linear classifiers, we find that no relevant spatial structure is present in later layers of deep neural networks, making neural persistence roughly equivalent to the variance of weights.
1 code implementation • ICCV 2023 • Uddeshya Upadhyay, Shyamgopal Karthik, Massimiliano Mancini, Zeynep Akata
We propose ProbVLM, a probabilistic adapter that estimates probability distributions for the embeddings of pre-trained VLMs via inter/intra-modal alignment in a post-hoc manner without needing large-scale datasets or computing.
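As a rough illustration of a post-hoc probabilistic adapter, the sketch below wraps frozen VLM embeddings with a small head that predicts a mean and a per-dimension scale; ProbVLM's actual parameterization and training objective are richer than this simple heteroscedastic stand-in.

```python
# Hedged sketch: turn a deterministic (frozen) VLM embedding into distribution
# parameters -- a mean vector and a per-dimension scale -- so that downstream
# code can reason about embedding uncertainty.
import torch
import torch.nn as nn

class ProbabilisticAdapter(nn.Module):
    def __init__(self, dim: int = 512):
        super().__init__()
        self.mu = nn.Linear(dim, dim)                  # predicted mean embedding
        self.log_scale = nn.Linear(dim, dim)           # predicted log of per-dimension scale

    def forward(self, embedding: torch.Tensor):
        return self.mu(embedding), self.log_scale(embedding).exp()

adapter = ProbabilisticAdapter()
mu, scale = adapter(torch.randn(4, 512))               # embeddings from a frozen VLM go in
print(scale.mean(dim=-1))                               # crude per-sample uncertainty proxy
```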
1 code implementation • ICCV 2023 • Karsten Roth, Jae Myung Kim, A. Sophia Koepke, Oriol Vinyals, Cordelia Schmid, Zeynep Akata
The visual classification performance of vision-language models such as CLIP has been shown to benefit from additional semantic knowledge from large language models (LLMs) such as GPT-3.
no code implementations • 27 May 2023 • Vikrant Rangnekar, Uddeshya Upadhyay, Zeynep Akata, Biplab Banerjee
Dense regression is a widely used approach in computer vision for tasks such as image super-resolution, enhancement, depth estimation, etc.
1 code implementation • NeurIPS 2023 • Leonard Salewski, Stephan Alaniz, Isabel Rio-Torto, Eric Schulz, Zeynep Akata
These findings demonstrate that LLMs are capable of taking on diverse roles and that this in-context impersonation can be used to uncover their hidden strengths and biases.
1 code implementation • 22 May 2023 • Shyamgopal Karthik, Karsten Roth, Massimiliano Mancini, Zeynep Akata
Despite their impressive capabilities, diffusion-based text-to-image (T2I) models can lack faithfulness to the text prompt, where generated images may not contain all the mentioned objects, attributes or relations.
no code implementations • 21 Apr 2023 • Julian Coda-Forno, Kristin Witte, Akshay K. Jagadish, Marcel Binz, Zeynep Akata, Eric Schulz
Anxiety-induction not only influences LLMs' scores on an anxiety questionnaire but also influences their behavior in a previously-established benchmark measuring biases such as racism and ageism.
no code implementations • 6 Apr 2023 • Jae Myung Kim, A. Sophia Koepke, Cordelia Schmid, Zeynep Akata
In this work, we introduce ODmAP@k, an object decorrelation metric that measures a model's robustness to spurious correlations in the training data.
2 code implementations • CVPR 2023 • Youngwook Kim, Jae Myung Kim, Jieun Jeong, Cordelia Schmid, Zeynep Akata, Jungwoo Lee
Based on these findings, we propose to boost the attribution scores of the model trained with partial labels to make its explanation resemble that of the model trained with full labels.
no code implementations • 24 Mar 2023 • Thilo Hagendorff, Ishita Dasgupta, Marcel Binz, Stephanie C. Y. Chan, Andrew Lampinen, Jane X. Wang, Zeynep Akata, Eric Schulz
Large language models (LLMs) show increasingly advanced emergent capabilities and are being incorporated across various societal domains.
no code implementations • 21 Feb 2023 • Uddeshya Upadhyay, Jae Myung Kim, Cordelia Schmid, Bernhard Schölkopf, Zeynep Akata
Recent advances in deep learning have shown that uncertainty estimation is becoming increasingly important in applications such as medical imaging, natural language processing, and autonomous systems.
no code implementations • 15 Dec 2022 • Anurag Das, Yongqin Xian, Yang He, Zeynep Akata, Bernt Schiele
For best performance, today's semantic segmentation methods use large and carefully labeled datasets, requiring expensive annotation budgets.
1 code implementation • 23 Nov 2022 • Yuchen Ma, Yanbei Chen, Zeynep Akata
In this work, we formulate a new knowledge distillation framework to transfer the knowledge from self-supervised pre-trained models to any other student network by a novel approach named Embedding Graph Alignment.
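One plausible reading of aligning embedding graphs is sketched below, assuming the graphs are cosine-similarity matrices computed over a batch; the paper's exact graph construction and alignment loss may differ.

```python
# Hedged sketch: build cosine-similarity graphs over a batch of teacher and
# student embeddings and penalize the difference between the two graphs.
# Because the graphs live in batch space, teacher and student embedding
# dimensions do not need to match.
import torch
import torch.nn.functional as F

def similarity_graph(z: torch.Tensor) -> torch.Tensor:
    z = F.normalize(z, dim=-1)
    return z @ z.t()                                    # (batch, batch) edge weights

def graph_alignment_loss(teacher_z: torch.Tensor, student_z: torch.Tensor) -> torch.Tensor:
    return F.mse_loss(similarity_graph(student_z), similarity_graph(teacher_z))

loss = graph_alignment_loss(torch.randn(16, 2048), torch.randn(16, 512))
```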
no code implementations • 6 Nov 2022 • Zafir Stojanovski, Karsten Roth, Zeynep Akata
Large pre-trained, zero-shot capable models have shown considerable success both for standard transfer and adaptation tasks, with particular robustness towards distribution shifts.
1 code implementation • 25 Oct 2022 • Katrin Renz, Kashyap Chitta, Otniel-Bogdan Mercea, A. Sophia Koepke, Zeynep Akata, Andreas Geiger
Planning an optimal route in a complex environment requires efficient reasoning about the surrounding scene.
Ranked #6 on CARLA longest6 on CARLA
1 code implementation • 19 Oct 2022 • Abhra Chaudhuri, Massimiliano Mancini, Yanbei Chen, Zeynep Akata, Anjan Dutta
Representation learning for sketch-based image retrieval has mostly been tackled by learning embeddings that discard modality-specific information.
1 code implementation • 13 Oct 2022 • Karsten Roth, Mark Ibrahim, Zeynep Akata, Pascal Vincent, Diane Bouchacourt
We show that the use of HFS consistently facilitates disentanglement and recovery of ground-truth factors across a variety of correlation settings and benchmarks, even under severe training correlations and correlation shifts, with in parts over $+60\%$ in relative improvement over existing disentanglement methods.
1 code implementation • 5 Oct 2022 • Abhra Chaudhuri, Massimiliano Mancini, Zeynep Akata, Anjan Dutta
Fine-grained categories that largely share the same set of parts cannot be discriminated based on part information alone, as they mostly differ in the way the local parts relate to the overall global structure of the object.
no code implementations • 6 Sep 2022 • Stephan Alaniz, Thomas Hummel, Zeynep Akata
Semantic image synthesis enables control over unconditional image generation by allowing guidance on what is being generated.
no code implementations • 24 Aug 2022 • Yanbei Chen, Massimiliano Mancini, Xiatian Zhu, Zeynep Akata
Semi-supervised learning and unsupervised learning offer promising paradigms to learn from an abundance of unlabeled visual data.
1 code implementation • 27 Jul 2022 • Stephan Alaniz, Massimiliano Mancini, Anjan Dutta, Diego Marcos, Zeynep Akata
Toward equipping machines with such capabilities, we propose the Primitive-based Sketch Abstraction task where the goal is to represent sketches using a fixed set of drawing primitives under the influence of a budget.
2 code implementations • 20 Jul 2022 • Otniel-Bogdan Mercea, Thomas Hummel, A. Sophia Koepke, Zeynep Akata
We show that our proposed framework that ingests temporal features yields state-of-the-art performance on the UCF-GZSL, VGGSound-GZSL, and ActivityNet-GZSL benchmarks for (generalised) zero-shot learning.
Ranked #2 on GZSL Video Classification on UCF-GZSL(main)
1 code implementation • 14 Jul 2022 • Uddeshya Upadhyay, Shyamgopal Karthik, Yanbei Chen, Massimiliano Mancini, Zeynep Akata
Moreover, many of the high-performing deep learning models that are already trained and deployed are non-Bayesian in nature and do not provide uncertainty estimates.
1 code implementation • 8 Jul 2022 • Michael Kirchhof, Karsten Roth, Zeynep Akata, Enkelejda Kasneci
We model images as directional von Mises-Fisher (vMF) distributions on the hypersphere that can reflect image-intrinsic uncertainties.
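For reference, a minimal sketch of the von Mises-Fisher log-density on the unit hypersphere; in the method itself the mean direction and concentration would be produced by the network, whereas here they are passed in directly.

```python
# Hedged sketch: log-density of a von Mises-Fisher distribution on the unit
# hypersphere, as used to model embedding uncertainty.
import numpy as np
from scipy.special import ive   # exponentially scaled modified Bessel function

def vmf_log_density(x: np.ndarray, mu: np.ndarray, kappa: float) -> float:
    p = x.shape[-1]
    x, mu = x / np.linalg.norm(x), mu / np.linalg.norm(mu)      # project to the sphere
    log_bessel = np.log(ive(p / 2 - 1, kappa)) + kappa          # log I_{p/2-1}(kappa)
    log_norm = (p / 2 - 1) * np.log(kappa) - (p / 2) * np.log(2 * np.pi) - log_bessel
    return log_norm + kappa * float(mu @ x)

print(vmf_log_density(np.ones(8), np.ones(8), kappa=20.0))      # higher kappa = lower uncertainty
```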
1 code implementation • 15 Jun 2022 • Sebastian Bordt, Uddeshya Upadhyay, Zeynep Akata, Ulrike von Luxburg
We propose a criterion: the feature attributions need to be aligned with the tangent space of the data manifold.
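A toy illustration of checking this criterion, under the assumption that the tangent space at a point is estimated by local PCA over neighboring samples; the paper's own manifold estimate may be obtained differently.

```python
# Hedged sketch: estimate the data manifold's tangent space at a point via
# local PCA and measure the fraction of an attribution vector lying in it.
import numpy as np

def tangent_alignment(point: np.ndarray, neighbors: np.ndarray,
                      attribution: np.ndarray, dim: int = 2) -> float:
    centered = neighbors - point                               # local chart around the point
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    tangent_basis = vt[:dim]                                   # top directions span the tangent space
    projected = tangent_basis.T @ (tangent_basis @ attribution)
    return float(np.linalg.norm(projected) / (np.linalg.norm(attribution) + 1e-12))

rng = np.random.default_rng(0)
x = rng.normal(size=5)
nbrs = x + rng.normal(scale=0.1, size=(20, 5))
print(tangent_alignment(x, nbrs, rng.normal(size=5)))          # 1.0 means fully aligned
```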
no code implementations • 13 Jun 2022 • Stephan Alaniz, Marco Federici, Zeynep Akata
Learning a common representation space between vision and language allows deep networks to relate objects in the image to the corresponding semantic meaning.
1 code implementation • CVPR 2022 • Youngwook Kim, Jae Myung Kim, Zeynep Akata, Jungwoo Lee
In this work, we first regard unobserved labels as negative labels, casting the WSML task into noisy multi-label classification.
1 code implementation • CVPR 2022 • Shyamgopal Karthik, Massimiliano Mancini, Zeynep Akata
The goal of open-world compositional zero-shot learning (OW-CZSL) is to recognize compositions of states and objects in images, given only a subset of them during training and no prior on the unseen compositions.
1 code implementation • 27 Apr 2022 • Ilke Cugu, Massimiliano Mancini, Yanbei Chen, Zeynep Akata
Generalizing visual recognition models trained on a single distribution to unseen input distributions (i.e. domains) requires making them robust to superfluous correlations in the training set.
1 code implementation • 12 Apr 2022 • Andrei Neculai, Yanbei Chen, Zeynep Akata
Without bells and whistles, we show that our probabilistic model formulation significantly outperforms existing related methods on multimodal image retrieval while generalizing well to queries with different numbers of inputs given in arbitrary visual and/or textual modalities.
1 code implementation • 5 Apr 2022 • Leonard Salewski, A. Sophia Koepke, Hendrik P. A. Lensch, Zeynep Akata
We present baseline results for generating natural language explanations in the context of VQA using two state-of-the-art frameworks on the CLEVR-X dataset.
Ranked #1 on Explanation Generation on CLEVR-X
no code implementations • 4 Apr 2022 • Wenjia Xu, Yongqin Xian, Jiuniu Wang, Bernt Schiele, Zeynep Akata
While a visual-semantic embedding layer learns global features, local features are learned through an attribute prototype network that simultaneously regresses and decorrelates attributes from intermediate features.
Ranked #5 on GZSL Video Classification on ActivityNet-GZSL(main)
1 code implementation • CVPR 2022 • Wenjia Xu, Yongqin Xian, Jiuniu Wang, Bernt Schiele, Zeynep Akata
Our model visually divides a set of images from seen classes into clusters of local image regions according to their visual similarity, and further imposes their class discrimination and semantic relatedness.
1 code implementation • CVPR 2022 • Karsten Roth, Oriol Vinyals, Zeynep Akata
Deep Metric Learning (DML) aims to learn representation spaces on which semantic relations can simply be expressed through predefined distance metrics.
Ranked #11 on Metric Learning on CUB-200-2011 (using extra training data)
1 code implementation • CVPR 2022 • Karsten Roth, Oriol Vinyals, Zeynep Akata
This causes learned embedding spaces to encode incomplete semantic context and misrepresent the semantic relation between classes, impacting the generalizability of the learned metric space.
Ranked #8 on Metric Learning on CARS196 (using extra training data)
1 code implementation • CVPR 2022 • Otniel-Bogdan Mercea, Lukas Riesch, A. Sophia Koepke, Zeynep Akata
Focusing on the relatively underexplored task of audio-visual zero-shot learning, we propose to learn multi-modal representations from audio-visual data using cross-modal attention and exploit textual label embeddings for transferring knowledge from seen classes to unseen classes.
Ranked #1 on ZSL Video Classification on UCF-GZSL (cls)
no code implementations • 17 Jan 2022 • Ushasi Chaudhuri, Ruchika Chavan, Biplab Banerjee, Anjan Dutta, Zeynep Akata
The efficacy of zero-shot sketch-based image retrieval (ZS-SBIR) models is governed by two challenges.
1 code implementation • 17 Dec 2021 • A. Sophia Koepke, Andreea-Maria Oncescu, João F. Henriques, Zeynep Akata, Samuel Albanie
Additionally, we introduce the SoundDescs benchmark, which consists of paired audio and natural language descriptions for a diverse collection of sounds that are complementary to those found in AudioCaps and Clotho.
Ranked #1 on Audio to Text Retrieval on SoundDescs
1 code implementation • 2 Nov 2021 • Yao Rong, Wenjia Xu, Zeynep Akata, Enkelejda Kasneci
The way humans attend to, process and classify a given image has the potential to vastly benefit the performance of deep learning models.
Ranked #44 on Fine-Grained Image Classification on CUB-200-2011
1 code implementation • NeurIPS 2021 • Uddeshya Upadhyay, Yanbei Chen, Zeynep Akata
Unpaired image-to-image translation refers to learning inter-image-domain mapping without corresponding image pairs.
no code implementations • 18 Oct 2021 • Lennart Alexander Van der Goten, Tobias Hepp, Zeynep Akata, Kevin Smith
Solutions have been developed to de-identify diagnostic scans by obfuscating or removing parts of the face.
no code implementations • 29 Sep 2021 • Jae Myung Kim, Eunji Kim, Sungroh Yoon, Jungwoo Lee, Cordelia Schmid, Zeynep Akata
Explaining a complex black-box system in a post-hoc manner is important to understand its predictions.
1 code implementation • NeurIPS 2021 • Sarkhan Badirli, Zeynep Akata, George Mohler, Christine Picard, Murat Dundar
The fine-grained zero-shot learning task requires some form of side-information to transfer discriminative information from seen to unseen classes.
no code implementations • 19 Aug 2021 • Anjan Dutta, Massimiliano Mancini, Zeynep Akata
Existing self-supervised learning methods learn representations by means of pretext tasks that are either (1) discriminative, explicitly specifying which features should be separated, or (2) aligning, precisely indicating which features should be brought together, but they ignore how to jointly define, in a principled way, which features should be repelled and which should be attracted.
1 code implementation • 29 Jun 2021 • Uddeshya Upadhyay, Yanbei Chen, Tobias Hepp, Sergios Gatidis, Zeynep Akata
However, state-of-the-art GAN-based frameworks do not estimate the uncertainty in the network's predictions, which is essential for making informed medical decisions and subsequent revision by medical experts, and which has recently been shown to improve the performance and interpretability of the model.
1 code implementation • ICCV 2021 • Jae Myung Kim, Junsuk Choe, Zeynep Akata, Seong Joon Oh
The class activation mapping, or CAM, has been the cornerstone of feature attribution methods for multiple vision tasks.
2 code implementations • ICCV 2021 • Maxime Kayser, Oana-Maria Camburu, Leonard Salewski, Cornelius Emde, Virginie Do, Zeynep Akata, Thomas Lukasiewicz
e-ViL is a benchmark for explainable vision-language tasks that establishes a unified evaluation framework and provides the first comprehensive comparison of existing approaches that generate NLEs for VL tasks.
1 code implementation • 5 May 2021 • Andreea-Maria Oncescu, A. Sophia Koepke, João F. Henriques, Zeynep Akata, Samuel Albanie
We consider the task of retrieving audio using free-form natural language queries.
Ranked #1 on Audio/Video to Text Retrieval on AudioCaps
no code implementations • 4 May 2021 • Yanbei Chen, Thomas Hummel, A. Sophia Koepke, Zeynep Akata
Recent advances in XAI provide explanations for models trained on still images.
2 code implementations • 3 May 2021 • Massimiliano Mancini, Muhammad Ferjad Naeem, Yongqin Xian, Zeynep Akata
In this work, we overcome this assumption by operating in the open-world setting, where no limit is imposed on the compositional space at test time and the search space contains a large number of unseen compositions.
1 code implementation • CVPR 2021 • Yanbei Chen, Yongqin Xian, A. Sophia Koepke, Ying Shan, Zeynep Akata
Having access to multi-modal cues (e.g. vision and audio) empowers some cognitive tasks to be done faster compared to learning from a single modality.
1 code implementation • 21 Apr 2021 • Giuseppe Pastore, Fabio Cermelli, Yongqin Xian, Massimiliano Mancini, Zeynep Akata, Barbara Caputo
Being able to segment unseen classes not observed during training is an important technical challenge in deep learning, because of its potential to reduce the expensive annotation required for semantic segmentation.
Ranked #8 on Zero-Shot Semantic Segmentation on PASCAL VOC
1 code implementation • 23 Feb 2021 • Uddeshya Upadhyay, Yanbei Chen, Zeynep Akata
Unpaired image-to-image translation refers to learning inter-image-domain mapping in an unsupervised manner.
1 code implementation • CVPR 2021 • Muhammad Ferjad Naeem, Yongqin Xian, Federico Tombari, Zeynep Akata
In compositional zero-shot learning, the goal is to recognize unseen compositions (e.g. old dog) of observed visual primitives, i.e. states (e.g. old, cute) and objects (e.g. car, dog), from the training set.
2 code implementations • CVPR 2021 • Massimiliano Mancini, Muhammad Ferjad Naeem, Yongqin Xian, Zeynep Akata
After estimating the feasibility score of each composition, we use these scores to either directly mask the output space or as a margin for the cosine similarity between visual features and compositional embeddings during training.
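The two uses of the feasibility scores can be sketched as follows; the threshold and margin weight are illustrative assumptions rather than the paper's settings.

```python
# Hedged sketch of the two uses of per-composition feasibility scores:
# (1) mask infeasible compositions out of the output space at test time,
# (2) use the score as a margin on cosine similarities during training.
import torch

def mask_output_space(cosine_sim: torch.Tensor, feasibility: torch.Tensor,
                      threshold: float = 0.0) -> torch.Tensor:
    """Suppress compositions whose feasibility falls below a threshold."""
    return cosine_sim.masked_fill(feasibility < threshold, float("-inf"))

def margin_logits(cosine_sim: torch.Tensor, feasibility: torch.Tensor,
                  alpha: float = 0.4) -> torch.Tensor:
    """Shift each composition's similarity by a feasibility-dependent margin."""
    return cosine_sim + alpha * feasibility

sims = torch.randn(2, 1000)                       # similarities to 1000 candidate compositions
feas = torch.rand(1000) * 2 - 1                   # feasibility scores in [-1, 1]
print(mask_output_space(sims, feas).shape, margin_logits(sims, feas).shape)
```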
no code implementations • 1 Jan 2021 • Lennart Alexander Van der Goten, Tobias Hepp, Zeynep Akata, Kevin Smith
De-identification of magnetic resonance imagery (MRI) is intrinsically difficult since, even with all metadata removed, a person's face can easily be rendered and matched against a database.
1 code implementation • 30 Nov 2020 • Fabio Cermelli, Massimiliano Mancini, Yongqin Xian, Zeynep Akata, Barbara Caputo
Semantic segmentation models have two fundamental weaknesses: i) they require large training sets with costly pixel-level annotations, and ii) they have a static output space, constrained to the classes of the training set.
no code implementations • NeurIPS 2020 • Wenjia Xu, Yongqin Xian, Jiuniu Wang, Bernt Schiele, Zeynep Akata
As an additional benefit, our model points to the visual evidence of the attributes in an image, e.g. for the CUB dataset, confirming the improved attribute localization ability of our image representation.
1 code implementation • ECCV 2020 • Massimiliano Mancini, Zeynep Akata, Elisa Ricci, Barbara Caputo
The key idea of CuMix is to simulate the test-time domain and semantic shift using images and features from unseen domains and categories generated by mixing up the multiple source domains and categories available during training.
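A minimal sketch of the mixing step, assuming plain mixup between two batches drawn from different source domains; the curriculum schedule and the additional feature-level mixing of the full method are omitted here.

```python
# Hedged sketch: mix samples (and their one-hot labels) drawn from different
# source domains and categories to simulate unseen domain/semantic shift.
import torch

def cross_domain_mixup(x_a, y_a, x_b, y_b, alpha: float = 2.0):
    lam = torch.distributions.Beta(alpha, alpha).sample()
    x_mix = lam * x_a + (1 - lam) * x_b                 # mixed images (or features)
    y_mix = lam * y_a + (1 - lam) * y_b                 # mixed soft labels
    return x_mix, y_mix

x_a, x_b = torch.randn(8, 3, 64, 64), torch.randn(8, 3, 64, 64)   # two source domains
y_a = torch.eye(10)[torch.randint(10, (8,))]
y_b = torch.eye(10)[torch.randint(10, (8,))]
print(cross_domain_mixup(x_a, y_a, x_b, y_b)[0].shape)
```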
1 code implementation • 9 Jul 2020 • Yongqin Xian, Bruno Korbar, Matthijs Douze, Lorenzo Torresani, Bernt Schiele, Zeynep Akata
Few-shot learning aims to recognize novel classes from a few examples.
2 code implementations • 8 Jul 2020 • Junsuk Choe, Seong Joon Oh, Sanghyuk Chun, Seungho Lee, Zeynep Akata, Hyunjung Shim
In this paper, we argue that the WSOL task is ill-posed with only image-level labels, and propose a new evaluation protocol where full supervision is limited to only a small held-out set not overlapping with the test set.
no code implementations • 20 Jun 2020 • Anjan Dutta, Zeynep Akata
Low-shot sketch-based image retrieval is an emerging task in computer vision, allowing the retrieval of natural images relevant to hand-drawn sketch queries that are rarely seen during the training phase.
1 code implementation • 20 Jun 2020 • Yao Rong, Zeynep Akata, Enkelejda Kasneci
Numerous car accidents are caused by improper driving maneuvers.
3 code implementations • 7 Apr 2020 • Virginie Do, Oana-Maria Camburu, Zeynep Akata, Thomas Lukasiewicz
The recently proposed SNLI-VE corpus for recognising visual-textual entailment is a large, real-world dataset for fine-grained multimodal reasoning.
3 code implementations • ICLR 2020 • Marco Federici, Anjan Dutta, Patrick Forré, Nate Kushman, Zeynep Akata
This enables us to identify superfluous information as that not shared by both views.
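In the standard multi-view setting, with a representation $z_1$ computed from view $v_1$ alone (notation ours, given here only as a sketch of the argument), the chain rule of mutual information gives

$$I(z_1; v_1) = I(z_1; v_2) + I(z_1; v_1 \mid v_2),$$

where the conditional term is the view-specific, superfluous part that a robust representation can discard while keeping the information shared by both views.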
2 code implementations • CVPR 2020 • Junsuk Choe, Seong Joon Oh, Seungho Lee, Sanghyuk Chun, Zeynep Akata, Hyunjung Shim
In this paper, we argue that the WSOL task is ill-posed with only image-level labels, and propose a new evaluation protocol where full supervision is limited to only a small held-out set not overlapping with the test set.
1 code implementation • 15 Oct 2019 • Sadaf Gulshad, Zeynep Akata, Jan Hendrik Metzen, Arnold Smeulders
We study the changes in attributes for clean as well as adversarial images in both standard and adversarially robust networks.
1 code implementation • NeurIPS 2019 • Rodolfo Corona, Stephan Alaniz, Zeynep Akata
An agent who interacts with a wide population of other agents needs to be aware that there may be variations in their understanding of the world.
no code implementations • 22 Jul 2019 • Xiahan Shi, Leonard Salewski, Martin Schiegg, Zeynep Akata, Max Welling
Instead, we consider the extended setup of generalized few-shot learning (GFSL), where the model is required to perform classification on the joint label space consisting of both previously seen and novel classes.
1 code implementation • 22 Jul 2019 • Sarkhan Badirli, Zeynep Akata, Murat Dundar
Object classes that surround us have a natural tendency to emerge at varying levels of abstraction.
1 code implementation • NeurIPS 2019 • Victor Garcia Satorras, Zeynep Akata, Max Welling
A graphical model is a structured representation of the data generating process.
1 code implementation • 17 Apr 2019 • Sadaf Gulshad, Jan Hendrik Metzen, Arnold Smeulders, Zeynep Akata
The vulnerability of deep computer vision systems to imperceptible and carefully crafted noise has raised questions regarding the robustness of their decisions.
no code implementations • CVPR 2019 • Yongqin Xian, Saurabh Sharma, Bernt Schiele, Zeynep Akata
When labeled training data is scarce, a promising data augmentation approach is to generate visual features of unknown classes using their attributes.
Ranked #4 on Generalized Zero-Shot Learning on SUN Attribute
no code implementations • ICLR Workshop LLD 2019 • Edgar Schönfeld, Sayna Ebrahimi, Samarth Sinha, Trevor Darrell, Zeynep Akata
While following the same direction, we also take artificial feature generation one step further and propose a model where a shared latent space of image features and class embeddings is learned by aligned variational autoencoders, for the purpose of generating latent features to train a softmax classifier.
1 code implementation • CVPR 2019 • Anjan Dutta, Zeynep Akata
Existing works either require aligned sketch-image pairs or an inefficient memory fusion layer for mapping the visual information to a semantic space.
no code implementations • CVPR 2021 • Stephan Alaniz, Diego Marcos, Bernt Schiele, Zeynep Akata
Integrating interpretability without sacrificing the prediction accuracy of decision-making algorithms has the potential to greatly improve their value to the user.
no code implementations • 1 Feb 2019 • Laurens Weitkamp, Elise van der Pol, Zeynep Akata
Due to the capability of deep learning to perform well in high dimensional problems, deep reinforcement learning agents perform well in challenging tasks such as Atari 2600 games.
2 code implementations • 5 Dec 2018 • Edgar Schönfeld, Sayna Ebrahimi, Samarth Sinha, Trevor Darrell, Zeynep Akata
Many approaches in generalized zero-shot learning rely on cross-modal mapping between the image feature space and the class embedding space.
Ranked #2 on Generalized Few-Shot Learning on AwA2
no code implementations • 22 Aug 2018 • Levent Karacan, Zeynep Akata, Aykut Erdem, Erkut Erdem
In this study, we explore building a two-stage framework for enabling users to directly manipulate high-level attributes of a natural scene.
2 code implementations • ECCV 2018 • Jinkyu Kim, Anna Rohrbach, Trevor Darrell, John Canny, Zeynep Akata
Finally, we explore a version of our model that generates rationalizations, and compare with introspective explanations on the same video segments.
no code implementations • ECCV 2018 • Lisa Anne Hendricks, Ronghang Hu, Trevor Darrell, Zeynep Akata
Our model improves the textual explanation quality of fine-grained classification decisions on the CUB dataset by mentioning phrases that are grounded in the image.
no code implementations • 26 Jun 2018 • Lisa Anne Hendricks, Ronghang Hu, Trevor Darrell, Zeynep Akata
We call such textual explanations counterfactual explanations, and propose an intuitive method to generate counterfactual explanations by inspecting which evidence in an input is missing, but might contribute to a different classification decision if present in the image.
no code implementations • 24 May 2018 • Mevlana Gemici, Zeynep Akata, Max Welling
We introduce Primal-Dual Wasserstein GAN, a new learning algorithm for building latent variable models of the data distribution based on the primal and the dual formulations of the optimal transport (OT) problem.
1 code implementation • CVPR 2018 • Dong Huk Park, Lisa Anne Hendricks, Zeynep Akata, Anna Rohrbach, Bernt Schiele, Trevor Darrell, Marcus Rohrbach
We propose a multimodal approach to explanation, and argue that the two modalities provide complementary explanatory strengths.
4 code implementations • CVPR 2018 • Yongqin Xian, Tobias Lorenz, Bernt Schiele, Zeynep Akata
Suffering from the extreme training data imbalance between seen and unseen classes, most existing state-of-the-art approaches fail to achieve satisfactory results for the challenging generalized zero-shot learning task.
Ranked #6 on Generalized Zero-Shot Learning on SUN Attribute
no code implementations • 17 Nov 2017 • Dong Huk Park, Lisa Anne Hendricks, Zeynep Akata, Anna Rohrbach, Bernt Schiele, Trevor Darrell, Marcus Rohrbach
We also introduce a multimodal methodology for generating visual and textual explanations simultaneously.
no code implementations • 17 Nov 2017 • Lisa Anne Hendricks, Ronghang Hu, Trevor Darrell, Zeynep Akata
Existing models which generate textual explanations enforce task relevance through a discriminative term loss function, but such mechanisms only weakly constrain mentioned object parts to actually be present in the image.
10 code implementations • 3 Jul 2017 • Yongqin Xian, Christoph H. Lampert, Bernt Schiele, Zeynep Akata
Due to the importance of zero-shot learning, i.e. classifying images where there is a lack of labeled training data, the number of proposed approaches has recently increased steadily.
1 code implementation • CVPR 2017 • Yongqin Xian, Bernt Schiele, Zeynep Akata
Due to the importance of zero-shot learning, the number of proposed approaches has increased steadily recently.
no code implementations • CVPR 2017 • Seong Joon Oh, Rodrigo Benenson, Anna Khoreva, Zeynep Akata, Mario Fritz, Bernt Schiele
We show how to combine both information sources in order to recover 80% of the fully supervised performance, which is the new state of the art in weakly supervised training for pixel-wise semantic labelling.
Ranked #26 on Semantic Segmentation on PASCAL VOC 2012 val
no code implementations • 14 Dec 2016 • Dong Huk Park, Lisa Anne Hendricks, Zeynep Akata, Bernt Schiele, Trevor Darrell, Marcus Rohrbach
In contrast, humans can justify their decisions with natural language and point to the evidence in the visual world which led to their decisions.
no code implementations • 1 Dec 2016 • Levent Karacan, Zeynep Akata, Aykut Erdem, Erkut Erdem
Automatic image synthesis research has been rapidly growing with deep networks getting more and more expressive.
no code implementations • CVPR 2017 • Nour Karessli, Zeynep Akata, Bernt Schiele, Andreas Bulling
Zero-shot image classification using auxiliary information, such as attributes describing discriminative object properties, requires time-consuming annotation by domain experts.
no code implementations • NeurIPS 2016 • Scott Reed, Zeynep Akata, Santosh Mohan, Samuel Tenka, Bernt Schiele, Honglak Lee
Generative Adversarial Networks (GANs) have recently demonstrated the capability to synthesize compelling real-world images, such as room interiors, album covers, manga, faces, birds, and flowers.
Ranked #14 on Text-to-Image Generation on CUB (using extra training data)
9 code implementations • CVPR 2016 • Scott Reed, Zeynep Akata, Bernt Schiele, Honglak Lee
State-of-the-art methods for zero-shot visual recognition formulate learning as a joint embedding problem of images and side information.
39 code implementations • 17 May 2016 • Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee
Automatic synthesis of realistic images from text would be interesting and useful, but current AI systems are still far from this goal.
no code implementations • CVPR 2016 • Zeynep Akata, Mateusz Malinowski, Mario Fritz, Bernt Schiele
A promising research direction is zero-shot learning, which does not require any training data to recognize new classes, but rather relies on some form of auxiliary information describing the new classes.
no code implementations • CVPR 2016 • Yongqin Xian, Zeynep Akata, Gaurav Sharma, Quynh Nguyen, Matthias Hein, Bernt Schiele
We train the model with a ranking based objective function which penalizes incorrect rankings of the true class for a given image.
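A compact sketch of such a ranking-based objective, using a single bilinear compatibility function for illustration; the paper learns a more expressive compatibility than this single bilinear map.

```python
# Hedged sketch: hinge-style ranking loss that penalizes classes whose
# compatibility with the image exceeds that of the true class minus a margin.
import torch

def ranking_loss(img_feat, class_embeds, true_idx, W, margin: float = 1.0):
    scores = img_feat @ W @ class_embeds.t()                # compatibility with every class
    violations = torch.clamp(margin + scores - scores[true_idx], min=0.0)
    mask = torch.ones_like(violations)
    mask[true_idx] = 0.0                                    # don't penalize the true class itself
    return (violations * mask).sum()

W = torch.randn(512, 85, requires_grad=True)                # image-dim x class-embedding-dim
loss = ranking_loss(torch.randn(512), torch.randn(50, 85), true_idx=3, W=W)
loss.backward()                                             # gradients flow into W
```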
no code implementations • 28 Mar 2016 • Lisa Anne Hendricks, Zeynep Akata, Marcus Rohrbach, Jeff Donahue, Bernt Schiele, Trevor Darrell
Clearly explaining a rationale for a classification decision to an end-user can be as important as the decision itself.
2 code implementations • 30 Mar 2015 • Zeynep Akata, Florent Perronnin, Zaid Harchaoui, Cordelia Schmid
Attributes act as intermediate representations that enable parameter sharing between classes, a must when training data is scarce.
Ranked #7 on Multi-label zero-shot learning on Open Images V4
2 code implementations • CVPR 2015 • Zeynep Akata, Scott Reed, Daniel Walter, Honglak Lee, Bernt Schiele
Image classification has advanced significantly in recent years with the availability of large-scale image sets.
no code implementations • CVPR 2013 • Zeynep Akata, Florent Perronnin, Zaid Harchaoui, Cordelia Schmid
The label embedding framework offers other advantages such as the ability to leverage alternative sources of information in addition to attributes (e.g. class hierarchies) or to transition smoothly from zero-shot learning to learning with large quantities of data.