Search Results for author: Massimiliano Mancini

Found 45 papers, 29 papers with code

OpenBias: Open-set Bias Detection in Text-to-Image Generative Models

no code implementations 11 Apr 2024 Moreno D'Incà, Elia Peruzzo, Massimiliano Mancini, Dejia Xu, Vidit Goel, Xingqian Xu, Zhangyang Wang, Humphrey Shi, Nicu Sebe

In this paper, we tackle the challenge of open-set bias detection in text-to-image generative models, presenting OpenBias, a new pipeline that identifies and quantifies the severity of biases agnostically, without access to any precompiled set.

MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning

1 code implementation 8 Apr 2024 Matteo Farina, Massimiliano Mancini, Elia Cunegatti, Gaowen Liu, Giovanni Iacca, Elisa Ricci

In this challenging setting, the transferable representations already encoded in the pretrained model are a key aspect to preserve.

Transfer Learning

Vision-by-Language for Training-Free Compositional Image Retrieval

1 code implementation 13 Oct 2023 Shyamgopal Karthik, Karsten Roth, Massimiliano Mancini, Zeynep Akata

Finally, we show that CIReVL makes CIR human-understandable by composing image and text in a modular fashion in the language domain, thereby making it intervenable and allowing failure cases to be re-aligned post hoc.

Image Retrieval Retrieval +1

Iterative Superquadric Recomposition of 3D Objects from Multiple Views

1 code implementation ICCV 2023 Stephan Alaniz, Massimiliano Mancini, Zeynep Akata

We propose a framework, ISCO, to recompose an object using 3D superquadrics as semantic parts directly from 2D views without training a model that uses 3D supervision.

Inductive Bias Object

Image-free Classifier Injection for Zero-Shot Classification

1 code implementation ICCV 2023 Anders Christensen, Massimiliano Mancini, A. Sophia Koepke, Ole Winther, Zeynep Akata

We achieve this with our proposed Image-free Classifier Injection with Semantics (ICIS) that injects classifiers for new, unseen classes into pre-trained classification models in a post-hoc fashion without relying on image data.

Classification Image Classification +1

On the Effectiveness of LayerNorm Tuning for Continual Learning in Vision Transformers

1 code implementation 18 Aug 2023 Thomas De Min, Massimiliano Mancini, Karteek Alahari, Xavier Alameda-Pineda, Elisa Ricci

State-of-the-art rehearsal-free continual learning methods exploit the peculiarities of Vision Transformers to learn task-specific prompts, drastically reducing catastrophic forgetting.

Continual Learning Transfer Learning

ProbVLM: Probabilistic Adapter for Frozen Vision-Language Models

1 code implementation ICCV 2023 Uddeshya Upadhyay, Shyamgopal Karthik, Massimiliano Mancini, Zeynep Akata

We propose ProbVLM, a probabilistic adapter that estimates probability distributions for the embeddings of pre-trained VLMs via inter/intra-modal alignment in a post-hoc manner without needing large-scale datasets or computing.

Active Learning Model Selection +1

Vocabulary-free Image Classification

1 code implementation NeurIPS 2023 Alessandro Conti, Enrico Fini, Massimiliano Mancini, Paolo Rota, Yiming Wang, Elisa Ricci

We thus formalize a novel task, termed as Vocabulary-free Image Classification (VIC), where we aim to assign to an input image a class that resides in an unconstrained language-induced semantic space, without the prerequisite of a known vocabulary.

Classification Image Classification +4

If at First You Don't Succeed, Try, Try Again: Faithful Diffusion-based Text-to-Image Generation by Selection

1 code implementation 22 May 2023 Shyamgopal Karthik, Karsten Roth, Massimiliano Mancini, Zeynep Akata

Despite their impressive capabilities, diffusion-based text-to-image (T2I) models can lack faithfulness to the text prompt, where generated images may not contain all the mentioned objects, attributes or relations.

Text-to-Image Generation

Cross-Modal Fusion Distillation for Fine-Grained Sketch-Based Image Retrieval

1 code implementation 19 Oct 2022 Abhra Chaudhuri, Massimiliano Mancini, Yanbei Chen, Zeynep Akata, Anjan Dutta

Representation learning for sketch-based image retrieval has mostly been tackled by learning embeddings that discard modality-specific information.

Cross-Modal Retrieval Knowledge Distillation +3

Relational Proxies: Emergent Relationships as Fine-Grained Discriminators

1 code implementation 5 Oct 2022 Abhra Chaudhuri, Massimiliano Mancini, Zeynep Akata, Anjan Dutta

Fine-grained categories that largely share the same set of parts cannot be discriminated based on part information alone, as they mostly differ in the way the local parts relate to the overall global structure of the object.

Semi-Supervised and Unsupervised Deep Visual Learning: A Survey

no code implementations 24 Aug 2022 Yanbei Chen, Massimiliano Mancini, Xiatian Zhu, Zeynep Akata

Semi-supervised learning and unsupervised learning offer promising paradigms to learn from an abundance of unlabeled visual data.

Abstracting Sketches through Simple Primitives

1 code implementation 27 Jul 2022 Stephan Alaniz, Massimiliano Mancini, Anjan Dutta, Diego Marcos, Zeynep Akata

Toward equipping machines with such capabilities, we propose the Primitive-based Sketch Abstraction task where the goal is to represent sketches using a fixed set of drawing primitives under the influence of a budget.

Retrieval Sketch-Based Image Retrieval +1

BayesCap: Bayesian Identity Cap for Calibrated Uncertainty in Frozen Neural Networks

1 code implementation 14 Jul 2022 Uddeshya Upadhyay, Shyamgopal Karthik, Yanbei Chen, Massimiliano Mancini, Zeynep Akata

Moreover, many of the high-performing deep learning models that are already trained and deployed are non-Bayesian in nature and do not provide uncertainty estimates.

Autonomous Driving Deblurring +2

KG-SP: Knowledge Guided Simple Primitives for Open World Compositional Zero-Shot Learning

1 code implementation CVPR 2022 Shyamgopal Karthik, Massimiliano Mancini, Zeynep Akata

The goal of open-world compositional zero-shot learning (OW-CZSL) is to recognize compositions of state and objects in images, given only a subset of them during training and no prior on the unseen compositions.

Compositional Zero-Shot Learning Missing Labels

Attention Consistency on Visual Corruptions for Single-Source Domain Generalization

1 code implementation 27 Apr 2022 Ilke Cugu, Massimiliano Mancini, Yanbei Chen, Zeynep Akata

Generalizing visual recognition models trained on a single distribution to unseen input distributions (i.e., domains) requires making them robust to superfluous correlations in the training set.

Domain Generalization

Modeling the Background for Incremental and Weakly-Supervised Semantic Segmentation

1 code implementation 31 Jan 2022 Fabio Cermelli, Massimiliano Mancini, Samuel Rota Bulò, Elisa Ricci, Barbara Caputo

To tackle these issues, we introduce a novel incremental class learning approach for semantic segmentation taking into account a peculiar aspect of this task: since each training step provides annotation only for a subset of all possible classes, pixels of the background class exhibit a semantic shift.

Segmentation Weakly supervised segmentation +2

Concurrent Discrimination and Alignment for Self-Supervised Feature Learning

no code implementations 19 Aug 2021 Anjan Dutta, Massimiliano Mancini, Zeynep Akata

Existing self-supervised learning methods learn representations by means of pretext tasks that are either (1) discriminating, explicitly specifying which features should be separated, or (2) aligning, precisely indicating which features should be close together, but they ignore how to jointly and in a principled way define which features should be repelled and which should be attracted.

Self-Supervised Learning Semantic Segmentation +1

On the Challenges of Open World Recognition under Shifting Visual Domains

1 code implementation 9 Jul 2021 Dario Fontanel, Fabio Cermelli, Massimiliano Mancini, Barbara Caputo

Robotic visual systems operating in the wild must act in unconstrained scenarios, under different environmental conditions while facing a variety of semantic concepts, including unknown ones.

Domain Generalization Object Recognition

Detecting Anomalies in Semantic Segmentation with Prototypes

1 code implementation 1 Jun 2021 Dario Fontanel, Fabio Cermelli, Massimiliano Mancini, Barbara Caputo

The current state of the art in anomaly segmentation uses generative models, exploiting their inability to reconstruct patterns unseen during training.

Segmentation Semantic Segmentation

Learning Graph Embeddings for Open World Compositional Zero-Shot Learning

2 code implementations 3 May 2021 Massimiliano Mancini, Muhammad Ferjad Naeem, Yongqin Xian, Zeynep Akata

In this work, we overcome this assumption by operating in the open world setting, where no limit is imposed on the compositional space at test time, and the search space contains a large number of unseen compositions.

Compositional Zero-Shot Learning

Cluster-driven Graph Federated Learning over Multiple Domains

no code implementations 29 Apr 2021 Debora Caldarola, Massimiliano Mancini, Fabio Galasso, Marco Ciccone, Emanuele Rodolà, Barbara Caputo

Clustering may reduce heterogeneity by identifying the domains, but it deprives each cluster model of the data and supervision of others.

Clustering Federated Learning

A Closer Look at Self-training for Zero-Label Semantic Segmentation

1 code implementation 21 Apr 2021 Giuseppe Pastore, Fabio Cermelli, Yongqin Xian, Massimiliano Mancini, Zeynep Akata, Barbara Caputo

Being able to segment unseen classes not observed during training is an important technical challenge in deep learning, because of its potential to reduce the expensive annotation required for semantic segmentation.

Segmentation Semantic Segmentation

Inferring Latent Domains for Unsupervised Deep Domain Adaptation

no code implementations 25 Mar 2021 Massimiliano Mancini, Lorenzo Porzi, Samuel Rota Bulò, Barbara Caputo, Elisa Ricci

Most deep UDA approaches operate in a single-source, single-target scenario, i.e., they assume that the source and the target samples arise from a single distribution.

Unsupervised Domain Adaptation

Boosting Binary Masks for Multi-Domain Learning through Affine Transformations

no code implementations 25 Mar 2021 Massimiliano Mancini, Elisa Ricci, Barbara Caputo, Samuel Rota Bulò

In this work, we provide a general formulation of binary mask based models for multi-domain learning by affine transformations of the original network parameters.

Open World Compositional Zero-Shot Learning

2 code implementations CVPR 2021 Massimiliano Mancini, Muhammad Ferjad Naeem, Yongqin Xian, Zeynep Akata

After estimating the feasibility score of each composition, we use these scores to either directly mask the output space or as a margin for the cosine similarity between visual features and compositional embeddings during training.

Compositional Zero-Shot Learning
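
The masking variant mentioned in the snippet above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the toy embeddings, the feasibility values, and the threshold of 0.0 are all assumptions made for the example. It scores each state-object composition by cosine similarity to the image feature and skips compositions whose feasibility score falls below the threshold.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def predict_with_mask(image_feat, comp_embeds, feasibility, threshold=0.0):
    """Return the most similar composition among those deemed feasible.

    comp_embeds: dict mapping (state, object) -> embedding (hypothetical).
    feasibility: dict mapping (state, object) -> feasibility score (hypothetical).
    Compositions with a score below `threshold` are removed from the output space.
    """
    best_label, best_score = None, -math.inf
    for label, embed in comp_embeds.items():
        if feasibility[label] < threshold:
            continue  # masked out: treated as an infeasible composition
        score = cosine(image_feat, embed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data (invented for illustration only).
comp_embeds = {
    ("wet", "dog"): [1.0, 0.0],
    ("sliced", "dog"): [0.9, 0.1],
    ("sliced", "apple"): [0.0, 1.0],
}
feasibility = {
    ("wet", "dog"): 0.8,
    ("sliced", "dog"): -0.5,
    ("sliced", "apple"): 0.9,
}
image_feat = [0.95, 0.12]

masked = predict_with_mask(image_feat, comp_embeds, feasibility)
unmasked = predict_with_mask(image_feat, comp_embeds, feasibility, threshold=-1.0)
```

In this toy setup the unmasked prediction is the implausible ("sliced", "dog"), while masking restores ("wet", "dog"). The margin variant described in the snippet applies during training and is not shown here.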

Towards Recognizing New Semantic Concepts in New Visual Domains

1 code implementation 16 Dec 2020 Massimiliano Mancini

In the first part of the thesis, we describe different solutions to enable deep models to generalize to new visual domains, by transferring knowledge from a labeled source domain(s) to a domain (target) where no labeled data are available.

Domain Generalization Multi-Task Learning +1

Prototype-based Incremental Few-Shot Semantic Segmentation

1 code implementation 30 Nov 2020 Fabio Cermelli, Massimiliano Mancini, Yongqin Xian, Zeynep Akata, Barbara Caputo

Semantic segmentation models have two fundamental weaknesses: i) they require large training sets with costly pixel-level annotations, and ii) they have a static output space, constrained to the classes of the training set.

Few-Shot Semantic Segmentation Incremental Learning +3

Shape Consistent 2D Keypoint Estimation under Domain Shift

no code implementations 4 Aug 2020 Levi O. Vasconcelos, Massimiliano Mancini, Davide Boscaini, Samuel Rota Bulò, Barbara Caputo, Elisa Ricci

Recent unsupervised domain adaptation methods based on deep architectures have shown remarkable performance not only in traditional classification tasks but also in more complex problems involving structured predictions (e.g., semantic segmentation, depth estimation).

Depth Estimation Keypoint Estimation +2

Towards Recognizing Unseen Categories in Unseen Domains

1 code implementation ECCV 2020 Massimiliano Mancini, Zeynep Akata, Elisa Ricci, Barbara Caputo

The key idea of CuMix is to simulate the test-time domain and semantic shift using images and features from unseen domains and categories generated by mixing up the multiple source domains and categories available during training.

Domain Generalization Zero-Shot Learning +1
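
The mixing operation this snippet refers to can be illustrated with a generic mixup interpolation. The sketch below is only that: it assumes plain convex interpolation of two inputs and their one-hot labels with a caller-supplied coefficient `lam`, and it does not reproduce how CuMix samples the coefficient, chooses source domains, or mixes at the feature level. All names are hypothetical.

```python
def mixup(x_a, y_a, x_b, y_b, lam):
    """Convexly combine two samples and their label vectors.

    x_a, x_b: feature (or flattened image) vectors of equal length.
    y_a, y_b: one-hot (or soft) label vectors of equal length.
    lam: mixing coefficient in [0, 1]; lam=1 returns sample a unchanged.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mix, y_mix


# Two toy samples from different (hypothetical) source domains and classes.
x_mix, y_mix = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], lam=0.25)
```

The mixed label keeps the same proportions as the mixed input, which is what lets a mixed sample stand in for a category or domain not seen in its pure form.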

Boosting Deep Open World Recognition by Clustering

no code implementations 20 Apr 2020 Dario Fontanel, Fabio Cermelli, Massimiliano Mancini, Samuel Rota Bulò, Elisa Ricci, Barbara Caputo

While convolutional neural networks have brought significant advances in robot vision, their ability is often limited to closed world scenarios, where the number of semantic concepts to be recognized is determined by the available training set.

Clustering Incremental Learning +1

Modeling the Background for Incremental Learning in Semantic Segmentation

1 code implementation CVPR 2020 Fabio Cermelli, Massimiliano Mancini, Samuel Rota Bulò, Elisa Ricci, Barbara Caputo

Current strategies fail on this task because they do not consider a peculiar aspect of semantic segmentation: since each training step provides annotation only for a subset of all possible classes, pixels of the background class (i.e., pixels that do not belong to any other classes) exhibit a semantic distribution shift.

Continual Learning Disjoint 10-1 +9

Knowledge is Never Enough: Towards Web Aided Deep Open World Recognition

no code implementations 4 Jun 2019 Massimiliano Mancini, Hakan Karaoguz, Elisa Ricci, Patric Jensfelt, Barbara Caputo

While today's robots are able to perform sophisticated tasks, they can only act on objects they have been trained to recognize.

Open Set Learning

The RGB-D Triathlon: Towards Agile Visual Toolboxes for Robots

1 code implementation 1 Apr 2019 Fabio Cermelli, Massimiliano Mancini, Elisa Ricci, Barbara Caputo

Deep networks have brought significant advances in robot perception, improving the capabilities of robots in several visual tasks, ranging from object detection and recognition to pose estimation, semantic scene segmentation and many others.

Object Detection +2

AdaGraph: Unifying Predictive and Continuous Domain Adaptation through Graphs

1 code implementation CVPR 2019 Massimiliano Mancini, Samuel Rota Bulò, Barbara Caputo, Elisa Ricci

The ability to categorize is a cornerstone of visual intelligence, and a key functionality for artificial, autonomous visual machines.

Domain Adaptation

Kitting in the Wild through Online Domain Adaptation

no code implementations 3 Jul 2018 Massimiliano Mancini, Hakan Karaoguz, Elisa Ricci, Patric Jensfelt, Barbara Caputo

This novel dataset allows for testing the robustness of robot visual recognition algorithms to a series of different domain shifts both in isolation and unified.

Object Recognition Online Domain Adaptation

Best sources forward: domain generalization through source-specific nets

no code implementations 15 Jun 2018 Massimiliano Mancini, Samuel Rota Bulò, Barbara Caputo, Elisa Ricci

A long standing problem in visual object categorization is the ability of algorithms to generalize across different testing conditions.

Domain Generalization Object Categorization

Robust Place Categorization with Deep Domain Generalization

1 code implementation 30 May 2018 Massimiliano Mancini, Samuel Rota Bulò, Barbara Caputo, Elisa Ricci

Our method develops from the intuition that, given a set of different classification models associated with known domains (e.g., corresponding to multiple environments, robots), the best model for a new sample in the novel domain can be computed directly at test time by optimally combining the known models.

Domain Generalization General Classification
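
The idea of combining known domain models at test time can be sketched as a weighted average of their predictions. The weighting rule below (a softmax over negative squared distances to per-domain feature centroids) is an assumption chosen for illustration, not the scheme from the paper, and every name and value in the example is hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def domain_weights(sample, centroids):
    """Weight each known domain by how close the sample is to its centroid."""
    neg_sq_dists = [
        -sum((s - c) ** 2 for s, c in zip(sample, centroid))
        for centroid in centroids
    ]
    return softmax(neg_sq_dists)


def combine(sample, centroids, models):
    """Average per-domain class probabilities using the domain weights."""
    weights = domain_weights(sample, centroids)
    preds = [model(sample) for model in models]
    n_classes = len(preds[0])
    return [
        sum(w * p[k] for w, p in zip(weights, preds))
        for k in range(n_classes)
    ]


# Two toy domains, each with a fixed-output stand-in for its classifier.
centroids = [[0.0, 0.0], [10.0, 10.0]]
models = [lambda x: [0.9, 0.1], lambda x: [0.2, 0.8]]

near_first = combine([0.5, 0.5], centroids, models)
near_second = combine([9.5, 9.5], centroids, models)
```

A sample close to the first centroid inherits mostly the first model's prediction, and likewise for the second, so no single model has to be selected in advance for the novel domain.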

Boosting Domain Adaptation by Discovering Latent Domains

2 code implementations CVPR 2018 Massimiliano Mancini, Lorenzo Porzi, Samuel Rota Bulò, Barbara Caputo, Elisa Ricci

Our approach is based on the introduction of two main components, which can be embedded into any existing CNN architecture: (i) a side branch that automatically computes the assignment of a source sample to a latent domain and (ii) novel layers that exploit domain membership information to appropriately align the distribution of the CNN internal feature representations to a reference distribution.

Domain Adaptation

Inspiring Computer Vision System Solutions

no code implementations 22 Jul 2017 Julian Zilly, Amit Boyarski, Micael Carvalho, Amir Atapour Abarghouei, Konstantinos Amplianitis, Aleksandr Krasnov, Massimiliano Mancini, Hernán Gonzalez, Riccardo Spezialetti, Carlos Sampedro Pérez, Hao Li

Reviewing this project with modern eyes provides us with the opportunity to reflect on several issues, relevant now as then to the field of computer vision and research in general, that go beyond the technical aspects of the work.

Embedding Words and Senses Together via Joint Knowledge-Enhanced Training

no code implementations CONLL 2017 Massimiliano Mancini, Jose Camacho-Collados, Ignacio Iacobacci, Roberto Navigli

Word embeddings are widely used in Natural Language Processing, mainly due to their success in capturing semantic information from massive corpora.

Word Embeddings
