Search Results for author: Alberto del Bimbo

Found 78 papers, 31 papers with code

Task-conditioned Domain Adaptation for Pedestrian Detection in Thermal Imagery

1 code implementation ECCV 2020 My Kieu, Andrew D. Bagdanov, Marco Bertini, Alberto del Bimbo

Despite its broad application and interest, it remains a challenging problem in part due to the vast range of conditions under which it must be robust.

Domain Adaptation Pedestrian Detection

Interactive Garment Recommendation with User in the Loop

no code implementations 18 Feb 2024 Federico Becattini, Xiaolin Chen, Andrea Puccia, Haokun Wen, Xuemeng Song, Liqiang Nie, Alberto del Bimbo

Recommending fashion items often leverages rich user profiles and makes targeted suggestions based on past history and previous purchases.

reinforcement-learning

Neuromorphic Face Analysis: a Survey

no code implementations 18 Feb 2024 Federico Becattini, Lorenzo Berlincioni, Luca Cultrera, Alberto del Bimbo

Neuromorphic sensors, also known as event cameras, are a class of imaging devices mimicking the function of biological visual systems.

Privacy Preserving

Perceptual Quality Improvement in Videoconferencing using Keyframes-based GAN

1 code implementation 7 Nov 2023 Lorenzo Agnolucci, Leonardo Galteri, Marco Bertini, Alberto del Bimbo

Given that, in this context, the speaker is typically in front of the camera and remains the same for the entire duration of the transmission, we can maintain a set of reference keyframes of the person from the higher-quality I-frames that are transmitted within the video stream and exploit them to guide the visual quality improvement; a novel aspect of this approach is the update policy that maintains and updates a compact and effective set of reference keyframes.

Video Compression

Deepfake detection by exploiting surface anomalies: the SurFake approach

no code implementations 31 Oct 2023 Andrea Ciamarra, Roberto Caldelli, Federico Becattini, Lorenzo Seidenari, Alberto del Bimbo

In particular, when an image (video) is captured the overall geometry of the scene (e.g. surfaces) and the acquisition process (e.g. illumination) determine a univocal environment that is directly represented by the image pixel values; all these intrinsic relations are possibly changed by the deepfake generation process.

DeepFake Detection Face Swapping

Addressing Limitations of State-Aware Imitation Learning for Autonomous Driving

no code implementations 31 Oct 2023 Luca Cultrera, Federico Becattini, Lorenzo Seidenari, Pietro Pala, Alberto del Bimbo

We feed the state of the vehicle along with the representation of the environment as a special token of the transformer and propagate it throughout the network.

Autonomous Driving Data Augmentation +2

Reference-based Restoration of Digitized Analog Videotapes

2 code implementations 20 Oct 2023 Lorenzo Agnolucci, Leonardo Galteri, Marco Bertini, Alberto del Bimbo

We design a transformer-based Swin-UNet network that exploits both neighboring and reference frames via our Multi-Reference Spatial Feature Fusion (MRSFF) blocks.

Analog Video Restoration Artifact Detection

ARNIQA: Learning Distortion Manifold for Image Quality Assessment

1 code implementation 20 Oct 2023 Lorenzo Agnolucci, Leonardo Galteri, Marco Bertini, Alberto del Bimbo

In this work, we propose a self-supervised approach named ARNIQA (leArning distoRtion maNifold for Image Quality Assessment) for modeling the image distortion manifold to obtain quality representations in an intrinsic manner.

Blind Image Quality Assessment No-Reference Image Quality Assessment +1

Mapping Memes to Words for Multimodal Hateful Meme Classification

1 code implementation 12 Oct 2023 Giovanni Burbi, Alberto Baldrati, Lorenzo Agnolucci, Marco Bertini, Alberto del Bimbo

Multimodal image-text memes are prevalent on the internet, serving as a unique form of communication that combines visual and textual elements to convey humor, ideas, or emotions.

Hateful Meme Classification Language Modelling

Exploiting CLIP-based Multi-modal Approach for Artwork Classification and Retrieval

no code implementations 21 Sep 2023 Alberto Baldrati, Marco Bertini, Tiberio Uricchio, Alberto del Bimbo

Given the recent advances in multimodal image pretraining where visual models trained with semantically dense textual supervision tend to have better generalization capabilities than those trained using categorical attributes or through unsupervised techniques, in this work we investigate how the recent CLIP model can be applied to several tasks in the artwork domain.

Retrieval Zero-Shot Learning

DiffDefense: Defending against Adversarial Attacks via Diffusion Models

1 code implementation 7 Sep 2023 Hondamunige Prasanna Silva, Lorenzo Seidenari, Alberto del Bimbo

This paper presents a novel reconstruction method that leverages Diffusion Models to protect machine learning classifiers against adversarial attacks, all without requiring any modifications to the classifiers themselves.

Adversarial Defense

3D Pose Nowcasting: Forecast the Future to Improve the Present

no code implementations 24 Aug 2023 Alessandro Simoni, Francesco Marchetti, Guido Borghi, Federico Becattini, Lorenzo Seidenari, Roberto Vezzani, Alberto del Bimbo

Technologies to enable safe and effective collaboration and coexistence between humans and robots have gained significant importance in the last few years.

Pose Estimation

Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features

1 code implementation 22 Aug 2023 Alberto Baldrati, Marco Bertini, Tiberio Uricchio, Alberto del Bimbo

Given a query composed of a reference image and a relative caption, the Composed Image Retrieval goal is to retrieve images visually similar to the reference one that integrates the modifications expressed by the caption.

Contrastive Learning Image Retrieval +1

Diffusion Based Augmentation for Captioning and Retrieval in Cultural Heritage

1 code implementation 14 Aug 2023 Dario Cioni, Lorenzo Berlincioni, Federico Becattini, Alberto del Bimbo

Cultural heritage applications and advanced machine learning models are creating a fruitful synergy to provide effective and accessible ways of interacting with artworks.

Image Captioning Retrieval

ECO: Ensembling Context Optimization for Vision-Language Models

no code implementations 26 Jul 2023 Lorenzo Agnolucci, Alberto Baldrati, Francesco Todino, Federico Becattini, Marco Bertini, Alberto del Bimbo

Among these, the CLIP model has shown remarkable capabilities for zero-shot transfer by matching an image and a custom textual prompt in its latent space.

Classification Image Classification

4DSR-GCN: 4D Video Point Cloud Upsampling using Graph Convolutional Networks

no code implementations 1 Jun 2023 Lorenzo Berlincioni, Stefano Berretti, Marco Bertini, Alberto del Bimbo

Time varying sequences of 3D point clouds, or 4D point clouds, are now being acquired at an increasing pace in several applications (e.g., LiDAR in autonomous or assisted driving).

Edge-computing Graph Attention +1

Transformer-based Graph Neural Networks for Outfit Generation

no code implementations 17 Apr 2023 Federico Becattini, Federico Maria Teotini, Alberto del Bimbo

We attempt to bridge the gap between outfit recommendation and generation by leveraging a graph-based representation of items in a collection.

Neuromorphic Event-based Facial Expression Recognition

1 code implementation 13 Apr 2023 Lorenzo Berlincioni, Luca Cultrera, Chiara Albisani, Lisa Cresti, Andrea Leonardo, Sara Picchioni, Federico Becattini, Alberto del Bimbo

Recently, event cameras have shown large applicability in several computer vision fields especially concerning tasks that require high temporal resolution.

Emotion Recognition Facial Expression Recognition +1

Parents and Children: Distinguishing Multimodal DeepFakes from Natural Images

1 code implementation 2 Apr 2023 Roberto Amoroso, Davide Morelli, Marcella Cornia, Lorenzo Baraldi, Alberto del Bimbo, Rita Cucchiara

Recent advancements in diffusion models have enabled the generation of realistic deepfakes by writing textual prompts in natural language.

Fake Image Detection

Zero-Shot Composed Image Retrieval with Textual Inversion

3 code implementations ICCV 2023 Alberto Baldrati, Lorenzo Agnolucci, Marco Bertini, Alberto del Bimbo

Composed Image Retrieval (CIR) aims to retrieve a target image based on a query composed of a reference image and a relative caption that describes the difference between the two images.

Retrieval Zero-Shot Composed Image Retrieval (ZS-CIR)

Maximally Compact and Separated Features with Regular Polytope Networks

1 code implementation 15 Jan 2023 Federico Pernici, Matteo Bruni, Claudio Baecchi, Alberto del Bimbo

Convolutional Neural Networks (CNNs) trained with the Softmax loss are widely used classification models for several vision tasks.

CL2R: Compatible Lifelong Learning Representations

1 code implementation 16 Nov 2022 Niccolo Biondi, Federico Pernici, Matteo Bruni, Daniele Mugnai, Alberto del Bimbo

We identify stationarity as the property that the feature representation is required to hold to achieve compatibility and propose a novel training procedure that encourages local and global stationarity on the learned representation.

Representation Learning

Forecasting Future Instance Segmentation with Learned Optical Flow and Warping

no code implementations 15 Nov 2022 Andrea Ciamarra, Federico Becattini, Lorenzo Seidenari, Alberto del Bimbo

For an autonomous vehicle it is essential to observe the ongoing dynamics of a scene and consequently predict imminent future scenarios to ensure safety to itself and others.

Instance Segmentation Optical Flow Estimation +1

Learning advisor networks for noisy image classification

1 code implementation ICIAP 2022 Simone Ricci, Tiberio Uricchio, Alberto del Bimbo

In this paper, we introduced the novel concept of advisor network to address the problem of noisy labels in image classification.

Ranked #6 on Image Classification on Clothing1M (using extra training data)

Classification Learning with noisy labels +1

Automatic Estimation of Self-Reported Pain by Trajectory Analysis in the Manifold of Fixed Rank Positive Semi-Definite Matrices

no code implementations 5 Sep 2022 Benjamin Szczapa, Mohamed Daoudi, Stefano Berretti, Pietro Pala, Alberto del Bimbo, Zakia Hammal

We compared our method to the state-of-the-art on both datasets using different testing protocols, showing the competitiveness of the proposed approach.

Fashion Recommendation Based on Style and Social Events

1 code implementation 1 Aug 2022 Federico Becattini, Lavinia De Divitiis, Claudio Baecchi, Alberto del Bimbo

Overall, we integrate in a state of the art garment recommendation framework a style classifier and an event classifier in order to condition recommendation on a given query.

Generating Multiple 4D Expression Transitions by Learning Face Landmark Trajectories

no code implementations 29 Jul 2022 Naima Otberdout, Claudio Ferrari, Mohamed Daoudi, Stefano Berretti, Alberto del Bimbo

We thus propose a new model that generates transitions between different expressions, and synthesizes long and composed 4D expressions.

Is GPT-3 all you need for Visual Question Answering in Cultural Heritage?

no code implementations 25 Jul 2022 Pietro Bongini, Federico Becattini, Alberto del Bimbo

The use of Deep Learning and Computer Vision in the Cultural Heritage domain is becoming highly relevant in the last few years with lots of applications about audio smart guides, interactive museums and augmented reality.

Question Answering Visual Question Answering

Online Deep Clustering with Video Track Consistency

no code implementations 7 Jun 2022 Alessandra Alfani, Federico Becattini, Lorenzo Seidenari, Alberto del Bimbo

Several unsupervised and self-supervised approaches have been developed in recent years to learn visual features from large-scale unlabeled datasets.

Clustering Deep Clustering +1

Contrastive Supervised Distillation for Continual Representation Learning

1 code implementation 11 May 2022 Tommaso Barletti, Niccolo' Biondi, Federico Pernici, Matteo Bruni, Alberto del Bimbo

In this paper, we propose a novel training procedure for the continual representation learning problem in which a neural network model is sequentially learned to alleviate catastrophic forgetting in visual search tasks.

Representation Learning Retrieval

On Modality Bias Recognition and Reduction

1 code implementation 25 Feb 2022 Yangyang Guo, Liqiang Nie, Harry Cheng, Zhiyong Cheng, Mohan Kankanhalli, Alberto del Bimbo

From the results on four datasets regarding the above three tasks, our method yields remarkable performance improvements compared with the baselines, demonstrating its superiority on reducing the modality bias problem.

Action Recognition Multi-modal Classification +3

CoReS: Compatible Representations via Stationarity

1 code implementation 15 Nov 2021 Niccolo Biondi, Federico Pernici, Matteo Bruni, Alberto del Bimbo

Compatible features enable the direct comparison of old and new learned features allowing to use them interchangeably over time.

Face Recognition

Fine-Grained Adversarial Semi-supervised Learning

no code implementations 12 Oct 2021 Daniele Mugnai, Federico Pernici, Francesco Turchini, Alberto del Bimbo

Our approach leverages unlabeled data with an adversarial optimization strategy in which the internal features representation is obtained with a second-order pooling model.

Fine-Grained Visual Categorization

Partially fake it till you make it: mixing real and fake thermal images for improved object detection

no code implementations 25 Jun 2021 Francesco Bongini, Lorenzo Berlincioni, Marco Bertini, Alberto del Bimbo

In this paper we propose a novel data augmentation approach for visual content domains that have scarce training datasets, compositing synthetic 3D objects within real scenes.

Data Augmentation object-detection +1

Sparse to Dense Dynamic 3D Facial Expression Generation

no code implementations CVPR 2022 Naima Otberdout, Claudio Ferrari, Mohamed Daoudi, Stefano Berretti, Alberto del Bimbo

This allows us to learn how the motion of a sparse set of landmarks influences the deformation of the overall face surface, independently from the identity.

3D Face Animation Facial expression generation

Learning Group Activities from Skeletons without Individual Action Labels

1 code implementation 14 May 2021 Fabio Zappardino, Tiberio Uricchio, Lorenzo Seidenari, Alberto del Bimbo

To understand human behavior we must not just recognize individual actions but model possibly complex group activity and interactions.

Group Activity Recognition

AdaVQA: Overcoming Language Priors with Adapted Margin Cosine Loss

1 code implementation 5 May 2021 Yangyang Guo, Liqiang Nie, Zhiyong Cheng, Feng Ji, Ji Zhang, Alberto del Bimbo

Experimental results demonstrate that our adapted margin cosine loss can greatly enhance the baseline models with an absolute performance gain of 15% on average, strongly verifying the potential of tackling the language prior problem in VQA from the angle of the answer feature space learning.

Question Answering Visual Question Answering

Regular Polytope Networks

1 code implementation IEEE Transactions on Neural Networks and Learning Systems 2021 Federico Pernici, Matteo Bruni, Claudio Baecchi, Alberto del Bimbo

Typically, a learnable transformation (i.e. the classifier) is placed at the end of such models returning a value for each class used for classification.

Robust pedestrian detection in thermal imagery using synthesized images

no code implementations 3 Feb 2021 My Kieu, Lorenzo Berlincioni, Leonardo Galteri, Marco Bertini, Andrew D. Bagdanov, Alberto del Bimbo

Experimental results demonstrate the effectiveness of our approach: using less than 50% of available real thermal training data, and relying on synthesized data generated by our model in the domain adaptation phase, our detector achieves state-of-the-art results on the KAIST Multispectral Pedestrian Detection Benchmark; even if more real thermal data is available adding GAN generated images to the training data results in improved performance, thus showing that these images act as an effective form of data augmentation.

Data Augmentation Domain Adaptation +2

Garment Recommendation with Memory Augmented Neural Networks

no code implementations 11 Dec 2020 Lavinia De Divitiis, Federico Becattini, Claudio Baecchi, Alberto del Bimbo

In particular, we aim at retrieving a variety of modalities in which a certain garment can be combined.

Recommendation Systems

Temporal Binary Representation for Event-Based Action Recognition

1 code implementation 18 Oct 2020 Simone Undri Innocenti, Federico Becattini, Federico Pernici, Alberto del Bimbo

In this paper we present an event aggregation strategy to convert the output of an event camera into frames processable by traditional Computer Vision algorithms.

Ranked #3 on Gesture Recognition on DVS128 Gesture (using extra training data)

Action Recognition Gesture Recognition

Class-incremental Learning with Pre-allocated Fixed Classifiers

1 code implementation 16 Oct 2020 Federico Pernici, Matteo Bruni, Claudio Baecchi, Francesco Turchini, Alberto del Bimbo

Contrarily to the standard expanding classifier, this allows: (a) the output nodes of future unseen classes to firstly see negative samples since the beginning of learning together with the positive samples that incrementally arrive; (b) to learn features that do not change their geometric configuration as novel classes are incorporated in the learning model.

Class Incremental Learning Incremental Learning

Inner Eye Canthus Localization for Human Body Temperature Screening

no code implementations 27 Aug 2020 Claudio Ferrari, Lorenzo Berlincioni, Marco Bertini, Alberto del Bimbo

As additional contribution, we enrich the original dataset by using the annotated landmarks to deform and project the 3DMM onto the images.

Face Model

Modelling the Statistics of Cyclic Activities by Trajectory Analysis on the Manifold of Positive-Semi-Definite Matrices

no code implementations 24 Jun 2020 Ettore Maria Celozzi, Luca Ciabini, Luca Cultrera, Pietro Pala, Stefano Berretti, Mohamed Daoudi, Alberto del Bimbo

In this paper, a model is presented to extract statistical summaries to characterize the repetition of a cyclic body action, for instance a gym exercise, for the purpose of checking the compliance of the observed action to a template one and highlighting the parts of the action that are not correctly executed (if any).

Explaining Autonomous Driving by Learning End-to-End Visual Attention

no code implementations 5 Jun 2020 Luca Cultrera, Lorenzo Seidenari, Federico Becattini, Pietro Pala, Alberto del Bimbo

Current deep learning based autonomous driving approaches yield impressive results also leading to in-production deployment in certain controlled scenarios.

Autonomous Driving Imitation Learning

Image Retrieval using Multi-scale CNN Features Pooling

no code implementations 21 Apr 2020 Federico Vaccaro, Marco Bertini, Tiberio Uricchio, Alberto del Bimbo

In this paper, we address the problem of image retrieval by learning images representation based on the activations of a Convolutional Neural Network.

Image Retrieval Retrieval

Visual Question Answering for Cultural Heritage

no code implementations 22 Mar 2020 Pietro Bongini, Federico Becattini, Andrew D. Bagdanov, Alberto del Bimbo

This will turn the classic audio guide into a smart personal instructor with which the visitor can interact by asking for explanations focused on specific interests.

Question Answering Visual Question Answering

Text-to-Image Synthesis Based on Machine Generated Captions

no code implementations 9 Oct 2019 Marco Menardi, Alex Falcon, Saida S. Mohamed, Lorenzo Seidenari, Giuseppe Serra, Alberto del Bimbo, Carlo Tasso

To address this issue, in this paper we propose an approach capable of generating images starting from a given text using conditional GANs trained on uncaptioned images dataset.

Image Captioning Image Generation

Fix Your Features: Stationary and Maximally Discriminative Embeddings using Regular Polytope (Fixed Classifier) Networks

no code implementations 27 Feb 2019 Federico Pernici, Matteo Bruni, Claudio Baecchi, Alberto del Bimbo

Typically, a learnable transformation (i.e. the classifier) is placed at the end of such models returning a value for each class used for classification.

General Classification

Additional Baseline Metrics for the paper "Extended YouTube Faces: a Dataset for Heterogeneous Open-Set Face Identification"

no code implementations 11 Feb 2019 Claudio Ferrari, Stefano Berretti, Alberto del Bimbo

In this report, we provide additional and corrected results for the paper "Extended YouTube Faces: a Dataset for Heterogeneous Open-Set Face Identification".

Face Identification

Semantic Road Layout Understanding by Generative Adversarial Inpainting

no code implementations 29 May 2018 Lorenzo Berlincioni, Federico Becattini, Leonardo Galteri, Lorenzo Seidenari, Alberto del Bimbo

Autonomous driving is becoming a reality, yet vehicles still need to rely on complex sensor fusion to understand the scene they act in.

Autonomous Driving Segmentation +2

Memory Based Online Learning of Deep Representations from Video Streams

no code implementations CVPR 2018 Federico Pernici, Federico Bartoli, Matteo Bruni, Alberto del Bimbo

It is shown that the proposed learning procedure is asymptotically stable and can be effectively used in relevant applications like multiple face identification and tracking from unconstrained video streams.

Face Identification

Context-Aware Trajectory Prediction

no code implementations 6 May 2017 Federico Bartoli, Giuseppe Lisanti, Lamberto Ballan, Alberto del Bimbo

To this end, we propose a "context-aware" recurrent neural network LSTM model, which can learn and predict human motion in crowded spaces such as a sidewalk, a museum or a shopping mall.

Navigate Trajectory Prediction

Deep Generative Adversarial Compression Artifact Removal

no code implementations ICCV 2017 Leonardo Galteri, Lorenzo Seidenari, Marco Bertini, Alberto del Bimbo

Moreover we show that our approach can be used as a pre-processing step for object detection in case images are degraded by compression to a point that state-of-the art detectors fail.

object-detection Object Detection +1

Segmentation Free Object Discovery in Video

no code implementations 1 Sep 2016 Giovanni Cuffaro, Federico Becattini, Claudio Baecchi, Lorenzo Seidenari, Alberto del Bimbo

In this paper we present a simple yet effective approach to extend without supervision any object proposal from static images to videos.

Object Object Discovery +1

Automatic Image Annotation via Label Transfer in the Semantic Space

no code implementations 16 May 2016 Tiberio Uricchio, Lamberto Ballan, Lorenzo Seidenari, Alberto del Bimbo

Automatic image annotation is among the fundamental problems in computer vision and pattern recognition, and it is becoming increasingly important in order to develop algorithms that are able to search and browse large-scale image collections.

Denoising

Compact Hash Codes for Efficient Visual Descriptors Retrieval in Large Scale Databases

no code implementations 10 May 2016 Simone Ercoli, Marco Bertini, Alberto del Bimbo

In this paper we present an efficient method for visual descriptors retrieval based on compact hash codes computed using a multiple k-means assignment.

Retrieval

A Multi-Camera Image Processing and Visualization System for Train Safety Assessment

no code implementations 28 Jul 2015 Giuseppe Lisanti, Svebor Karaman, Daniele Pezzatini, Alberto del Bimbo

In this paper we present a machine vision system to efficiently monitor, analyze and present visual data acquired with a railway overhead gantry equipped with multiple cameras.

Representing 3D Texture on Mesh Manifolds for Retrieval and Recognition Applications

no code implementations CVPR 2015 Naoufel Werghi, Claudio Tortorici, Stefano Berretti, Alberto del Bimbo

In this paper, we present and experiment a novel approach for representing texture of 3D mesh manifolds using local binary patterns (LBP).

Face Recognition Retrieval

Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval

1 code implementation 28 Mar 2015 Xirong Li, Tiberio Uricchio, Lamberto Ballan, Marco Bertini, Cees G. M. Snoek, Alberto del Bimbo

Where previous reviews on content-based image retrieval emphasize on what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image.

Content-Based Image Retrieval Retrieval +1

A Data-Driven Approach for Tag Refinement and Localization in Web Videos

no code implementations 2 Jul 2014 Lamberto Ballan, Marco Bertini, Giuseppe Serra, Alberto del Bimbo

Our approach exploits collective knowledge embedded in user-generated tags and web sources, and visual similarity of keyframes and images uploaded to social sites like YouTube and Flickr, as well as web sources like Google and Bing.

TAG
