2 code implementations • 21 Mar 2024 • Alberto Baldrati, Davide Morelli, Marcella Cornia, Marco Bertini, Rita Cucchiara
Fashion illustration is a crucial medium for designers to convey their creative vision and transform design concepts into tangible representations that showcase the interplay between clothing and the human body.
1 code implementation • 12 Oct 2023 • Giovanni Burbi, Alberto Baldrati, Lorenzo Agnolucci, Marco Bertini, Alberto del Bimbo
Multimodal image-text memes are prevalent on the internet, serving as a unique form of communication that combines visual and textual elements to convey humor, ideas, or emotions.
Ranked #1 on Hateful Meme Classification on HarMeme
no code implementations • 21 Sep 2023 • Alberto Baldrati, Marco Bertini, Tiberio Uricchio, Alberto del Bimbo
Given recent advances in multimodal image pretraining, where visual models trained with semantically dense textual supervision tend to generalize better than those trained using categorical attributes or through unsupervised techniques, in this work we investigate how the recent CLIP model can be applied to several tasks in the artwork domain.
1 code implementation • 11 Sep 2023 • Giuseppe Cartella, Alberto Baldrati, Davide Morelli, Marcella Cornia, Marco Bertini, Rita Cucchiara
The inexorable growth of online shopping and e-commerce demands scalable and robust machine learning-based solutions to accommodate customer requirements.
1 code implementation • 22 Aug 2023 • Alberto Baldrati, Marco Bertini, Tiberio Uricchio, Alberto del Bimbo
Given a query composed of a reference image and a relative caption, the goal of Composed Image Retrieval is to retrieve images that are visually similar to the reference one and that integrate the modifications expressed by the caption.
Ranked #6 on Image Retrieval on CIRR
no code implementations • 26 Jul 2023 • Lorenzo Agnolucci, Alberto Baldrati, Francesco Todino, Federico Becattini, Marco Bertini, Alberto del Bimbo
Among these, the CLIP model has shown remarkable capabilities for zero-shot transfer by matching an image and a custom textual prompt in its latent space.
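The zero-shot matching described in this snippet can be illustrated with a minimal sketch. It assumes the image and each textual prompt have already been embedded into a shared latent space; the three-dimensional vectors, the prompt strings, and the helper names (`l2_normalize`, `cosine`, `zero_shot_label`) are hypothetical stand-ins for illustration, not outputs or functions of an actual CLIP model.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def zero_shot_label(image_emb, prompt_embs):
    """Return the prompt whose embedding is closest to the image embedding."""
    scores = {prompt: cosine(image_emb, emb) for prompt, emb in prompt_embs.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Toy embeddings (assumed, not produced by a real encoder).
image_emb = [0.9, 0.1, 0.2]
prompt_embs = {
    "a photo of a dress": [1.0, 0.0, 0.1],
    "a photo of a shoe": [0.0, 1.0, 0.0],
}

best_prompt, scores = zero_shot_label(image_emb, prompt_embs)
print(best_prompt)
```

In a real setting the embeddings would come from pretrained image and text encoders; only the similarity-based selection step is shown here.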
1 code implementation • 22 May 2023 • Davide Morelli, Alberto Baldrati, Giuseppe Cartella, Marcella Cornia, Marco Bertini, Rita Cucchiara
In this context, image-based virtual try-on, which consists in generating a novel image of a target model wearing a given in-shop garment, has yet to capitalize on the potential of these powerful generative solutions.
1 code implementation • ICCV 2023 • Alberto Baldrati, Davide Morelli, Giuseppe Cartella, Marcella Cornia, Marco Bertini, Rita Cucchiara
Given the lack of existing datasets suitable for the task, we also extend two existing fashion datasets, namely Dress Code and VITON-HD, with multimodal annotations collected in a semi-automatic manner.
3 code implementations • ICCV 2023 • Alberto Baldrati, Lorenzo Agnolucci, Marco Bertini, Alberto del Bimbo
Composed Image Retrieval (CIR) aims to retrieve a target image based on a query composed of a reference image and a relative caption that describes the difference between the two images.
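As a rough illustration of the task definition above, the sketch below builds a composed query from a reference-image embedding and a relative-caption embedding, then ranks a gallery by cosine similarity. The element-wise sum used to combine the two embeddings is an assumption chosen for simplicity, not the method of any paper listed here, and the toy vectors, gallery names, and helper functions (`combine`, `retrieve`) are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def combine(ref_image_emb, caption_emb):
    """Assumed combiner: element-wise sum of the normalized embeddings."""
    summed = [a + b for a, b in zip(normalize(ref_image_emb), normalize(caption_emb))]
    return normalize(summed)


def retrieve(query_emb, gallery, k=1):
    """Return the names of the k gallery items most similar to the query."""
    ranked = sorted(
        gallery.items(),
        key=lambda item: dot(query_emb, normalize(item[1])),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Toy embeddings (assumed): axis 0 ~ reference content, axis 1 ~ requested change.
reference_image = [1.0, 0.0, 0.0]
relative_caption = [0.0, 1.0, 0.0]
gallery = {
    "same_item": [1.0, 0.0, 0.0],
    "modified_item": [0.7, 0.7, 0.0],
    "unrelated_item": [0.0, 0.0, 1.0],
}

query = combine(reference_image, relative_caption)
ranking = retrieve(query, gallery, k=3)
print(ranking)
```

The gallery item that shares both the reference content and the requested change scores highest, which is the behavior the task definition asks for; learned combiners would replace the plain sum in practice.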
2 code implementations • CVPRW 2022 • Alberto Baldrati, Marco Bertini, Tiberio Uricchio, Alberto del Bimbo
The proposed method is based on an initial training stage where a simple combination of visual and textual features is used to fine-tune the CLIP text encoder.
Ranked #3 on Image Retrieval on LaSCo
Composed Image Retrieval (CoIR), Content-Based Image Retrieval
2 code implementations • CVPR 2022 • Alberto Baldrati, Marco Bertini, Tiberio Uricchio, Alberto del Bimbo
the visual content of the query image.
Ranked #9 on Image Retrieval on CIRR