no code implementations • ECCV 2020 • Yanbei Chen, Loris Bazzani
Interactive image retrieval is an emerging research topic with the objective of integrating inputs from multiple modalities as query for retrieval, e.g., textual feedback from users to guide, modify or refine image retrieval.
1 code implementation • 29 Feb 2024 • Xianghui Yang, Yan Zuo, Sameera Ramasinghe, Loris Bazzani, Gil Avraham, Anton Van Den Hengel
Novel-view synthesis through diffusion models has demonstrated remarkable potential for generating diverse and high-quality images.
no code implementations • 10 May 2023 • Rumeysa Bodur, Erhan Gundogdu, Binod Bhattarai, Tae-Kyun Kim, Michael Donoser, Loris Bazzani
We propose a novel learning method for text-guided image editing, namely iEdit, that generates images conditioned on a source image and a textual edit prompt.
no code implementations • 26 Apr 2022 • Mengmeng Xu, Erhan Gundogdu, Maksim Lapin, Bernard Ghanem, Michael Donoser, Loris Bazzani
Long-form video understanding requires designing approaches that are able to temporally localize activities or language.
Contrastive Learning • Few Shot Temporal Action Localization
1 code implementation • CVPR 2021 • Amaia Salvador, Erhan Gundogdu, Loris Bazzani, Michael Donoser
Cross-modal recipe retrieval has recently gained substantial attention due to the importance of food in people's lives, as well as the availability of vast amounts of digital cooking recipes and food images to train machine learning models.
Ranked #6 on Cross-Modal Retrieval on Recipe1M
1 code implementation • ICCV 2021 • Yuxin Hou, Eleonora Vig, Michael Donoser, Loris Bazzani
Interactive retrieval for online fashion shopping provides the ability to change image retrieval results according to user feedback.
1 code implementation • 9 Oct 2018 • Loris Bazzani, Tobias Domhan, Felix Hieber
Image captioning is an interdisciplinary research problem that stands between computer vision and natural language processing.
no code implementations • 29 Sep 2016 • Hà Quang Minh, Marco San Biagio, Loris Bazzani, Vittorio Murino
This paper presents a novel framework for visual object recognition using infinite-dimensional covariance operators of input features in the paradigm of kernel methods on infinite-dimensional Riemannian manifolds.
no code implementations • CVPR 2016 • Ha Quang Minh, Marco San Biagio, Loris Bazzani, Vittorio Murino
This paper presents a novel framework for visual object recognition using infinite-dimensional covariance operators of input features, in the paradigm of kernel methods on infinite-dimensional Riemannian manifolds.
no code implementations • 27 Mar 2016 • Loris Bazzani, Hugo Larochelle, Lorenzo Torresani
In this work, we propose a spatiotemporal attentional model that learns where to look in a video directly from human fixation data.
no code implementations • 13 Sep 2014 • Loris Bazzani, Alessandro Bergamo, Dragomir Anguelov, Lorenzo Torresani
This paper introduces self-taught object localization, a novel approach that leverages deep convolutional networks trained for whole-image recognition to localize objects in images without additional human supervision, i.e., without using any ground-truth bounding boxes for training.
no code implementations • 31 Jan 2014 • Ha Quang Minh, Loris Bazzani, Vittorio Murino
This paper presents a general vector-valued reproducing kernel Hilbert spaces (RKHS) framework for the problem of learning an unknown functional dependency between a structured input space and a structured output space.