Search Results for author: Loris Bazzani

Found 12 papers, 4 papers with code

Learning Joint Visual Semantic Matching Embeddings for Language-guided Retrieval

no code implementations • ECCV 2020 • Yanbei Chen, Loris Bazzani

Interactive image retrieval is an emerging research topic with the objective of integrating inputs from multiple modalities as query for retrieval, e.g., textual feedback from users to guide, modify or refine image retrieval.

Image Retrieval Retrieval +2

ViewFusion: Towards Multi-View Consistency via Interpolated Denoising

1 code implementation • 29 Feb 2024 • Xianghui Yang, Yan Zuo, Sameera Ramasinghe, Loris Bazzani, Gil Avraham, Anton Van Den Hengel

Novel-view synthesis through diffusion models has demonstrated remarkable potential for generating diverse and high-quality images.

Denoising Image Generation +1

iEdit: Localised Text-guided Image Editing with Weak Supervision

no code implementations • 10 May 2023 • Rumeysa Bodur, Erhan Gundogdu, Binod Bhattarai, Tae-Kyun Kim, Michael Donoser, Loris Bazzani

We propose a novel learning method for text-guided image editing, namely iEdit, that generates images conditioned on a source image and a textual edit prompt.

Contrastive Learning Descriptive +1

Contrastive Language-Action Pre-training for Temporal Localization

no code implementations • 26 Apr 2022 • Mengmeng Xu, Erhan Gundogdu, Maksim Lapin, Bernard Ghanem, Michael Donoser, Loris Bazzani

Long-form video understanding requires designing approaches that are able to temporally localize activities or language.

Contrastive Learning Few Shot Temporal Action Localization +3

Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning

1 code implementation • CVPR 2021 • Amaia Salvador, Erhan Gundogdu, Loris Bazzani, Michael Donoser

Cross-modal recipe retrieval has recently gained substantial attention due to the importance of food in people's lives, as well as the availability of vast amounts of digital cooking recipes and food images to train machine learning models.

Ranked #6 on Cross-Modal Retrieval on Recipe1M

Cross-Modal Retrieval Retrieval +1

Learning Attribute-Driven Disentangled Representations for Interactive Fashion Retrieval

1 code implementation • ICCV 2021 • Yuxin Hou, Eleonora Vig, Michael Donoser, Loris Bazzani

Interactive retrieval for online fashion shopping provides the ability of changing image retrieval results according to the user feedback.

Attribute Disentanglement +2

Image Captioning as Neural Machine Translation Task in SOCKEYE

1 code implementation • 9 Oct 2018 • Loris Bazzani, Tobias Domhan, Felix Hieber

Image captioning is an interdisciplinary research problem that stands between computer vision and natural language processing.

Image Captioning Machine Translation +2

Kernel Methods on Approximate Infinite-Dimensional Covariance Operators for Image Classification

no code implementations • 29 Sep 2016 • Hà Quang Minh, Marco San Biagio, Loris Bazzani, Vittorio Murino

This paper presents a novel framework for visual object recognition using infinite-dimensional covariance operators of input features in the paradigm of kernel methods on infinite-dimensional Riemannian manifolds.

General Classification Image Classification +1

Approximate Log-Hilbert-Schmidt Distances Between Covariance Operators for Image Classification

no code implementations • CVPR 2016 • Ha Quang Minh, Marco San Biagio, Loris Bazzani, Vittorio Murino

This paper presents a novel framework for visual object recognition using infinite-dimensional covariance operators of input features, in the paradigm of kernel methods on infinite-dimensional Riemannian manifolds.

General Classification Image Classification +1

Recurrent Mixture Density Network for Spatiotemporal Visual Attention

no code implementations • 27 Mar 2016 • Loris Bazzani, Hugo Larochelle, Lorenzo Torresani

In this work, we propose a spatiotemporal attentional model that learns where to look in a video directly from human fixation data.

Action Classification Saliency Prediction

Self-taught Object Localization with Deep Networks

no code implementations • 13 Sep 2014 • Loris Bazzani, Alessandro Bergamo, Dragomir Anguelov, Lorenzo Torresani

This paper introduces self-taught object localization, a novel approach that leverages deep convolutional networks trained for whole-image recognition to localize objects in images without additional human supervision, i.e., without using any ground-truth bounding boxes for training.

Clustering Object +1

A Unifying Framework in Vector-valued Reproducing Kernel Hilbert Spaces for Manifold Regularization and Co-Regularized Multi-view Learning

no code implementations • 31 Jan 2014 • Ha Quang Minh, Loris Bazzani, Vittorio Murino

This paper presents a general vector-valued reproducing kernel Hilbert spaces (RKHS) framework for the problem of learning an unknown functional dependency between a structured input space and a structured output space.

MULTI-VIEW LEARNING Object Recognition
