Search Results for author: Alex Falcon

Found 10 papers, 5 papers with code

A Language-based solution to enable Metaverse Retrieval

1 code implementation • 22 Dec 2023 • Ali Abdari, Alex Falcon, Giuseppe Serra

Recently, the Metaverse is becoming increasingly attractive, with millions of users accessing the many available virtual worlds.

Contrastive Learning, Retrieval

UniUD Submission to the EPIC-Kitchens-100 Multi-Instance Retrieval Challenge 2023

no code implementations • 27 Jun 2023 • Alex Falcon, Giuseppe Serra

In this report, we present the technical details of our submission to the EPIC-Kitchens-100 Multi-Instance Retrieval Challenge 2023.

Multi-Instance Retrieval, Retrieval

A Feature-space Multimodal Data Augmentation Technique for Text-video Retrieval

1 code implementation • 3 Aug 2022 • Alex Falcon, Giuseppe Serra, Oswald Lanz

Data augmentation techniques were introduced to increase the performance on unseen test examples by creating new training samples with the application of semantics-preserving techniques, such as color space or geometric transformations on images.

Data Augmentation, Retrieval, +1

Relevance-based Margin for Contrastively-trained Video Retrieval Models

1 code implementation • 27 Apr 2022 • Alex Falcon, Swathikiran Sudhakaran, Giuseppe Serra, Sergio Escalera, Oswald Lanz

We show that even if we carefully tuned the fixed margin, our technique (which does not have the margin as a hyper-parameter) would still achieve better performance.

Multi-Instance Retrieval, Natural Language Queries, +2

Learning video retrieval models with relevance-aware online mining

2 code implementations • 16 Mar 2022 • Alex Falcon, Giuseppe Serra, Oswald Lanz

Due to the amount of videos and related captions uploaded every hour, deep learning-based solutions for cross-modal video retrieval are attracting more and more attention.

Multi-Instance Retrieval, Retrieval, +2

Data augmentation techniques for the Video Question Answering task

no code implementations • 22 Aug 2020 • Alex Falcon, Oswald Lanz, Giuseppe Serra

Video Question Answering (VideoQA) is a task that requires a model to analyze and understand both the visual content given by the input video and the textual part given by the question, and the interaction between them in order to produce a meaningful answer.

Data Augmentation, Question Answering, +1

Text-to-Image Synthesis Based on Machine Generated Captions

no code implementations • 9 Oct 2019 • Marco Menardi, Alex Falcon, Saida S. Mohamed, Lorenzo Seidenari, Giuseppe Serra, Alberto del Bimbo, Carlo Tasso

To address this issue, in this paper we propose an approach capable of generating images starting from a given text using conditional GANs trained on an uncaptioned image dataset.

Image Captioning, Image Generation
