Search Results for author: Michel Olvera

Found 5 papers, 1 paper with code

TACO: Training-free Sound Prompted Segmentation via Deep Audio-visual CO-factorization

no code implementations · 2 Dec 2024 · Hugo Malard, Michel Olvera, Stéphane Lathuilière, Slim Essid

Large-scale pre-trained audio and image models demonstrate an unprecedented degree of generalization, making them suitable for a wide range of applications.

An Eye for an Ear: Zero-shot Audio Description Leveraging an Image Captioner using Audiovisual Distribution Alignment

1 code implementation · 8 Oct 2024 · Hugo Malard, Michel Olvera, Stéphane Lathuilière, Slim Essid

In this work, we introduce a novel methodology for bridging the audiovisual modality gap by matching the distributions of tokens produced by an audio backbone and those of an image captioner.
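The paper's alignment is learned end to end, but the core idea of matching the statistics of audio tokens to those of image-captioner tokens can be illustrated with a closed-form moment-matching sketch (CORAL-style re-centering and re-scaling). This is a toy stand-in, not the paper's method; all names below are illustrative.

```python
import numpy as np

def align_distributions(audio_tokens, image_tokens, eps=1e-6):
    """Toy first/second-moment alignment: re-center and re-scale audio
    token embeddings so their mean and per-dimension variance match the
    image-captioner token distribution. Illustrative only; the paper's
    alignment is learned, not closed-form."""
    a_mu, i_mu = audio_tokens.mean(0), image_tokens.mean(0)
    a_sd = audio_tokens.std(0) + eps
    i_sd = image_tokens.std(0) + eps
    return (audio_tokens - a_mu) / a_sd * i_sd + i_mu

rng = np.random.default_rng(0)
audio = rng.normal(loc=2.0, scale=3.0, size=(500, 16))  # mismatched stats
image = rng.normal(loc=0.0, scale=1.0, size=(500, 16))
aligned = align_distributions(audio, image)
# aligned now shares the image tokens' mean and per-dimension spread
```

After this transform, the audio tokens occupy the same region of embedding space as the captioner's tokens, which is the precondition for feeding them to a frozen image captioner.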

Tasks: Contrastive Learning · Image Captioning · +2

A sound description: Exploring prompt templates and class descriptions to enhance zero-shot audio classification

no code implementations · 19 Sep 2024 · Michel Olvera, Paraskevas Stamatiadis, Slim Essid

First, we find that prompt formatting significantly affects performance: simply prompting the models with properly formatted class labels performs competitively with optimized prompt templates and even prompt ensembling.
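The zero-shot setup the abstract refers to can be sketched as follows: each class label is inserted into a prompt template, embedded by a text encoder, and the audio clip is assigned to the class whose text embedding is most cosine-similar to the audio embedding. The encoder and template below are toy stand-ins (a real system would use a CLAP-style audio-text model), not the paper's exact pipeline.

```python
import numpy as np

def zero_shot_classify(audio_emb, class_names, text_encoder,
                       template="This is a sound of {}."):
    """Toy zero-shot audio classification: embed each class label via a
    prompt template, then pick the class whose text embedding has the
    highest cosine similarity with the audio embedding."""
    prompts = [template.format(name) for name in class_names]
    text_embs = np.stack([text_encoder(p) for p in prompts])
    text_embs /= np.linalg.norm(text_embs, axis=1, keepdims=True)
    audio_emb = audio_emb / np.linalg.norm(audio_emb)
    scores = text_embs @ audio_emb
    return class_names[int(np.argmax(scores))], scores

def toy_encoder(text, dim=32):
    """Stand-in encoder: deterministic pseudo-embedding per string."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=dim)

classes = ["dog bark", "siren", "rain"]
audio = toy_encoder("This is a sound of rain.")  # simulate a matching clip
label, _ = zero_shot_classify(audio, classes, toy_encoder)
```

Because the whole class score depends on the embedded prompt string, any change to the template text (or to how the label is formatted inside it) shifts every similarity score, which is why formatting alone can move accuracy substantially.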

Tasks: Audio Classification · Contrastive Learning · +2

Foreground-Background Ambient Sound Scene Separation

no code implementations · 11 May 2020 · Michel Olvera, Emmanuel Vincent, Romain Serizel, Gilles Gasso

Ambient sound scenes typically comprise multiple short events occurring on top of a somewhat stationary background.
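The structure described here, short events on top of a roughly stationary background, suggests a classic baseline: estimate the background as the per-frequency median of a magnitude spectrogram over time, and treat the excess energy as foreground. This is a hedged illustrative baseline, not the paper's learned separator.

```python
import numpy as np

def split_fg_bg(mag_spec):
    """Toy foreground/background split on a magnitude spectrogram:
    the per-frequency median over time approximates the stationary
    background; whatever exceeds it is treated as foreground events."""
    bg = np.median(mag_spec, axis=1, keepdims=True)   # (freq, 1) background
    fg = np.clip(mag_spec - bg, 0.0, None)            # short events stand out
    return fg, np.broadcast_to(bg, mag_spec.shape)

# Synthetic spectrogram: steady background plus one brief event.
spec = np.full((8, 100), 1.0)   # stationary background level
spec[2, 40:45] += 5.0           # short event in one frequency bin
fg, bg = split_fg_bg(spec)
```

The median is robust to brief, sparse events, so the background estimate stays flat while the event energy survives in the foreground channel; a learned separator refines exactly this kind of decomposition.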
