Search Results for author: Xavier Favory

Found 10 papers, 7 papers with code

Evaluating Off-the-Shelf Machine Listening and Natural Language Models for Automated Audio Captioning

no code implementations • 14 Oct 2021 • Benno Weck, Xavier Favory, Konstantinos Drossos, Xavier Serra

Having attracted attention only recently, very few works on AAC study the performance of existing pre-trained audio and natural language processing resources.

Audio captioning, Word Embeddings

Enriched Music Representations with Multiple Cross-modal Contrastive Learning

1 code implementation • 1 Apr 2021 • Andres Ferraro, Xavier Favory, Konstantinos Drossos, Yuntae Kim, Dmitry Bogdanov

Modeling various aspects that make a music piece unique is a challenging task, requiring the combination of multiple sources of information.

Contrastive Learning, Genre classification

Learning Contextual Tag Embeddings for Cross-Modal Alignment of Audio and Tags

1 code implementation • 27 Oct 2020 • Xavier Favory, Konstantinos Drossos, Tuomas Virtanen, Xavier Serra

In this work we propose a method for learning audio representations using an audio autoencoder (AAE), a general word embeddings model (WEM), and a multi-head self-attention (MHA) mechanism.

cross-modal alignment Representation Learning +2

FSD50K: An Open Dataset of Human-Labeled Sound Events

8 code implementations • 1 Oct 2020 • Eduardo Fonseca, Xavier Favory, Jordi Pons, Frederic Font, Xavier Serra

Most existing datasets for sound event recognition (SER) are relatively small and/or domain-specific, with the exception of AudioSet, based on over 2M tracks from YouTube videos and encompassing over 500 sound classes.

COALA: Co-Aligned Autoencoders for Learning Semantically Enriched Audio Representations

2 code implementations • 15 Jun 2020 • Xavier Favory, Konstantinos Drossos, Tuomas Virtanen, Xavier Serra

Audio representation learning based on deep neural networks (DNNs) emerged as an alternative approach to hand-crafted features.

Representation Learning

Search Result Clustering in Collaborative Sound Collections

no code implementations • 8 Apr 2020 • Xavier Favory, Frederic Font, Xavier Serra

In our work, we propose a graph-based approach using audio features for clustering diverse sound collections obtained when querying large online databases.


Neural Percussive Synthesis Parameterised by High-Level Timbral Features

1 code implementation • 25 Nov 2019 • António Ramires, Pritish Chandna, Xavier Favory, Emilia Gómez, Xavier Serra

We present a deep neural network-based methodology for synthesising percussive sounds with control over high-level timbral characteristics of the sounds.

Vocal Bursts Intensity Prediction

Learning Sound Event Classifiers from Web Audio with Noisy Labels

2 code implementations • 4 Jan 2019 • Eduardo Fonseca, Manoj Plakal, Daniel P. W. Ellis, Frederic Font, Xavier Favory, Xavier Serra

To foster the investigation of label noise in sound event classification, we present FSDnoisy18k, a dataset containing 42.5 hours of audio across 20 sound classes, including a small amount of manually-labeled data and a larger quantity of real-world noisy data.

General Classification, Sound Event Detection

Facilitating the Manual Annotation of Sounds When Using Large Taxonomies

no code implementations • 21 Nov 2018 • Xavier Favory, Eduardo Fonseca, Frederic Font, Xavier Serra

It enables, for instance, the development of automatic tools for the annotation of large and diverse multimedia collections.

Information Retrieval, Retrieval

General-purpose Tagging of Freesound Audio with AudioSet Labels: Task Description, Dataset, and Baseline

3 code implementations • 26 Jul 2018 • Eduardo Fonseca, Manoj Plakal, Frederic Font, Daniel P. W. Ellis, Xavier Favory, Jordi Pons, Xavier Serra

The goal of the task is to build an audio tagging system that can recognize the category of an audio clip from a subset of 41 diverse categories drawn from the AudioSet Ontology.

Audio Tagging, Task 2
