no code implementations • 10 Oct 2024 • Benno Weck, Frederic Font
These insights are crucial for developing user-centred, effective text-based audio retrieval systems, enhancing our understanding of user behaviour in sound search contexts.
1 code implementation • 1 Oct 2024 • Panagiota Anastasopoulou, Jessica Torrey, Xavier Serra, Frederic Font
We compare a variety of both traditional and modern machine learning approaches to establish a baseline for the task of heterogeneous sound classification.
1 code implementation • 8 Sep 2024 • Francesco Papaleo, Xavier Lizarraga-Seijas, Frederic Font
Reverberation is a key element in spatial audio perception. Historically it was achieved with analogue devices such as plate and spring reverbs; in recent decades, digital signal processing techniques have enabled different approaches to Virtual Analogue Modelling (VAM).
8 code implementations • 1 Oct 2020 • Eduardo Fonseca, Xavier Favory, Jordi Pons, Frederic Font, Xavier Serra
Most existing datasets for sound event recognition (SER) are relatively small and/or domain-specific, with the exception of AudioSet, based on over 2M tracks from YouTube videos and encompassing over 500 sound classes.
1 code implementation • 26 Aug 2020 • Antonio Ramires, Frederic Font, Dmitry Bogdanov, Jordan B. L. Smith, Yi-Hsuan Yang, Joann Ching, Bo-Yu Chen, Yueh-Kao Wu, Hsu Wei-Han, Xavier Serra
We present the Freesound Loop Dataset (FSLD), a new large-scale dataset of music loops annotated by experts.
Audio and Speech Processing • Sound
no code implementations • 8 Apr 2020 • Xavier Favory, Frederic Font, Xavier Serra
In our work, we propose a graph-based approach using audio features for clustering diverse sound collections obtained when querying large online databases.
1 code implementation • 26 Oct 2019 • Eduardo Fonseca, Frederic Font, Xavier Serra
We show that these simple methods can be effective in mitigating the effect of label noise, providing up to 2.5% of accuracy boost when incorporated into two different CNNs, while requiring minimal intervention and computational overhead.
2 code implementations • 7 Jun 2019 • Eduardo Fonseca, Manoj Plakal, Frederic Font, Daniel P. W. Ellis, Xavier Serra
The task evaluates systems for multi-label audio tagging using a large set of noisy-labeled data, and a much smaller set of manually-labeled data, under a large vocabulary setting of 80 everyday sound classes.
2 code implementations • 4 Jan 2019 • Eduardo Fonseca, Manoj Plakal, Daniel P. W. Ellis, Frederic Font, Xavier Favory, Xavier Serra
To foster the investigation of label noise in sound event classification, we present FSDnoisy18k, a dataset containing 42.5 hours of audio across 20 sound classes, including a small amount of manually-labeled data and a larger quantity of real-world noisy data.
no code implementations • 21 Nov 2018 • Xavier Favory, Eduardo Fonseca, Frederic Font, Xavier Serra
It enables, for instance, the development of automatic tools for the annotation of large and diverse multimedia collections.
3 code implementations • 26 Jul 2018 • Eduardo Fonseca, Manoj Plakal, Frederic Font, Daniel P. W. Ellis, Xavier Favory, Jordi Pons, Xavier Serra
The goal of the task is to build an audio tagging system that can recognize the category of an audio clip from a subset of 41 diverse categories drawn from the AudioSet Ontology.