Search Results for author: Juan Pablo Bello

Found 17 papers, 8 papers with code

Bridging High-Quality Audio and Video via Language for Sound Effects Retrieval from Visual Queries

no code implementations • 17 Aug 2023 • Julia Wilkins, Justin Salamon, Magdalena Fuentes, Juan Pablo Bello, Oriol Nieto

We show that our system, trained using our automatic data curation pipeline, significantly outperforms baselines trained on in-the-wild data on the task of HQ SFX retrieval for video.

Contrastive Learning • Retrieval

FlowGrad: Using Motion for Visual Sound Source Localization

1 code implementation • 15 Nov 2022 • Rajsuryan Singh, Pablo Zinemanas, Xavier Serra, Juan Pablo Bello, Magdalena Fuentes

Most recent work in visual sound source localization relies on semantic audio-visual representations learned in a self-supervised manner, and by design excludes temporal information present in videos.

Optical Flow Estimation • Scene Understanding

A Study on Robustness to Perturbations for Representations of Environmental Sound

no code implementations • 20 Mar 2022 • Sangeeta Srivastava, Ho-Hsiang Wu, Joao Rulff, Magdalena Fuentes, Mark Cartwright, Claudio Silva, Anish Arora, Juan Pablo Bello

To accomplish this, we imitate channel effects by injecting perturbations to the audio signal and measure the shift in the new (perturbed) embeddings with three distance measures, making the evaluation domain-dependent but not task-dependent.

FAD • Transfer Learning

Wav2CLIP: Learning Robust Audio Representations From CLIP

1 code implementation • 21 Oct 2021 • Ho-Hsiang Wu, Prem Seetharaman, Kundan Kumar, Juan Pablo Bello

We propose Wav2CLIP, a robust audio representation learning method by distilling from Contrastive Language-Image Pre-training (CLIP).

Cross-Modal Retrieval • Image Generation • +3

308 stars

Soundata: A Python library for reproducible use of audio datasets

no code implementations • 26 Sep 2021 • Magdalena Fuentes, Justin Salamon, Pablo Zinemanas, Martín Rocamora, Genís Paja, Irán R. Román, Marius Miron, Xavier Serra, Juan Pablo Bello

Soundata is a Python library for loading and working with audio datasets in a standardized way, removing the need for writing custom loaders in every project, and improving reproducibility by providing tools to validate data against a canonical version.

Weakly Supervised Source-Specific Sound Level Estimation in Noisy Soundscapes

1 code implementation • 6 May 2021 • Aurora Cramer, Mark Cartwright, Fatemeh Pishdadian, Juan Pablo Bello

While the estimation of what sound sources are, when they occur, and from where they originate has been well-studied, the estimation of how loud these sound sources are has often been overlooked.

Multi-Task Self-Supervised Pre-Training for Music Classification

no code implementations • 5 Feb 2021 • Ho-Hsiang Wu, Chieh-Chi Kao, Qingming Tang, Ming Sun, Brian McFee, Juan Pablo Bello, Chao Wang

Deep learning is very data hungry, and supervised learning especially requires massive labeled data to work well.

Automatic Speech Recognition (ASR) • +7

SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context

no code implementations • 11 Sep 2020 • Mark Cartwright, Jason Cramer, Ana Elisa Mendez Mendez, Yu Wang, Ho-Hsiang Wu, Vincent Lostanlen, Magdalena Fuentes, Graham Dove, Charlie Mydlarz, Justin Salamon, Oded Nov, Juan Pablo Bello

In this article, we describe our data collection procedure and propose evaluation metrics for multilabel classification of urban sound tags.

One or Two Components? The Scattering Transform Answers

no code implementations • 2 Mar 2020 • Vincent Lostanlen, Alice Cohen-Hadria, Juan Pablo Bello

With the aim of constructing a biologically plausible model of machine listening, we study the representation of a multicomponent stationary signal by a wavelet scattering network.

Vocal Bursts Valence Prediction

Long-distance Detection of Bioacoustic Events with Per-channel Energy Normalization

no code implementations • 1 Nov 2019 • Vincent Lostanlen, Kaitlin Palmer, Elly Knight, Christopher Clark, Holger Klinck, Andrew Farnsworth, Tina Wong, Jason Cramer, Juan Pablo Bello

This paper proposes to perform unsupervised detection of bioacoustic events by pooling the magnitudes of spectrogram frames after per-channel energy normalization (PCEN).

Noise Estimation • Speech Recognition • +1

Learning the helix topology of musical pitch

1 code implementation • 22 Oct 2019 • Vincent Lostanlen, Sripathi Sridhar, Brian McFee, Andrew Farnsworth, Juan Pablo Bello

To explain the consonance of octaves, music psychologists represent pitch as a helix where azimuth and axial coordinate correspond to pitch class and pitch height respectively.

Adversarial Learning for Improved Onsets and Frames Music Transcription

no code implementations • 20 Jun 2019 • Jong Wook Kim, Juan Pablo Bello

Automatic music transcription is considered to be one of the hardest problems in music information retrieval, yet recent deep learning approaches have achieved substantial improvements on transcription performance.

Information Retrieval • Music Information Retrieval • +2

Robust sound event detection in bioacoustic sensor networks

1 code implementation • 20 May 2019 • Vincent Lostanlen, Justin Salamon, Andrew Farnsworth, Steve Kelling, Juan Pablo Bello

As a case study, we consider the problem of detecting avian flight calls from a ten-hour recording of nocturnal bird migration, recorded by a network of six ARUs in the presence of heterogeneous background noise.

Data Augmentation • Event Detection • +1

Neural Music Synthesis for Flexible Timbre Control

no code implementations • 1 Nov 2018 • Jong Wook Kim, Rachel Bittner, Aparna Kumar, Juan Pablo Bello

The recent success of raw audio waveform synthesis models like WaveNet motivates a new approach for music synthesis, in which the entire process (creating audio samples from a score and instrument information) is modeled using generative neural networks.

Adaptive pooling operators for weakly labeled sound event detection

2 code implementations • 26 Apr 2018 • Brian McFee, Justin Salamon, Juan Pablo Bello

In this work, we treat SED as a multiple instance learning (MIL) problem, where training labels are static over a short excerpt, indicating the presence or absence of sound sources but not their temporal locality.

Event Detection • Multiple Instance Learning • +2

CREPE: A Convolutional Representation for Pitch Estimation

1 code implementation • 17 Feb 2018 • Jong Wook Kim, Justin Salamon, Peter Li, Juan Pablo Bello

To date, the best performing techniques, such as the pYIN algorithm, are based on a combination of DSP pipelines and heuristics.

Information Retrieval • Music Information Retrieval • +1

1,039 stars

Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification

5 code implementations • 15 Aug 2016 • Justin Salamon, Juan Pablo Bello

We show that the improved performance stems from the combination of a deep, high-capacity model and an augmented training set: this combination outperforms both the proposed CNN without augmentation and a "shallow" dictionary learning model with augmentation.

Ranked #6 on Environmental Sound Classification on UrbanSound8K (using extra training data)

Data Augmentation • Dictionary Learning • +3

4,291 stars
