Search Results for author: Bryan Pardo

Found 24 papers, 11 papers with code

High-Fidelity Neural Phonetic Posteriorgrams

1 code implementation · 27 Feb 2024 · Cameron Churchwell, Max Morrison, Bryan Pardo

A phonetic posteriorgram (PPG) is a time-varying categorical distribution over acoustic units of speech (e.g., phonemes).

Voice Conversion
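As the definition above suggests, a PPG is just a (frames × phonemes) array of per-frame categorical distributions. A minimal pure-Python sketch with a made-up four-phoneme inventory and made-up logits (illustrative only, not the paper's model):

```python
import math

# Hypothetical phoneme inventory (illustrative; real PPGs use a full phone set).
PHONEMES = ["AA", "IY", "S", "T"]

def softmax(logits):
    """Convert raw scores to a categorical distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# A PPG is one categorical distribution per time frame:
# shape (num_frames, num_phonemes), each row summing to 1.
frame_logits = [
    [2.0, 0.1, -1.0, 0.3],   # frame 0: most mass on "AA"
    [0.2, 3.0, -0.5, 0.0],   # frame 1: most mass on "IY"
]
ppg = [softmax(row) for row in frame_logits]

for t, row in enumerate(ppg):
    best = PHONEMES[row.index(max(row))]
    print(f"frame {t}: argmax phoneme = {best}")
```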

Crowdsourced and Automatic Speech Prominence Estimation

1 code implementation · 12 Oct 2023 · Max Morrison, Pranav Pawar, Nathan Pruyne, Jennifer Cole, Bryan Pardo

Speech prominence estimation is the process of assigning a numeric value to the prominence of each word in an utterance.

Emotion Recognition
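As a toy illustration of "one numeric value per word", the sketch below z-scores a made-up acoustic feature per word; the paper's estimator is a trained neural model and crowdsourced annotations, not this heuristic:

```python
import statistics

# Toy utterance: (word, peak_energy) pairs. Values are invented for
# illustration; they are not from the paper's dataset.
words = [("the", 0.2), ("CAT", 0.9), ("sat", 0.4), ("DOWN", 0.8)]

energies = [e for _, e in words]
mu = statistics.mean(energies)
sigma = statistics.pstdev(energies)

# One scalar prominence value per word: here, a z-scored energy.
prominence = {w: (e - mu) / sigma for w, e in words}

for w, p in prominence.items():
    print(f"{w:>5}: {p:+.2f}")
```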

VampNet: Music Generation via Masked Acoustic Token Modeling

1 code implementation · 10 Jul 2023 · Hugo Flores Garcia, Prem Seetharaman, Rithesh Kumar, Bryan Pardo

We introduce VampNet, a masked acoustic token modeling approach to music synthesis, compression, inpainting, and variation.

Music Compression · Music Generation
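Masked acoustic token modeling masks a random subset of discrete codec tokens and asks a model to infill them. The sketch below uses a frequency-based stand-in for the model; VampNet uses a masked transformer over real acoustic tokens, so everything here is illustrative:

```python
import random
from collections import Counter

random.seed(0)

MASK = -1  # sentinel for a masked position

def mask_tokens(tokens, ratio):
    """Replace a random subset of token positions with MASK."""
    idx = random.sample(range(len(tokens)), int(len(tokens) * ratio))
    masked = list(tokens)
    for i in idx:
        masked[i] = MASK
    return masked, set(idx)

def infill(masked):
    """Stand-in for the generative model: fill each MASK with the most
    frequent visible token. VampNet uses a masked transformer instead."""
    visible = [t for t in masked if t != MASK]
    fill = Counter(visible).most_common(1)[0][0]
    return [fill if t == MASK else t for t in masked]

tokens = [3, 3, 7, 3, 1, 3, 7, 3]      # toy acoustic token sequence
masked, positions = mask_tokens(tokens, ratio=0.5)
completed = infill(masked)

print("original :", tokens)
print("masked   :", masked)
print("completed:", completed)
```

Varying the mask ratio is what moves the same model between compression (few masks), inpainting (a masked region), and free variation (mostly masked).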

Music Separation Enhancement with Generative Modeling

no code implementations · 26 Aug 2022 · Noah Schaffer, Boaz Cogan, Ethan Manilow, Max Morrison, Prem Seetharaman, Bryan Pardo

Despite phenomenal progress in recent years, state-of-the-art music separation systems produce source estimates with significant perceptual shortcomings, such as adding extraneous noise or removing harmonics.

Music Source Separation

Reproducible Subjective Evaluation

1 code implementation · 8 Mar 2022 · Max Morrison, Brian Tang, Gefei Tan, Bryan Pardo

ReSEval lets researchers launch A/B, ABX, Mean Opinion Score (MOS) and MUltiple Stimuli with Hidden Reference and Anchor (MUSHRA) tests on audio, image, text, or video data from a command-line interface or using one line of Python, making it as easy to run as objective evaluation.
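ReSEval's own API is not reproduced here; the sketch below only illustrates the statistic a MOS test yields, a mean rating with a confidence interval, computed on made-up listener scores:

```python
import statistics

# Hypothetical 1-5 opinion scores per system. ReSEval collects such
# ratings from crowdworkers; these values are invented for illustration.
ratings = {
    "baseline": [3, 4, 3, 2, 4, 3],
    "proposed": [4, 5, 4, 4, 3, 5],
}

# Mean Opinion Score: the mean rating, usually reported with a 95% CI.
for system, scores in ratings.items():
    mos = statistics.mean(scores)
    sem = statistics.stdev(scores) / len(scores) ** 0.5
    print(f"{system}: MOS = {mos:.2f} +/- {1.96 * sem:.2f}")
```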

Deep Learning Tools for Audacity: Helping Researchers Expand the Artist's Toolkit

no code implementations · 25 Oct 2021 · Hugo Flores Garcia, Aldo Aguilar, Ethan Manilow, Dmitry Vedenko, Bryan Pardo

We present a software framework that integrates neural networks into the popular open-source audio editing software, Audacity, with a minimal amount of developer effort.

Unsupervised Source Separation By Steering Pretrained Music Models

1 code implementation · 25 Oct 2021 · Ethan Manilow, Patrick O'Reilly, Prem Seetharaman, Bryan Pardo

We showcase an unsupervised method that repurposes deep models trained for music generation and music tagging for audio source separation, without any retraining.

Audio Generation · Audio Source Separation · +3

Neural Pitch-Shifting and Time-Stretching with Controllable LPCNet

1 code implementation · 5 Oct 2021 · Max Morrison, Zeyu Jin, Nicholas J. Bryan, Juan-Pablo Caceres, Bryan Pardo

Pitch-shifting and time-stretching are fundamental audio editing operations with applications in speech manipulation, audio-visual synchronization, and singing voice editing and synthesis.

Audio-Visual Synchronization
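Why are these operations hard? Naive resampling ties pitch and duration together: speeding a signal up also raises its pitch. The toy resampler below makes that coupling concrete; decoupling the two is exactly what controllable neural vocoders such as this LPCNet variant aim for:

```python
def resample(signal, rate):
    """Naive linear-interpolation resampling. Played back at the original
    sample rate, the result changes duration by 1/rate AND shifts pitch
    by the same factor -- the coupling that pitch-/time-controllable
    vocoders are designed to break."""
    n_out = int(len(signal) / rate)
    out = []
    for i in range(n_out):
        pos = i * rate
        lo = int(pos)
        hi = min(lo + 1, len(signal) - 1)
        frac = pos - lo
        out.append((1 - frac) * signal[lo] + frac * signal[hi])
    return out

signal = [float(i % 4) for i in range(16)]  # toy periodic waveform
faster = resample(signal, rate=2.0)         # half the length, octave up
print(len(signal), "->", len(faster))
```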

Leveraging Hierarchical Structures for Few-Shot Musical Instrument Recognition

1 code implementation · 14 Jul 2021 · Hugo Flores Garcia, Aldo Aguilar, Ethan Manilow, Bryan Pardo

Deep learning work on musical instrument recognition has generally focused on instrument classes for which we have abundant data.

Few-Shot Learning · Instrument Recognition

Context-Aware Prosody Correction for Text-Based Speech Editing

no code implementations · 16 Feb 2021 · Max Morrison, Lucas Rencker, Zeyu Jin, Nicholas J. Bryan, Juan-Pablo Caceres, Bryan Pardo

Text-based speech editors expedite the process of editing speech recordings by permitting editing via intuitive cut, copy, and paste operations on a speech transcript.

A Study of Transfer Learning in Music Source Separation

no code implementations · 23 Oct 2020 · Andreas Bugler, Bryan Pardo, Prem Seetharaman

Supervised deep learning methods for performing audio source separation can be very effective in domains where there is a large amount of training data.

Audio Source Separation · Data Augmentation · +2

Bespoke Neural Networks for Score-Informed Source Separation

no code implementations · 29 Sep 2020 · Ethan Manilow, Bryan Pardo

In this paper, we introduce a simple method that can separate arbitrary musical instruments from an audio mixture.

AutoClip: Adaptive Gradient Clipping for Source Separation Networks

1 code implementation · 25 Jul 2020 · Prem Seetharaman, Gordon Wichern, Bryan Pardo, Jonathan Le Roux

Clipping the gradient is a known approach to improving gradient descent, but requires hand selection of a clipping threshold hyperparameter.

Audio Source Separation
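AutoClip's idea is to set the threshold automatically from the history of observed gradient norms (e.g., their 10th percentile) rather than hand-tuning a constant. A pure-Python sketch of that idea, with a nearest-rank percentile standing in for `np.percentile`:

```python
import math

def grad_norm(grad):
    return math.sqrt(sum(g * g for g in grad))

def percentile(values, p):
    """Nearest-rank percentile (simple stand-in for np.percentile)."""
    ordered = sorted(values)
    k = max(0, min(len(ordered) - 1, int(round(p / 100 * (len(ordered) - 1)))))
    return ordered[k]

class AutoClip:
    """Sketch of AutoClip: clip each gradient to the p-th percentile of
    all gradient norms observed so far, instead of a fixed constant."""
    def __init__(self, p=10):
        self.p = p
        self.history = []

    def clip(self, grad):
        norm = grad_norm(grad)
        self.history.append(norm)
        threshold = percentile(self.history, self.p)
        if norm > threshold > 0:
            scale = threshold / norm
            return [g * scale for g in grad]
        return grad

clipper = AutoClip(p=10)
for grad in ([0.1, 0.2], [0.3, 0.1], [5.0, 5.0]):  # last step: norm spike
    clipped = clipper.clip(grad)
    print(f"norm {grad_norm(grad):.3f} -> {grad_norm(clipped):.3f}")
```

In a real training loop the scaled gradient would be applied in place (e.g., via `torch.nn.utils.clip_grad_norm_` with the computed threshold); the class above only shows the threshold adaptation.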

Bach or Mock? A Grading Function for Chorales in the Style of J.S. Bach

1 code implementation · 23 Jun 2020 · Alexander Fang, Alisa Liu, Prem Seetharaman, Bryan Pardo

Unlike traditional rule-based systems, deep generative systems that learn probabilistic models from a corpus of existing music do not explicitly encode knowledge of a musical style.

Incorporating Music Knowledge in Continual Dataset Augmentation for Music Generation

1 code implementation · 23 Jun 2020 · Alisa Liu, Alexander Fang, Gaëtan Hadjeres, Prem Seetharaman, Bryan Pardo

In this paper, we present augmentative generation (Aug-Gen), a method of dataset augmentation for any music generation system trained on a resource-constrained domain.

Music Generation
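The Aug-Gen loop can be sketched as: sample from the trained model, keep only outputs that pass a quality filter, and fold survivors back into the training data. Both `generate` and `grade` below are hypothetical stand-ins (the real grader is a learned/musical scoring function, not a length heuristic):

```python
import random

random.seed(1)

def generate(dataset):
    """Hypothetical stand-in for sampling the trained generative model."""
    base = random.choice(dataset)
    return base + random.choice(["a", "b", "c"])

def grade(sample):
    """Hypothetical stand-in for a grading function (cf. the companion
    Bach-grading paper); here, just a toy length heuristic."""
    return 1.0 / (1 + abs(len(sample) - 5))

# Aug-Gen loop: generate, filter by grade, fold survivors back in.
dataset = ["abcd", "bcde", "cdef"]
THRESHOLD = 0.5
for _ in range(10):
    candidate = generate(dataset)
    if grade(candidate) >= THRESHOLD:
        dataset.append(candidate)  # survivor augments the training set

print(f"dataset grew to {len(dataset)} examples")
```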

OtoMechanic: Auditory Automobile Diagnostics via Query-by-Example

no code implementations · 5 Nov 2019 · Max Morrison, Bryan Pardo

Many automobile components in need of repair produce characteristic sounds.

Bootstrapping deep music separation from primitive auditory grouping principles

no code implementations · 23 Oct 2019 · Prem Seetharaman, Gordon Wichern, Jonathan Le Roux, Bryan Pardo

Deep separation models are trained on synthetic mixtures of audio made from isolated sound source recordings, so that ground truth for the separation is known.

Music Source Separation

Model selection for deep audio source separation via clustering analysis

no code implementations · 23 Oct 2019 · Alisa Liu, Prem Seetharaman, Bryan Pardo

We compare our confidence-based ensemble approach to using individual models with no selection, to an oracle that always selects the best model and to a random model selector.

Audio Source Separation · Clustering · +1
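The three selectors being compared can be sketched on made-up per-example (quality, confidence) pairs; in the paper, confidence is derived from clustering analysis of each model's internal embeddings, not self-reported as here:

```python
import random

random.seed(0)

# Hypothetical per-example outputs: each "model" yields a separation
# quality (higher = better) and a confidence score. Values are invented.
examples = [
    [(6.0, 0.9), (3.0, 0.4)],   # (quality, confidence) per model
    [(2.0, 0.3), (5.0, 0.8)],
    [(4.0, 0.7), (4.5, 0.6)],
]

def confidence_select(models):
    """Pick the model that reports the highest confidence."""
    return max(models, key=lambda m: m[1])[0]

def oracle_select(models):
    """Upper bound: always pick the truly best model."""
    return max(models, key=lambda m: m[0])[0]

def random_select(models):
    """Lower bound: pick a model at random."""
    return random.choice(models)[0]

for name, pick in [("confidence", confidence_select),
                   ("oracle", oracle_select),
                   ("random", random_select)]:
    avg = sum(pick(ex) for ex in examples) / len(examples)
    print(f"{name:>10}: mean quality {avg:.2f}")
```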

Simultaneous Separation and Transcription of Mixtures with Multiple Polyphonic and Percussive Instruments

no code implementations · 22 Oct 2019 · Ethan Manilow, Prem Seetharaman, Bryan Pardo

We present a single deep learning architecture that can both separate an audio recording of a musical mixture into constituent single-instrument recordings and transcribe these instruments into a human-readable format at the same time, learning a shared musical representation for both tasks.

Bootstrapping single-channel source separation via unsupervised spatial clustering on stereo mixtures

no code implementations · 6 Nov 2018 · Prem Seetharaman, Gordon Wichern, Jonathan Le Roux, Bryan Pardo

These estimates, together with a weighting scheme in the time-frequency domain, based on confidence in the separation quality, are used to train a deep learning model that can be used for single-channel separation, where no source direction information is available.

Clustering · Image Segmentation · +2

An Overview of Lead and Accompaniment Separation in Music

no code implementations · 23 Apr 2018 · Zafar Rafii, Antoine Liutkus, Fabian-Robert Stöter, Stylianos Ioannis Mimilakis, Derry FitzGerald, Bryan Pardo

For model-based methods, we organize them according to whether they concentrate on the lead signal, the accompaniment, or both.

Sound · Audio and Speech Processing
