no code implementations • 30 Jan 2023 • Chris Donahue, Antoine Caillon, Adam Roberts, Ethan Manilow, Philippe Esling, Andrea Agostinelli, Mauro Verzetti, Ian Simon, Olivier Pietquin, Neil Zeghidour, Jesse Engel
We present SingSong, a system that generates instrumental music to accompany input vocals, potentially offering musicians and non-musicians alike an intuitive new way to create music featuring their own voice.
1 code implementation • 28 Sep 2022 • Yusong Wu, Josh Gardner, Ethan Manilow, Ian Simon, Curtis Hawthorne, Jesse Engel
We call this system the Chamber Ensemble Generator (CEG), and use it to generate a large dataset of chorales from four different chamber ensembles (CocoChorales).
no code implementations • 26 Aug 2022 • Noah Schaffer, Boaz Cogan, Ethan Manilow, Max Morrison, Prem Seetharaman, Bryan Pardo
Despite phenomenal progress in recent years, state-of-the-art music separation systems produce source estimates with significant perceptual shortcomings, such as adding extraneous noise or removing harmonics.
1 code implementation • 11 Jun 2022 • Curtis Hawthorne, Ian Simon, Adam Roberts, Neil Zeghidour, Josh Gardner, Ethan Manilow, Jesse Engel
An ideal music synthesizer should be both interactive and expressive, generating high-fidelity audio in real time for arbitrary combinations of instruments and notes.
1 code implementation • ICLR 2022 • Yusong Wu, Ethan Manilow, Yi Deng, Rigel Swavely, Kyle Kastner, Tim Cooijmans, Aaron Courville, Cheng-Zhi Anna Huang, Jesse Engel
Musical expression requires control of both what notes are played, and how they are performed.
2 code implementations • ICLR 2022 • Josh Gardner, Ian Simon, Ethan Manilow, Curtis Hawthorne, Jesse Engel
Automatic Music Transcription (AMT), inferring musical notes from raw audio, is a challenging task at the core of music understanding.
Ranked #2 on Music Transcription on Slakh2100
Automatic Speech Recognition (ASR) +3
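Framing transcription as sequence prediction can be illustrated with a toy tokenizer that flattens (onset, pitch) note events into a token sequence, loosely in the spirit of MIDI-like vocabularies. The token names and the 10 ms quantization below are illustrative assumptions, not the published model's actual scheme.

```python
# Hypothetical sketch: encode note events as a flat token sequence for a
# sequence-to-sequence transcription model. Vocabulary and quantization
# (10 ms steps) are assumed for illustration only.

TIME_STEP = 0.01  # seconds per time token (assumed quantization)

def events_to_tokens(events):
    """Convert (onset_seconds, midi_pitch) events, sorted by onset,
    into tokens like ['time_50', 'note_60', ...]."""
    tokens = []
    for onset, pitch in sorted(events):
        tokens.append(f"time_{round(onset / TIME_STEP)}")
        tokens.append(f"note_{pitch}")
    return tokens

def tokens_to_events(tokens):
    """Invert events_to_tokens: rebuild (onset, pitch) pairs."""
    events, t = [], 0.0
    for tok in tokens:
        kind, value = tok.split("_")
        if kind == "time":
            t = int(value) * TIME_STEP
        else:
            events.append((t, int(value)))
    return events

tokens = events_to_tokens([(0.5, 60), (1.0, 64)])
assert tokens == ["time_50", "note_60", "time_100", "note_64"]
assert tokens_to_events(tokens) == [(0.5, 60), (1.0, 64)]
```

Because both the input audio and the output notes are sequences, a single encoder-decoder model can, in principle, be reused across instruments and datasets.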
1 code implementation • 25 Oct 2021 • Ethan Manilow, Patrick O'Reilly, Prem Seetharaman, Bryan Pardo
We showcase an unsupervised method that repurposes deep models trained for music generation and music tagging for audio source separation, without any retraining.
no code implementations • 25 Oct 2021 • Hugo Flores Garcia, Aldo Aguilar, Ethan Manilow, Dmitry Vedenko, Bryan Pardo
We present a software framework that integrates neural networks into the popular open-source audio editing software, Audacity, with a minimal amount of developer effort.
2 code implementations • 19 Jul 2021 • Curtis Hawthorne, Ian Simon, Rigel Swavely, Ethan Manilow, Jesse Engel
Automatic Music Transcription has seen significant progress in recent years by training custom deep neural networks on large datasets.
1 code implementation • 14 Jul 2021 • Hugo Flores Garcia, Aldo Aguilar, Ethan Manilow, Bryan Pardo
Deep learning work on musical instrument recognition has generally focused on instrument classes for which we have abundant data.
no code implementations • 29 Sep 2020 • Ethan Manilow, Bryan Pardo
In this paper, we introduce a simple method that can separate arbitrary musical instruments from an audio mixture.
1 code implementation • 4 Sep 2020 • Verena Haunschmid, Ethan Manilow, Gerhard Widmer
Prior work on explainable models in MIR has generally used image processing tools to produce explanations for DNN predictions, but these explanations are not necessarily musically meaningful, nor can they be listened to (which, arguably, is important in music).
2 code implementations • 2 Aug 2020 • Verena Haunschmid, Ethan Manilow, Gerhard Widmer
Deep neural networks (DNNs) are successfully applied in a wide variety of music information retrieval (MIR) tasks but their predictions are usually not interpretable.
no code implementations • 22 Oct 2019 • Ethan Manilow, Prem Seetharaman, Bryan Pardo
We present a single deep learning architecture that can simultaneously separate an audio recording of a musical mixture into constituent single-instrument recordings and transcribe those instruments into a human-readable format, learning a shared musical representation for both tasks.
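The shared-representation idea can be sketched as one encoder feeding two heads: a separation head that predicts a time-frequency mask, and a transcription head that predicts per-frame note activations. This is a minimal assumed architecture for illustration, not the paper's actual network.

```python
import numpy as np

# Minimal multi-task sketch (assumed architecture, not the published model):
# a shared encoder feeds a separation head (soft spectrogram mask) and a
# transcription head (per-pitch activation probabilities).

rng = np.random.default_rng(0)
N_BINS, N_FRAMES, HIDDEN, N_PITCHES = 64, 10, 32, 88

W_enc = rng.normal(size=(N_BINS, HIDDEN)) * 0.1     # shared encoder
W_sep = rng.normal(size=(HIDDEN, N_BINS)) * 0.1     # separation head
W_amt = rng.normal(size=(HIDDEN, N_PITCHES)) * 0.1  # transcription head

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(mixture_spec):
    """mixture_spec: (frames, bins) nonnegative magnitude spectrogram."""
    shared = np.tanh(mixture_spec @ W_enc)   # representation used by both tasks
    mask = sigmoid(shared @ W_sep)           # per-bin soft mask in (0, 1)
    notes = sigmoid(shared @ W_amt)          # per-frame pitch activations
    return mask * mixture_spec, notes        # separated source, transcription

spec = np.abs(rng.normal(size=(N_FRAMES, N_BINS)))
source, notes = forward(spec)
assert source.shape == spec.shape
assert notes.shape == (N_FRAMES, N_PITCHES)
```

Since the mask lies in (0, 1), the separated estimate never exceeds the mixture magnitude, and both heads are trained against the same encoder output, which is what forces the representation to be shared.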
no code implementations • 18 Sep 2019 • Ethan Manilow, Gordon Wichern, Prem Seetharaman, Jonathan Le Roux
In this paper, we present the synthesized Lakh dataset (Slakh) as a new tool for music source separation research.
1 code implementation • 2 Jul 2019 • Gordon Wichern, Joe Antognini, Michael Flynn, Licheng Richard Zhu, Emmett McQuinn, Dwight Crow, Ethan Manilow, Jonathan Le Roux
Recent progress in separating the speech signals from multiple overlapping speakers using a single audio channel has brought us closer to solving the cocktail party problem.
Ranked #15 on Speech Separation on WHAMR!
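Mask-based separation is the standard framing in this line of single-channel work, and its oracle upper bound, the ideal ratio mask, is easy to sketch. The code below illustrates that general technique, not any specific model benchmarked on WHAMR!.

```python
import numpy as np

# Sketch of the oracle "ideal ratio mask" commonly used as an upper bound in
# single-channel speech separation (a standard technique, not a specific
# model from the paper): each source's mask is its share of the total energy.

def ideal_ratio_masks(source_mags):
    """source_mags: (n_sources, frames, bins) magnitude spectrograms.
    Returns masks of the same shape that sum to ~1 at every bin."""
    total = source_mags.sum(axis=0, keepdims=True) + 1e-8  # avoid divide-by-zero
    return source_mags / total

rng = np.random.default_rng(1)
sources = np.abs(rng.normal(size=(2, 5, 16)))  # two overlapping speakers
masks = ideal_ratio_masks(sources)
mixture = sources.sum(axis=0)
estimates = masks * mixture  # apply each speaker's mask to the mixture
assert np.allclose(masks.sum(axis=0), 1.0, atol=1e-6)
```

A trained separator replaces the oracle by predicting the masks from the mixture alone; the oracle version is useful for measuring how much performance the masking framing itself leaves on the table.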