1 code implementation • 27 Feb 2023 • Ninon Devis, Nils Demerlé, Sarah Nabi, David Genova, Philippe Esling
Despite significant advances in deep models for music generation, the use of these techniques remains restricted to expert users.
no code implementations • 30 Jan 2023 • Chris Donahue, Antoine Caillon, Adam Roberts, Ethan Manilow, Philippe Esling, Andrea Agostinelli, Mauro Verzetti, Ian Simon, Olivier Pietquin, Neil Zeghidour, Jesse Engel
We present SingSong, a system that generates instrumental music to accompany input vocals, potentially offering musicians and non-musicians alike an intuitive new way to create music featuring their own voice.
no code implementations • 16 Nov 2022 • Axel Chemla--Romeu-Santos, Philippe Esling
The development of generative Machine Learning (ML) models in creative practices, enabled by recent improvements in the usability and availability of pre-trained models, is attracting growing interest among artists, practitioners, and performers.
1 code implementation • 16 Nov 2022 • Axel Chemla--Romeu-Santos, Philippe Esling
Machine learning approaches now achieve impressive generation capabilities in numerous domains such as image, audio or video.
no code implementations • 14 Apr 2022 • Antoine Caillon, Philippe Esling
As our method is based on a post-training reconfiguration of the model, we show that it is able to transform models trained without causal constraints into a streaming model.
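The core idea of post-training streaming reconfiguration can be illustrated with a toy example (this is a sketch of the general principle, not the authors' actual method): a causal filter can be run chunk-by-chunk in a streaming fashion by caching a short left context between calls, and the streamed output matches the offline result exactly.

```python
import numpy as np

def offline_causal_fir(x, h):
    # Offline causal FIR: y[n] = sum_k h[k] * x[n - k]
    return np.convolve(x, h)[: len(x)]

class StreamingFIR:
    """Chunk-by-chunk FIR filter with a cached left context (the state)."""
    def __init__(self, h):
        self.h = np.asarray(h, dtype=float)
        self.state = np.zeros(len(self.h) - 1)  # cache of past samples

    def process(self, chunk):
        buf = np.concatenate([self.state, chunk])
        # Full convolution, then keep only the outputs aligned with `chunk`
        y = np.convolve(buf, self.h)[len(self.h) - 1 : len(self.h) - 1 + len(chunk)]
        self.state = buf[-(len(self.h) - 1):]  # update cache for the next call
        return y

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
h = np.array([0.5, 0.3, 0.2])
stream = StreamingFIR(h)
streamed = np.concatenate([stream.process(c) for c in np.split(x, 10)])
```

The same cached-context trick generalizes to convolutional layers, which is what makes a post-training causal reconfiguration possible in principle.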
3 code implementations • 6 Mar 2022 • Joseph Turian, Jordie Shier, Humair Raj Khan, Bhiksha Raj, Björn W. Schuller, Christian J. Steinmetz, Colin Malloy, George Tzanetakis, Gissel Velarde, Kirk McNally, Max Henry, Nicolas Pinto, Camille Noufi, Christian Clough, Dorien Herremans, Eduardo Fonseca, Jesse Engel, Justin Salamon, Philippe Esling, Pranay Manocha, Shinji Watanabe, Zeyu Jin, Yonatan Bisk
The aim of the HEAR benchmark is to develop a general-purpose audio representation that provides a strong basis for learning in a wide variety of tasks and scenarios.
1 code implementation • 9 Nov 2021 • Antoine Caillon, Philippe Esling
By leveraging a multi-band decomposition of the raw waveform, we show that our model is the first able to generate 48kHz audio signals, while simultaneously running 20 times faster than real-time on a standard laptop CPU.
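The speed-up from a multi-band decomposition comes from modelling several narrow, downsampled bands instead of the full-rate waveform. The paper uses a PQMF filterbank; the following is only a toy FFT-masking stand-in that shows the key property, namely that the bands sum back to the original signal.

```python
import numpy as np

def fft_band_split(x, n_bands=4):
    """Split a signal into contiguous frequency bands via rFFT masking.

    Toy stand-in for the PQMF-style filterbank used in multi-band models:
    each band keeps one slice of the spectrum, so the bands sum to x.
    """
    X = np.fft.rfft(x)
    edges = np.linspace(0, len(X), n_bands + 1).astype(int)
    bands = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        Xi = np.zeros_like(X)
        Xi[lo:hi] = X[lo:hi]
        bands.append(np.fft.irfft(Xi, n=len(x)))
    return np.array(bands)

rng = np.random.default_rng(0)
x = rng.normal(size=2048)
bands = fft_band_split(x, n_bands=4)
```

A real PQMF additionally downsamples each band by the number of bands, which is where the computational savings come from.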
1 code implementation • 8 Sep 2021 • Mathieu Prang, Philippe Esling
We evaluate, from a musical point of view, the ability to learn meaningful features from this representation.
no code implementations • 6 Jul 2021 • Constance Douwes, Philippe Esling, Jean-Pierre Briot
In most scientific domains, the deep learning community has largely focused on the quality of deep generative models, resulting in highly accurate and successful solutions.
1 code implementation • 15 Apr 2021 • Théis Bazin, Gaëtan Hadjeres, Philippe Esling, Mikhail Malt
Modern approaches to sound synthesis using deep neural networks are hard to control, especially when fine-grained conditioning information is not available, hindering their adoption by musicians.
no code implementations • 13 Aug 2020 • Philippe Esling, Ninon Devis
As creativity is a highly context-dependent concept, we underline the limits and deficiencies of current AI, which require moving towards artificial creativity.
no code implementations • 4 Aug 2020 • Antoine Caillon, Adrien Bitton, Brice Gatinet, Philippe Esling
Recent studies have shown that unsupervised models can learn invertible audio representations using auto-encoders.
no code implementations • 4 Aug 2020 • Adrien Bitton, Philippe Esling, Tatsuya Harada
In this setting the learned grain space is invertible, meaning that we can continuously synthesize sound when traversing its dimensions.
1 code implementation • 31 Jul 2020 • Philippe Esling, Theis Bazin, Adrien Bitton, Tristan Carsault, Ninon Devis
We show that our proposal can remove up to 90% of the model parameters without loss of accuracy, leading to ultra-light deep MIR models.
1 code implementation • 31 Jul 2020 • Philippe Esling, Ninon Devis, Adrien Bitton, Antoine Caillon, Axel Chemla--Romeu-Santos, Constance Douwes
This hypothesis states that extremely efficient small sub-networks exist in deep models and would provide higher accuracy than larger models if trained in isolation.
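The criterion most commonly used to find such sub-networks in lottery-ticket experiments is global magnitude pruning: zero out the smallest-magnitude weights and keep a binary mask of the survivors. A minimal NumPy sketch (illustrative only, not the paper's full training procedure):

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Zero out the smallest-magnitude weights.

    Returns the pruned weights and the binary mask of kept entries
    (the candidate 'winning ticket' structure).
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    threshold = np.sort(flat)[k - 1] if k > 0 else -np.inf
    mask = np.abs(weights) > threshold
    return weights * mask, mask

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
pruned, mask = magnitude_prune(w, sparsity=0.9)
```

In the full lottery-ticket procedure, the surviving weights are then rewound to their initial values and the masked sub-network is retrained in isolation.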
1 code implementation • 13 Jul 2020 • Adrien Bitton, Philippe Esling, Tatsuya Harada
Although its definition is usually elusive, it can be seen from a signal processing viewpoint as all the spectral features that are perceived independently from pitch and loudness.
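A classic example of such a descriptor is the spectral centroid, the amplitude-weighted mean frequency of a sound. The sketch below (an illustration of the signal-processing viewpoint, not code from the paper) shows that it is invariant to loudness: scaling the signal's gain leaves the centroid unchanged.

```python
import numpy as np

def spectral_centroid(x, sr=16000):
    """Amplitude-weighted mean frequency, a classic timbre descriptor."""
    mag = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    return (freqs * mag).sum() / mag.sum()

t = np.arange(4096) / 16000
x = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)
c_soft = spectral_centroid(x)
c_loud = spectral_centroid(10.0 * x)  # same sound, 20 dB louder
```

The gain factor cancels in the weighted mean, so `c_soft == c_loud`: the descriptor captures spectral shape independently of loudness, which is exactly the property the signal-processing view of timbre relies on.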
no code implementations • 10 Feb 2020 • Axel Chemla--Romeu-Santos, Stavros Ntalampiras, Philippe Esling, Goffredo Haus, Gérard Assayag
Extracting symbolic information from signals is an active field of research, enabling numerous applications, especially in the Music Information Retrieval (MIR) domain.
1 code implementation • 12 Nov 2019 • Tristan Carsault, Andrew McLeod, Philippe Esling, Jérôme Nika, Eita Nakamura, Kazuyoshi Yoshii
In this paper, we postulate that this comes from the multi-scale structure of musical information and propose new architectures based on an iterative temporal aggregation of input labels.
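The idea of temporal aggregation of labels can be illustrated by pooling frame-wise annotations to a coarser time scale, e.g. by majority vote (a toy version of the aggregation principle, not the paper's architecture):

```python
import numpy as np

def aggregate_labels(labels, factor):
    """Majority-vote pooling of frame-wise integer labels to a coarser
    time scale, dropping any incomplete trailing chunk."""
    n = len(labels) // factor * factor
    chunks = np.array(labels[:n]).reshape(-1, factor)
    return [int(np.bincount(c).argmax()) for c in chunks]

# 12 analysis frames annotated with 3 chord labels (0, 1, 2)
frames = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2]
coarse = aggregate_labels(frames, factor=4)
```

Applying this iteratively at increasing factors yields a multi-scale view of the label sequence, which is the kind of structure the proposed architectures exploit.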
1 code implementation • 12 Nov 2019 • Tristan Carsault, Jérôme Nika, Philippe Esling
Recent research on Automatic Chord Extraction (ACE) has focused on improving models based on machine learning.
no code implementations • 4 Jul 2019 • Cyran Aouameur, Philippe Esling, Gaëtan Hadjeres
In this work, we introduce a system for real-time generation of drum sounds.
1 code implementation • Digital Audio Effects (DAFx) 2019 • Philippe Esling, Naotake Masuda, Adrien Bardet, Romeo Despres, Axel Chemla--Romeu-Santos
By using this formulation, we show that we can address simultaneously automatic parameter inference, macro-control learning and audio-based preset exploration within a single model.
3 code implementations • 12 Apr 2019 • Adrien Bitton, Philippe Esling, Antoine Caillon, Martin Fouilleul
Its training data subsets can be directly visualized in the 3D latent representation.
no code implementations • 19 Oct 2018 • Léopold Crestel, Philippe Esling, Lena Heng, Stephen McAdams
This article introduces the Projective Orchestral Database (POD), a collection of MIDI scores composed of pairs linking piano scores to their corresponding orchestrations.
no code implementations • ICLR 2019 • Adrien Bitton, Philippe Esling, Axel Chemla--Romeu-Santos
We define timbre transfer as applying parts of the auditory properties of a musical instrument onto another.
Sound • Audio and Speech Processing
1 code implementation • Conference 2018 • Philippe Esling, Axel Chemla--Romeu-Santos, Adrien Bitton
Based on this, we introduce a method for descriptor-based synthesis and show that we can control the descriptors of an instrument while keeping its timbre structure.
no code implementations • 5 Sep 2016 • Léopold Crestel, Philippe Esling
This paper introduces the first system for performing automatic orchestration based on a real-time piano input.