1 code implementation • 5 Apr 2024 • Krishna Subramani, Paris Smaragdis, Takuya Higuchi, Mehrez Souden
Non-negative Matrix Factorization (NMF) is a powerful technique for analyzing regularly-sampled data, i.e., data that can be stored in a matrix.
no code implementations • 25 Sep 2023 • Krishna Subramani, Jean-Marc Valin, Jan Buethe, Paris Smaragdis, Mike Goodwin
Pitch estimation is an essential step of many speech processing algorithms, including speech coding, synthesis, and enhancement.
1 code implementation • 3 May 2023 • Zhepei Wang, Cem Subakan, Krishna Subramani, Junkai Wu, Tiago Tavares, Fabio Ayres, Paris Smaragdis
In this paper, we study unsupervised approaches to improve the learning framework of such representations with unpaired text and audio.
1 code implementation • 23 Feb 2022 • Krishna Subramani, Jean-Marc Valin, Umut Isik, Paris Smaragdis, Arvindh Krishnaswamy
Neural vocoders have recently demonstrated high-quality speech synthesis, but typically require high computational complexity.
1 code implementation • 6 May 2021 • Krishna Subramani, Paris Smaragdis
As a consequence, most audio machine learning models are designed to process fixed-size vector inputs, which often prohibits the repurposing of learned models on audio with different sampling rates or alternative representations.
1 code implementation • 28 Oct 2020 • An Zhao, Krishna Subramani, Paris Smaragdis
The Short-Time Fourier Transform (STFT) has been a staple of signal processing, often being the first step for many audio tasks.
no code implementations • 19 Aug 2020 • Krishna Subramani, Preeti Rao
Generative Models for Audio Synthesis have been gaining momentum in the last few years.
1 code implementation • 30 Mar 2020 • Krishna Subramani, Preeti Rao, Alexandre D'Hooge
With the advent of data-driven statistical modeling and abundant computing power, researchers are turning increasingly to deep learning for audio synthesis.
no code implementations • 15 Nov 2019 • Krishna Subramani, Alexandre D'Hooge, Preeti Rao
We use a parametric representation of audio to train a generative model, with the aim of obtaining more flexible control over the generated sound.