Audio Synthesis

44 papers with code • 1 benchmark • 2 datasets

Most implemented papers

An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

locuslab/TCN 4 Mar 2018

Our results indicate that a simple convolutional architecture outperforms canonical recurrent networks such as LSTMs across a diverse range of tasks and datasets, while demonstrating longer effective memory.

Tacotron: Towards End-to-End Speech Synthesis

CorentinJ/Real-Time-Voice-Cloning 29 Mar 2017

A text-to-speech synthesis system typically consists of multiple stages, such as a text analysis frontend, an acoustic model and an audio synthesis module.

Adversarial Audio Synthesis

chrisdonahue/wavegan ICLR 2019

Audio signals are sampled at high temporal resolutions, and learning to synthesize audio requires capturing structure across a range of timescales.

Efficient Neural Audio Synthesis

CorentinJ/Real-Time-Voice-Cloning ICML 2018

The small number of weights in a Sparse WaveRNN makes it possible to sample high-fidelity audio on a mobile CPU in real time.

DiffWave: A Versatile Diffusion Model for Audio Synthesis

lmnt-com/diffwave ICLR 2021

In this work, we propose DiffWave, a versatile diffusion probabilistic model for conditional and unconditional waveform generation.

Differentiable All-pole Filters for Time-varying Audio Systems

yoyololicon/torchcomp 11 Apr 2024

Infinite impulse response filters are an essential building block of many time-varying audio systems, such as audio effects and synthesisers.

Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders

facebookresearch/SING ICML 2017

Generative models in vision have seen rapid progress due to algorithmic improvements and the availability of high-quality image datasets.

GANSynth: Adversarial Neural Audio Synthesis

tensorflow/magenta ICLR 2019

Efficient audio synthesis is an inherently difficult machine learning task, as human perception is sensitive to both global structure and fine-scale waveform coherence.

CSDI: Conditional Score-based Diffusion Models for Probabilistic Time Series Imputation

ermongroup/csdi NeurIPS 2021

In this paper, we propose Conditional Score-based Diffusion models for Imputation (CSDI), a novel time series imputation method that utilizes score-based diffusion models conditioned on observed data.

Deep Voice: Real-time Neural Text-to-Speech

NVIDIA/nv-wavenet ICML 2017

We present Deep Voice, a production-quality text-to-speech system constructed entirely from deep neural networks.