Audio Synthesis
52 papers with code • 1 benchmark • 2 datasets
Libraries
Use these libraries to find Audio Synthesis models and implementations.

Most implemented papers
An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling
Our results indicate that a simple convolutional architecture outperforms canonical recurrent networks such as LSTMs across a diverse range of tasks and datasets, while demonstrating longer effective memory.
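The core building block behind such convolutional sequence models is the causal dilated convolution: each output sample depends only on the current and past inputs, and dilation spaces the filter taps apart so the receptive field grows with depth. A minimal sketch in NumPy (the function name and weight convention are illustrative, not from the paper):

```python
import numpy as np

def causal_dilated_conv(x, w, dilation=1):
    """Causal dilated 1-D convolution: y[n] = sum_i w[i] * x[n - i*dilation].

    Left-padding with zeros ensures the output never looks at future samples.
    """
    k = len(w)
    pad = dilation * (k - 1)
    xp = np.concatenate([np.zeros(pad), x])
    y = np.zeros(len(x))
    for n in range(len(x)):
        # Gather the taps x[n], x[n-d], x[n-2d], ... from the padded signal.
        taps = xp[pad + n - dilation * np.arange(k)]
        y[n] = np.dot(w, taps)
    return y
```

Stacking such layers with dilations 1, 2, 4, ... doubles the receptive field per layer, which is how these networks achieve the long effective memory noted above.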
Tacotron: Towards End-to-End Speech Synthesis
A text-to-speech synthesis system typically consists of multiple stages, such as a text analysis frontend, an acoustic model and an audio synthesis module.
Adversarial Audio Synthesis
Audio signals are sampled at high temporal resolutions, and learning to synthesize audio requires capturing structure across a range of timescales.
Efficient Neural Audio Synthesis
The small number of weights in a Sparse WaveRNN makes it possible to sample high-fidelity audio on a mobile CPU in real time.
DiffWave: A Versatile Diffusion Model for Audio Synthesis
In this work, we propose DiffWave, a versatile diffusion probabilistic model for conditional and unconditional waveform generation.
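Diffusion models of this kind are trained against a fixed forward process that gradually adds Gaussian noise to a clean waveform. The closed-form corruption at step t can be sketched as follows (a generic DDPM-style forward process, not DiffWave's exact schedule or code):

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I).

    betas is the per-step noise schedule; abar_t is the cumulative
    product of (1 - beta) up to step t.
    """
    alphas = 1.0 - betas
    abar_t = np.cumprod(alphas)[t]
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * eps
    return x_t, eps
```

The network is then trained to predict `eps` from `x_t` and `t`; generation runs the process in reverse, starting from pure noise.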
Differentiable All-pole Filters for Time-varying Audio Systems
Infinite impulse response filters are an essential building block of many time-varying audio systems, such as audio effects and synthesisers.
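An all-pole filter is the simplest infinite impulse response structure: each output sample feeds back through the denominator coefficients, so a finite set of coefficients yields an infinitely long impulse response. A minimal direct-form sketch (illustrative; the paper's contribution is making such filters differentiable and time-varying, which this plain recursion does not show):

```python
def allpole_filter(x, a):
    """Direct-form all-pole IIR filter: y[n] = x[n] - sum_k a[k] * y[n-k].

    `a` holds the denominator coefficients a[1..K] (a[0] is assumed to be 1).
    """
    y = []
    for n, xn in enumerate(x):
        yn = xn
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                yn -= ak * y[n - k]  # feedback from past outputs
        y.append(yn)
    return y
```

For example, a single coefficient a = [-0.5] gives an exponentially decaying impulse response, the basic resonant behaviour exploited by audio effects and synthesisers.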
Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders
Generative models in vision have seen rapid progress due to algorithmic improvements and the availability of high-quality image datasets.
GANSynth: Adversarial Neural Audio Synthesis
Efficient audio synthesis is an inherently difficult machine learning task, as human perception is sensitive to both global structure and fine-scale waveform coherence.
CSDI: Conditional Score-based Diffusion Models for Probabilistic Time Series Imputation
In this paper, we propose Conditional Score-based Diffusion models for Imputation (CSDI), a novel time series imputation method that utilizes score-based diffusion models conditioned on observed data.
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
Despite recent progress in generative adversarial network (GAN)-based vocoders, where the model generates raw waveform conditioned on acoustic features, it is challenging to synthesize high-fidelity audio for numerous speakers across various recording environments.