Audio Generation

12 papers with code · Audio

Audio generation (synthesis) is the task of generating raw audio such as speech.

Greatest papers with code

GANSynth: Adversarial Neural Audio Synthesis

ICLR 2019 tensorflow/magenta

Efficient audio synthesis is an inherently difficult machine learning task, as human perception is sensitive to both global structure and fine-scale waveform coherence.

AUDIO GENERATION

WaveNet: A Generative Model for Raw Audio

12 Sep 2016 maciejkula/spotlight

This paper introduces WaveNet, a deep neural network for generating raw audio waveforms.

AUDIO GENERATION · SPEECH SYNTHESIS

Generating Long Sequences with Sparse Transformers

Preprint 2019 openai/sparse_attention

Transformers are powerful sequence models, but require time and memory that grows quadratically with the sequence length.

AUDIO GENERATION · IMAGE GENERATION · LANGUAGE MODELLING
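The quadratic growth mentioned in the Sparse Transformers summary can be made concrete with a little arithmetic. The sketch below counts the entries of one dense self-attention score matrix for a few sequence lengths. The function names and the 4-byte entry size are illustrative assumptions, not anything taken from the paper or from openai/sparse_attention.

```python
def attention_matrix_entries(seq_len: int) -> int:
    """Entries in one dense self-attention score matrix (seq_len x seq_len)."""
    return seq_len * seq_len


def attention_matrix_bytes(seq_len: int, bytes_per_entry: int = 4) -> int:
    """Memory for that matrix, assuming a hypothetical 4 bytes per entry."""
    return attention_matrix_entries(seq_len) * bytes_per_entry


if __name__ == "__main__":
    # Doubling the sequence length quadruples the number of entries.
    for n in (1_024, 2_048, 4_096):
        print(n, attention_matrix_entries(n), attention_matrix_bytes(n))
```

This covers only a single attention matrix for one head and one layer; real models multiply the figure by the number of heads, layers, and batch elements.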

Adversarial Audio Synthesis

ICLR 2019 chrisdonahue/wavegan

Audio signals are sampled at high temporal resolutions, and learning to synthesize audio requires capturing structure across a range of timescales.

AUDIO GENERATION · IMAGE GENERATION

SampleRNN: An Unconditional End-to-End Neural Audio Generation Model

22 Dec 2016 soroushmehr/sampleRNN_ICLR2017

In this paper we propose a novel model for unconditional audio generation based on generating one audio sample at a time.

AUDIO GENERATION

Audio Super Resolution using Neural Networks

2 Aug 2017 kuleshov/audio-super-res

We introduce a new audio processing technique that increases the sampling rate of signals such as speech or music using deep convolutional neural networks.

AUDIO SUPER-RESOLUTION

MelNet: A Generative Model for Audio in the Frequency Domain

4 Jun 2019 fatchord/MelNet

Capturing high-level structure in audio waveforms is challenging because a single second of audio spans tens of thousands of timesteps.

AUDIO GENERATION · MUSIC GENERATION · SPEECH SYNTHESIS · TEXT-TO-SPEECH SYNTHESIS
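The "tens of thousands of timesteps" figure in the MelNet summary follows directly from the sampling rate. As a rough sketch, the helper below converts a duration into a sample count. The name `timesteps` and the two example rates (16 kHz and 44.1 kHz, both common in practice) are assumptions chosen for illustration; the summary itself does not state a rate.

```python
def timesteps(duration_seconds: float, sample_rate_hz: int) -> int:
    """Number of waveform samples in a clip of the given duration."""
    return int(round(duration_seconds * sample_rate_hz))


if __name__ == "__main__":
    # Assumed example rates: 16 kHz (common for speech), 44.1 kHz (CD audio).
    for rate in (16_000, 44_100):
        print(rate, timesteps(1.0, rate))
```

A model that generates one sample at a time therefore has to stay coherent over that many steps for every second of output.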

Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion

3 Jun 2019 liusongxiang/StarGAN-Voice-Conversion

End-to-end models for raw audio generation are a challenge, especially if they have to work with non-parallel data, which is a desirable setup in many situations.

AUDIO GENERATION · VOICE CONVERSION

Conditional WaveGAN

27 Sep 2018 acheketa/cwavegan

Generative models have been used successfully for image synthesis in recent years.

AUDIO GENERATION

Smoothed Dilated Convolutions for Improved Dense Prediction

27 Aug 2018 divelab/dilated

Unlike existing models, which explore solutions by focusing on a block of cascaded dilated convolutional layers, our methods address the gridding artifacts by smoothing the dilated convolution itself.

AUDIO GENERATION · MACHINE TRANSLATION · OBJECT DETECTION · SEMANTIC SEGMENTATION