Audio Generation

15 papers with code · Audio

Audio generation (synthesis) is the task of generating raw audio such as speech.

Latest papers with code

DDSP: Differentiable Digital Signal Processing

ICLR 2020 magenta/ddsp

In this paper, we introduce the Differentiable Digital Signal Processing (DDSP) library, which enables direct integration of classic signal processing elements with deep learning methods.

AUDIO GENERATION

1,311
14 Jan 2020

Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion

NeurIPS 2019 joansj/blow

End-to-end models for raw audio generation are a challenge, especially if they have to work with non-parallel data, which is a desirable setup in many situations.

AUDIO GENERATION · VOICE CONVERSION

105
01 Dec 2019

Seq-U-Net: A One-Dimensional Causal U-Net for Efficient Sequence Modelling

14 Nov 2019 f90/Seq-U-Net

In comparison to TCN and WaveNet, our network consistently saves memory and computation time, with speed-ups for training and inference of over 4x in the audio generation experiment in particular, while achieving comparable performance in all tasks.

AUDIO GENERATION

29
14 Nov 2019

MelNet: A Generative Model for Audio in the Frequency Domain

ICLR 2020 fatchord/MelNet

Capturing high-level structure in audio waveforms is challenging because a single second of audio spans tens of thousands of timesteps.

AUDIO GENERATION · MUSIC GENERATION · SPEECH SYNTHESIS · TEXT-TO-SPEECH SYNTHESIS

209
04 Jun 2019

Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion

NeurIPS 2019 liusongxiang/StarGAN-Voice-Conversion

End-to-end models for raw audio generation are a challenge, especially if they have to work with non-parallel data, which is a desirable setup in many situations.

AUDIO GENERATION · VOICE CONVERSION

261
03 Jun 2019

Assisted Sound Sample Generation with Musical Conditioning in Adversarial Auto-Encoders

12 Apr 2019 acids-ircam/Expressive_WAE_FADER

Its training data subsets can directly be visualized in the 3D latent representation.

AUDIO GENERATION

4
12 Apr 2019

GANSynth: Adversarial Neural Audio Synthesis

ICLR 2019 tensorflow/magenta

Efficient audio synthesis is an inherently difficult machine learning task, as human perception is sensitive to both global structure and fine-scale waveform coherence.

AUDIO GENERATION

15,086
23 Feb 2019

Conditional WaveGAN

27 Sep 2018 acheketa/cwavegan

Generative models have been successfully used for image synthesis in recent years.

AUDIO GENERATION

99
27 Sep 2018

Smoothed Dilated Convolutions for Improved Dense Prediction

27 Aug 2018 divelab/dilated

Unlike existing models, which explore solutions by focusing on a block of cascaded dilated convolutional layers, our methods address the gridding artifacts by smoothing the dilated convolution itself.

AUDIO GENERATION · MACHINE TRANSLATION · OBJECT DETECTION · SEMANTIC SEGMENTATION

64
27 Aug 2018

Generative timbre spaces: regularizing variational auto-encoders with perceptual metrics

Conference 2018 acids-ircam/variational-timbre

Based on this, we introduce a method for descriptor-based synthesis and show that we can control the descriptors of an instrument while keeping its timbre structure.

AUDIO CLASSIFICATION · AUDIO GENERATION · MUSIC INFORMATION RETRIEVAL · MUSIC MODELING

25
22 May 2018