Audio Artifact Removal

Phase Shuffle

Introduced by Donahue et al. in Adversarial Audio Synthesis

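Phase Shuffle is a technique for removing the pitched noise artifacts that transposed convolutions introduce in audio generation models. These artifacts tend to occur at a fixed phase, which gives a discriminator an easy cue for rejecting generated audio. Phase shuffle is an operation with a single hyperparameter $n$: it shifts each layer's activations along the time axis by a random integer number of samples between $-n$ and $n$ before they are passed to the next layer.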
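The operation described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name phase_shuffle, the (batch, channels, time) layout, the uniform draw of the shift, the choice of one independent shift per example, and the use of reflection to fill the samples that the shift leaves empty are all assumptions made for this example.

```python
import numpy as np


def phase_shuffle(x, n, rng=None):
    """Shift each example in x along the time axis by a random offset in [-n, n].

    x   : array of shape (batch, channels, time) -- layout assumed for this sketch
    n   : maximum shift in samples (the phase shuffle hyperparameter)
    rng : optional numpy Generator, for reproducibility
    """
    if n == 0:
        return x.copy()
    rng = np.random.default_rng() if rng is None else rng
    batch, _, length = x.shape

    # Pad n samples on both sides of the time axis by reflection, so that
    # any shift in [-n, n] can be read out as a window of the padded signal.
    padded = np.pad(x, ((0, 0), (0, 0), (n, n)), mode="reflect")

    # One integer shift per example, drawn uniformly (an assumption here).
    shifts = rng.integers(-n, n + 1, size=batch)

    out = np.empty_like(x)
    for i, k in enumerate(shifts):
        # A positive k moves the content k samples later in time:
        # out[i, :, t] == x[i, :, t - k] wherever t - k is in range.
        start = n - k
        out[i] = padded[i, :, start:start + length]
    return out
```

In a discriminator, a function like this would be applied to the activations between convolutional layers during training. Setting n = 0 disables the shift and returns the activations unchanged.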

In the original application in WaveGAN, the authors apply phase shuffle only to the discriminator, because the latent vector already gives the generator a mechanism for manipulating the phase of the resulting waveform. Intuitively, phase shuffle makes the discriminator's job harder by requiring it to be invariant to the phase of the input waveform.

Source: Adversarial Audio Synthesis

Tasks

Task                           Papers   Share
Text to Speech                 9        23.68%
Speech Synthesis               4        10.53%
Image Generation               3        7.89%
Voice Conversion               3        7.89%
Translation                    2        5.26%
Audio Generation               2        5.26%
Singing Voice Synthesis        2        5.26%
Decoder                        2        5.26%
Speech-to-Speech Translation   1        2.63%


Categories