We propose autoencoding speaker conversion for training data augmentation in automatic speech translation. This technique directly transforms an audio sequence, resulting in audio synthesized to resemble another speaker's voice... (read more)
PDFMETHOD | TYPE | |
---|---|---|
🤖 No Methods Found | Help the community by adding them if they're not listed; e.g. Deep Residual Learning for Image Recognition uses ResNet |