9 papers with code • 4 benchmarks • 2 datasets
AUDIO SUPER-RESOLUTION or speech bandwidth extension (Upsampling Ratio = 2)
In this paper, we address a sub-topic of the broad domain of audio enhancement, namely musical audio bandwidth extension.
In this work, we introduce NU-Wave, the first neural audio upsampling model to produce waveforms of sampling rate 48kHz from coarse 16kHz or 24kHz inputs, while prior works could generate only up to 16kHz.
Learning representations that accurately capture long-range dependencies in sequential inputs -- including text, audio, and genomic data -- is a key problem in deep learning.
Learning representations that accurately capture long-range dependencies in sequential inputs --- including text, audio, and genomic data --- is a key problem in deep learning.
TUNet: A Block-online Bandwidth Extension Model based on Transformers and Self-supervised Pretraining
We introduce a block-online variant of the temporal feature-wise linear modulation (TFiLM) model to achieve bandwidth extension.
To obtain a continuous representation of audio and enable super resolution for arbitrary scale factor, we propose a method of implicit neural representation, coined Local Implicit representation for Super resolution of Arbitrary scale (LISA).
In this paper, we propose a neural vocoder based speech super-resolution method (NVSR) that can handle a variety of input resolution and upsampling ratios.