Audio Super-Resolution

13 papers with code • 4 benchmarks • 3 datasets

Audio super-resolution, also known as speech bandwidth extension, is the task of reconstructing the missing high-frequency content of a low-sampling-rate signal. A common benchmark setting uses an upsampling ratio of 2, e.g. predicting a 16 kHz waveform from an 8 kHz input.
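
The ratio-2 setting can be made concrete with a small, self-contained sketch (illustrative only; the sample rates, the 440 Hz and 6 kHz test tones, and the use of scipy's resample_poly are assumptions, not taken from any listed paper): the low-resolution input is obtained by decimating a high-resolution waveform, and the model's job is to restore the upper band that decimation removes.

```python
# Minimal sketch of the r = 2 bandwidth-extension setup: the low-resolution input is
# obtained by decimating the original waveform, and the model must recover the
# discarded upper half of the spectrum.
import numpy as np
from scipy.signal import resample_poly

sr_high = 16_000                      # target sampling rate (Hz)
r = 2                                 # upsampling ratio used by the benchmark
sr_low = sr_high // r                 # 8 kHz input

t = np.arange(sr_high) / sr_high      # 1 s of toy audio
x_high = np.sin(2 * np.pi * 440 * t) + 0.3 * np.sin(2 * np.pi * 6000 * t)

x_low = resample_poly(x_high, up=1, down=r)     # 8 kHz "observed" signal
x_input = resample_poly(x_low, up=r, down=1)    # naive upsampled baseline fed to a model

# A super-resolution model is trained to map x_input (or x_low) back to x_high;
# the 6 kHz component lies above the 4 kHz Nyquist limit of the 8 kHz signal,
# so it is exactly the content the naive baseline cannot restore.
```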

Most implemented papers

Audio Super Resolution using Neural Networks

kuleshov/audio-super-res 2 Aug 2017

We introduce a new audio processing technique that increases the sampling rate of signals such as speech or music using deep convolutional neural networks.
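
As a rough illustration of the idea (a hedged sketch, not the paper's architecture, which is a much deeper U-Net-style network with subpixel upsampling layers), a waveform-to-waveform model can be as simple as a stack of 1-D convolutions followed by a learned upsampling layer:

```python
# A toy 1-D convolutional upsampler for waveform-to-waveform audio super-resolution.
import torch
import torch.nn as nn

class TinyAudioSR(nn.Module):
    def __init__(self, channels: int = 32, ratio: int = 2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=9, padding=4),
            nn.ReLU(),
            # learned upsampling: with ratio = 2 this exactly doubles the temporal length
            nn.ConvTranspose1d(channels, 1, kernel_size=2 * ratio, stride=ratio, padding=ratio // 2),
        )

    def forward(self, x_low: torch.Tensor) -> torch.Tensor:
        # x_low: (batch, 1, T) low-resolution waveform -> (batch, 1, ratio * T)
        return self.net(x_low)

model = TinyAudioSR()
y = model(torch.randn(4, 1, 4000))    # 4 clips of 0.5 s at 8 kHz
print(y.shape)                        # torch.Size([4, 1, 8000]), i.e. 0.5 s at 16 kHz
```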

NU-Wave 2: A General Neural Audio Upsampling Model for Various Sampling Rates

mindslab-ai/nuwave2 17 Jun 2022

Conventionally, audio super-resolution models fix the initial and target sampling rates, which necessitates training a separate model for each pair of sampling rates.

NU-Wave: A Diffusion Probabilistic Model for Neural Audio Upsampling

mindslab-ai/nuwave 6 Apr 2021

In this work, we introduce NU-Wave, the first neural audio upsampling model to produce waveforms of sampling rate 48kHz from coarse 16kHz or 24kHz inputs, while prior works could generate only up to 16kHz.

On Filter Generalization for Music Bandwidth Extension Using Deep Neural Networks

serkansulun/deep-music-enhancer 14 Nov 2020

In this paper, we address a sub-topic of the broad domain of audio enhancement, namely musical audio bandwidth extension.

CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement

ruizhecao96/cmgan 22 Sep 2022

Convolution-augmented transformers (Conformers) have recently been proposed for various speech-domain applications, such as automatic speech recognition (ASR) and speech separation, as they can capture both local and global dependencies.
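
The local/global split can be seen in a stripped-down Conformer-style block (a hedged sketch; the real block also uses macaron feed-forward layers, GLU gating, and relative positional encoding): self-attention mixes information across all frames, while a depthwise convolution refines local neighbourhoods.

```python
# A minimal Conformer-style block: global context via self-attention,
# local context via a depthwise convolution module.
import torch
import torch.nn as nn

class MiniConformerBlock(nn.Module):
    def __init__(self, dim: int = 64, heads: int = 4, conv_kernel: int = 15):
        super().__init__()
        self.attn_norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.conv_norm = nn.LayerNorm(dim)
        self.conv = nn.Sequential(
            nn.Conv1d(dim, dim, kernel_size=conv_kernel, padding=conv_kernel // 2, groups=dim),
            nn.BatchNorm1d(dim),
            nn.SiLU(),
            nn.Conv1d(dim, dim, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, frames, dim)
        h = self.attn_norm(x)
        x = x + self.attn(h, h, h, need_weights=False)[0]   # global dependencies
        h = self.conv_norm(x).transpose(1, 2)                # (batch, dim, frames) for Conv1d
        x = x + self.conv(h).transpose(1, 2)                 # local dependencies
        return x

x = torch.randn(2, 100, 64)           # 2 utterances, 100 feature frames each
print(MiniConformerBlock()(x).shape)  # torch.Size([2, 100, 64])
```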

Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations

kuleshov/audio-super-res • leolya/Audio-Super-Resolution-Tensorflow2.0-TFiLM NeurIPS 2019

Learning representations that accurately capture long-range dependencies in sequential inputs -- including text, audio, and genomic data -- is a key problem in deep learning.
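
A hedged sketch of the TFiLM idea (the block size, pooling choice, and layer sizes here are illustrative): the feature map is pooled into blocks, an RNN runs over the block summaries to gather long-range context, and each block is then rescaled and shifted feature-wise.

```python
# A simplified Temporal FiLM (TFiLM) layer: pool the feature map into blocks,
# run an LSTM over the block summaries, and modulate each block with the
# resulting per-channel scale and shift.
import torch
import torch.nn as nn

class TFiLM(nn.Module):
    def __init__(self, channels: int, block_size: int):
        super().__init__()
        self.block_size = block_size
        self.rnn = nn.LSTM(input_size=channels, hidden_size=channels, batch_first=True)
        self.to_scale_shift = nn.Linear(channels, 2 * channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, T), with T divisible by block_size
        b, c, t = x.shape
        blocks = x.view(b, c, t // self.block_size, self.block_size)

        summary = blocks.max(dim=-1).values.transpose(1, 2)    # (batch, n_blocks, channels)
        rnn_out, _ = self.rnn(summary)                         # long-range context across blocks
        scale, shift = self.to_scale_shift(rnn_out).chunk(2, dim=-1)

        # broadcast the per-block modulation over every sample inside the block
        scale = scale.transpose(1, 2).unsqueeze(-1)            # (batch, channels, n_blocks, 1)
        shift = shift.transpose(1, 2).unsqueeze(-1)
        return (blocks * scale + shift).view(b, c, t)

x = torch.randn(2, 32, 1024)
print(TFiLM(channels=32, block_size=128)(x).shape)   # torch.Size([2, 32, 1024])
```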

Self-Attention for Audio Super-Resolution

ncarraz/AFILM 26 Aug 2021

Convolutions operate only locally, thus failing to model global interactions.
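
In the spirit of that argument, the recurrent pooling in the TFiLM sketch above can be swapped for multi-head self-attention so that every block interacts with every other block in a single step (again a hedged sketch, not the paper's exact layer):

```python
# Attention-based feature-wise modulation over pooled blocks.
import torch
import torch.nn as nn

class AttentionFiLM(nn.Module):
    def __init__(self, channels: int, block_size: int, heads: int = 4):
        super().__init__()
        self.block_size = block_size
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.to_scale = nn.Linear(channels, channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, T), with T divisible by block_size
        b, c, t = x.shape
        blocks = x.view(b, c, t // self.block_size, self.block_size)
        summary = blocks.max(dim=-1).values.transpose(1, 2)    # (batch, n_blocks, channels)
        context, _ = self.attn(summary, summary, summary)      # global block-to-block interactions
        scale = self.to_scale(context).transpose(1, 2).unsqueeze(-1)
        return (blocks * scale).view(b, c, t)

x = torch.randn(2, 32, 1024)
print(AttentionFiLM(channels=32, block_size=128)(x).shape)   # torch.Size([2, 32, 1024])
```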

TUNet: A Block-online Bandwidth Extension Model based on Transformers and Self-supervised Pretraining

nxtproduct/tunet 26 Oct 2021

We introduce a block-online variant of the temporal feature-wise linear modulation (TFiLM) model to achieve bandwidth extension.

Learning Continuous Representation of Audio for Arbitrary Scale Super Resolution

ml-postech/lisa 30 Oct 2021

To obtain a continuous representation of audio and enable super resolution for arbitrary scale factor, we propose a method of implicit neural representation, coined Local Implicit representation for Super resolution of Arbitrary scale (LISA).
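
A hedged sketch of the continuous-representation idea (the latent size, MLP width, and the [0, 1) chunk coordinates are illustrative, not the paper's design): a decoder maps a local latent code plus a continuous time coordinate to an amplitude, so the output sampling rate is simply a choice of how densely to query.

```python
# A local-implicit decoder: (local latent, continuous time coordinate) -> amplitude.
import torch
import torch.nn as nn

class LocalImplicitDecoder(nn.Module):
    def __init__(self, latent_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(latent_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, latent: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # latent: (batch, latent_dim) local code from an encoder over the low-res chunk
        # t:      (batch, n_queries) continuous time coordinates in [0, 1) within the chunk
        n = t.shape[1]
        feats = torch.cat([latent.unsqueeze(1).expand(-1, n, -1), t.unsqueeze(-1)], dim=-1)
        return self.mlp(feats).squeeze(-1)                     # (batch, n_queries) amplitudes

decoder = LocalImplicitDecoder()
z = torch.randn(2, 64)                                         # one local latent per chunk
queries_2x = torch.linspace(0, 1, 32).expand(2, -1)            # query twice as densely for 2x SR
print(decoder(z, queries_2x).shape)                            # torch.Size([2, 32])
```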