Audio Compression

Most implemented papers

High-Fidelity Audio Compression with Improved RVQGAN

descriptinc/descript-audio-codec NeurIPS 2023

Language models have been successfully used to model natural signals, such as images, speech, and music.

High Fidelity Neural Audio Compression

facebookresearch/encodec 24 Oct 2022

We introduce a state-of-the-art, real-time, high-fidelity audio codec leveraging neural networks.

UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021

yzyouzhang/asvspoof2021_air 26 Jul 2021

Unlike previous ASVspoof challenges, this year's LA task presents codec and transmission channel variability, while the new DF task presents general audio compression.

Bayesian Reconstruction of Fourier Pairs

lerkoah/brfp 9 Nov 2020

Our aim is to address the lack of a principled treatment of data acquired indistinctly in the temporal and frequency domains in a way that is robust to missing or noisy observations, and that at the same time models uncertainty effectively.

MP3net: coherent, minute-long music generation from raw audio with a simple convolutional GAN

korneelvdbroek/mp3net 12 Jan 2021

We present a deep convolutional GAN which leverages techniques from MP3/Vorbis audio compression to produce long, high-quality audio samples with long-range coherence.

ClefNet: Recurrent Autoencoders with Dynamic Time Warping for Near-Lossless Music Compression and Minimal-Latency Transmission

rvignav/ClefNet 15 Mar 2021

The onset of coronavirus disease 2019 (COVID-19), an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has sparked unprecedented change.

Audio Spectral Enhancement: Leveraging Autoencoders for Low Latency Reconstruction of Long, Lossy Audio Sequences

darshandeshpande/audio-spectral-enhancement 8 Aug 2021

With active research in audio compression techniques yielding substantial breakthroughs, spectral reconstruction of low-quality audio waves remains a less explored topic.

Compression with Bayesian Implicit Neural Representations

cambridge-mlg/combiner NeurIPS 2023

Many common types of data can be represented as functions that map coordinates to signal values, such as pixel locations to RGB values in the case of an image.

DC CoMix TTS: An End-to-End Expressive TTS with Discrete Code Collaborated with Mixer

lakahaga/dc-comix-tts 31 May 2023

We demonstrate that the reference encoder learns better speaker-independent prosody when discrete code is utilized as input in the experiments.

Quantifying Spatial Audio Quality Impairment

karnwatcharasupat/spauq 13 Jun 2023

Spatial audio quality is a highly multifaceted concept, with many interactions between environmental, geometrical, anatomical, psychological, and contextual considerations.