Resynthesis
16 papers with code • 2 benchmarks • 2 datasets
Most implemented papers
textless-lib: a Library for Textless Spoken Language Processing
Textless spoken language processing research aims to extend the applicability of the standard NLP toolset to spoken language and to languages with few or no textual resources.
A Perceptual Measure for Evaluating the Resynthesis of Automatic Music Transcriptions
This study focuses on the perception of music performances when contextual factors, such as room acoustics and instrument, change.
Analysing Discrete Self Supervised Speech Representation for Spoken Language Modeling
Following an analysis of discrete self-supervised speech representations, we propose practical improvements to the discrete units used in GSLM.
Speaker-Independent Acoustic-to-Articulatory Speech Inversion
To build speech processing methods that can handle speech as naturally as humans, researchers have explored multiple ways of building an invertible mapping from speech to an interpretable space.
Weakly-supervised Contrastive Learning for Unsupervised Object Discovery
Unsupervised object discovery (UOD) refers to the task of discriminating the whole region of objects from the background within a scene without relying on labeled datasets, which benefits the task of bounding-box-level localization and pixel-level segmentation.
EmphAssess : a Prosodic Benchmark on Assessing Emphasis Transfer in Speech-to-Speech Models
We introduce EmphAssess, a prosodic benchmark designed to evaluate the capability of speech-to-speech models to encode and reproduce prosodic emphasis.