Search Results for author: Mikolaj Kegler

Found 9 papers, 7 papers with code

CATSE: A Context-Aware Framework for Causal Target Sound Extraction

no code implementations • 21 Mar 2024 • Shrishail Baligar, Mikolaj Kegler, Bryce Irvin, Marko Stamenovic, Shawn Newsam

First, we explore the utility of context by providing the TSE model with oracle information about what sound classes make up the input mixture, where the objective of the model is to extract one or more sources of interest indicated by the user.

Target Sound Extraction

Self-Supervised Learning for Speech Enhancement through Synthesis

1 code implementation • 4 Nov 2022 • Bryce Irvin, Marko Stamenovic, Mikolaj Kegler, Li-Chia Yang

Modern speech enhancement (SE) networks typically implement noise suppression through time-frequency masking, latent representation masking, or discriminative signal prediction.

Denoising, Self-Supervised Learning, +2

BYOL-S: Learning Self-supervised Speech Representations by Bootstrapping

1 code implementation • 24 Jun 2022 • Gasser Elbanna, Neil Scheidwasser-Clow, Mikolaj Kegler, Pierre Beckmann, Karl El Hajal, Milos Cernak

Our results indicate that the hybrid model with a convolutional transformer as the encoder yields superior performance in most HEAR challenge tasks.

Scene Classification, Self-Supervised Learning

SERAB: A multi-lingual benchmark for speech emotion recognition

2 code implementations • 7 Oct 2021 • Neil Scheidwasser-Clow, Mikolaj Kegler, Pierre Beckmann, Milos Cernak

To facilitate the process, here, we present the Speech Emotion Recognition Adaptation Benchmark (SERAB), a framework for evaluating the performance and generalization capacity of different approaches for utterance-level SER.

Benchmarking, Speech Emotion Recognition

Deep speech inpainting of time-frequency masks

2 code implementations • 20 Oct 2019 • Mikolaj Kegler, Pierre Beckmann, Milos Cernak

To address these limitations, here we propose an end-to-end framework for speech inpainting, the context-based retrieval of missing or severely distorted parts of time-frequency representation of speech.

Retrieval
