no code implementations • 10 Sep 2021 • Wentao Yu, Steffen Zeiler, Dorothea Kolossa
To address the inherent difficulties, we propose a new fusion strategy: a recurrent integration network is trained to fuse the state posteriors of multiple single-modality models, guided by a set of model-based and signal-based stream reliability measures.
no code implementations • 19 Apr 2021 • Wentao Yu, Steffen Zeiler, Dorothea Kolossa
While audio-visual speech recognition can significantly improve the recognition rate of end-to-end models in such poor conditions, it is not obvious how to best utilize any available information on acoustic and visual signal quality and reliability in these models.
no code implementations • 1 Mar 2021 • Benedikt Boenninghoff, Robert M. Nickel, Steffen Zeiler, Dorothea Kolossa
The detection of voiced speech, the estimation of the fundamental frequency, and the tracking of pitch values over time are crucial subtasks for a variety of speech processing techniques.
no code implementations • COLING 2020 • Benedikt Boenninghoff, Steffen Zeiler, Robert Nickel, Dorothea Kolossa
In this work, we propose a probabilistic autoencoding framework to deal with this supervised classification task.
no code implementations • 28 May 2020 • Benedikt Boenninghoff, Steffen Zeiler, Robert M. Nickel, Dorothea Kolossa
In this work, we extend the Gaussian model for the VAE to a Student-$t$ model, which allows independent control of the "heaviness" of the respective tails of the implied probability densities.
2 code implementations • 20 Aug 2019 • Benedikt Boenninghoff, Robert M. Nickel, Steffen Zeiler, Dorothea Kolossa
Authorship verification tries to answer the question of whether two documents with unknown authors were written by the same author.
no code implementations • 5 Aug 2019 • Lea Schönherr, Thorsten Eisenhofer, Steffen Zeiler, Thorsten Holz, Dorothea Kolossa
In this paper, we demonstrate the first algorithm that produces generic adversarial examples, which remain robust in an over-the-air attack that is not adapted to the specific environment.
Automatic Speech Recognition (ASR)
no code implementations • 16 Aug 2018 • Lea Schönherr, Katharina Kohls, Steffen Zeiler, Thorsten Holz, Dorothea Kolossa
We use this backpropagation to learn the degrees of freedom for the adversarial perturbation of the input signal, i.e., we apply a psychoacoustic model and manipulate the acoustic signal below the thresholds of human perception.
Cryptography and Security • Sound • Audio and Speech Processing