no code implementations • LREC 2022 • Felix Burkhardt, Johannes Wagner, Hagen Wierstorf, Florian Eyben, Björn Schuller
We present advancements to a software tool called Nkululeko, which lets users perform (semi-)supervised machine learning experiments in the speaker characteristics domain.
no code implementations • DCLRL (LREC) 2022 • Felix Burkhardt, Florian Eyben, Björn Schuller
Speech emotion recognition has been a focus of research for several decades and has many applications.
no code implementations • LREC 2022 • Felix Burkhardt, Anabell Hacker, Uwe Reichel, Hagen Wierstorf, Florian Eyben, Björn Schuller
For several decades, emotional databases have been recorded by various laboratories.
no code implementations • 23 May 2023 • Thejan Rajapakshe, Rajib Rana, Sara Khalifa, Berrak Sisman, Björn Schuller
In this paper, we propose DARTS for a joint CNN and LSTM architecture for improving SER performance.
no code implementations • 27 Oct 2022 • Alice Baird, Panagiotis Tzirakis, Jeffrey A. Brooks, Christopher B. Gregory, Björn Schuller, Anton Batliner, Dacher Keltner, Alan Cowen
This is the Proceedings of the ACII Affective Vocal Bursts Workshop and Competition (A-VB).
no code implementations • 14 Jul 2022 • Alice Baird, Panagiotis Tzirakis, Gauthier Gidel, Marco Jiralerspong, Eilif B. Muller, Kory Mathewson, Björn Schuller, Erik Cambria, Dacher Keltner, Alan Cowen
The first, ExVo-MultiTask, requires participants to train a multi-task model to recognize expressed emotions and demographic traits from vocal bursts.
2 code implementations • 7 Jul 2022 • Alice Baird, Panagiotis Tzirakis, Jeffrey A. Brooks, Christopher B. Gregory, Björn Schuller, Anton Batliner, Dacher Keltner, Alan Cowen
The ACII Affective Vocal Bursts Workshop & Competition is focused on understanding multiple affective dimensions of vocal bursts: laughs, gasps, cries, screams, and many other non-linguistic vocalizations central to the expression of emotion and to human communication more generally.
1 code implementation • 26 Jun 2022 • Anna Hlédiková, Dominika Woszczyk, Alican Akman, Soteris Demetriou, Björn Schuller
In this work, we investigate data augmentation techniques for the task of AD detection and perform an empirical evaluation of the different approaches on two kinds of models for both the text and audio domains.
1 code implementation • 15 Jun 2022 • Rui Liu, Berrak Sisman, Björn Schuller, Guanglai Gao, Haizhou Li
In this paper, we propose a data-driven deep learning model, i.e., StrengthNet, to improve the generalization of emotion strength assessment for seen and unseen speech.
2 code implementations • 3 May 2022 • Alice Baird, Panagiotis Tzirakis, Gauthier Gidel, Marco Jiralerspong, Eilif B. Muller, Kory Mathewson, Björn Schuller, Erik Cambria, Dacher Keltner, Alan Cowen
ExVo 2022 includes three competition tracks using a large-scale dataset of 59,201 vocalizations from 1,702 speakers.
1 code implementation • 16 Sep 2021 • Sandra Ottl, Shahin Amiriparian, Maurice Gerczuk, Björn Schuller
Finally, a linear SVR is trained on this feature representation.
no code implementations • 4 May 2021 • Lukas Stappen, Alice Baird, Michelle Lienhart, Annalena Bätz, Björn Schuller
We investigate features extracted from these signals against various user engagement indicators including views, like/dislike ratio, as well as the sentiment of comments.
no code implementations • 19 Apr 2021 • Shuo Liu, Jing Han, Estela Laporta Puyal, Spyridon Kontaxis, Shaoxiong Sun, Patrick Locatelli, Judith Dineley, Florian B. Pokorny, Gloria Dalla Costa, Letizia Leocan, Ana Isabel Guerrero, Carlos Nos, Ana Zabalza, Per Soelberg Sørensen, Mathias Buron, Melinda Magyari, Yatharth Ranjan, Zulqarnain Rashid, Pauline Conde, Callum Stewart, Amos A Folarin, Richard JB Dobson, Raquel Bailón, Srinivasan Vairavan, Nicholas Cummins, Vaibhav A Narayan, Matthew Hotopf, Giancarlo Comi, Björn Schuller
This study investigates the potential of deep learning methods to identify individuals with suspected COVID-19 infection using remotely collected heart-rate data.
1 code implementation • 10 Mar 2021 • Maurice Gerczuk, Shahin Amiriparian, Sandra Ottl, Björn Schuller
The corpus is then utilised to create a novel framework for multi-corpus speech emotion recognition, namely EmoNet.
1 code implementation • 18 Feb 2021 • Decky Aspandi, Federico Sukno, Björn Schuller, Xavier Binefa
This paper addresses these shortcomings by proposing a novel model that efficiently extracts both spatial and temporal features of the data by means of its enhanced temporal modelling based on latent features.
no code implementations • 15 Jan 2021 • Lukas Stappen, Alice Baird, Lea Schumann, Björn Schuller
Truly real-life data presents a strong, but exciting challenge for sentiment and emotion research.
1 code implementation • 12 Jan 2021 • Ognjen Rudovic, Nicolas Tobis, Sebastian Kaltwang, Björn Schuller, Daniel Rueckert, Jeffrey F. Cohn, Rosalind W. Picard
A potential approach to tackling this is Federated Learning (FL), which enables multiple parties to collaboratively learn a shared prediction model by using parameters of locally trained models while keeping raw training data locally.
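The aggregation step described above, combining the parameters of locally trained models without sharing raw data, can be illustrated with a minimal sketch. It assumes weighted parameter averaging (often called federated averaging), which is one common choice and is not confirmed here as this paper's method. The function name `federated_average`, the parameter layout, and the example values are all hypothetical.

```python
from typing import Dict, List

# Hypothetical layout: each client sends a dict of named parameter vectors.
Params = Dict[str, List[float]]


def federated_average(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Average client parameters, weighting each client by its local sample count."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: Params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        averaged[name] = [
            sum(params[name][i] * size for params, size in zip(client_params, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Only parameters leave the clients; the raw training data stays local.
clients = [
    {"w": [1.0, 2.0], "b": [0.0]},  # client trained on 1 sample (illustrative)
    {"w": [3.0, 4.0], "b": [1.0]},  # client trained on 3 samples (illustrative)
]
global_params = federated_average(clients, client_sizes=[1, 3])
print(global_params)  # {'w': [2.5, 3.5], 'b': [0.75]}
```

In a full system this step would be repeated over many rounds, with the averaged parameters sent back to the clients for further local training.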
no code implementations • 18 May 2020 • Gauri Deshpande, Björn Schuller
Recently, there has been increased attention towards innovating, enhancing, building, and deploying applications of speech signal processing for providing assistance and relief to humankind from the Coronavirus (COVID-19) pandemic.
Computers and Society • Sound • Audio and Speech Processing
no code implementations • 13 Apr 2020 • Lukas Stappen, Fabian Brunn, Björn Schuller
Detecting hate speech, especially in low-resource languages, is a non-trivial challenge.
no code implementations • 5 Mar 2020 • Kazi Nazmul Haque, Rajib Rana, John H. L. Hansen, Björn Schuller
However, the model can become redundant if it is intended for a specific task.
no code implementations • 3 Feb 2020 • Decky Aspandi, Adria Mallol-Ragolta, Björn Schuller, Xavier Binefa
However, the use of latent features, which is feasible through adversarial learning, has not yet been widely explored.
1 code implementation • 16 Nov 2019 • Shuo Liu, Gil Keren, Björn Schuller
N-HANS is a Python toolkit for in-the-wild audio enhancement, including speech, music, and general audio denoising, separation, and selective noise or source suppression.
Sound • Audio and Speech Processing
1 code implementation • 18 Oct 2019 • Ali Girayhan Özbay, Arash Hamzehloo, Sylvain Laizet, Panagiotis Tzirakis, Georgios Rizos, Björn Schuller
The Poisson equation is commonly encountered in engineering, for instance in computational fluid dynamics (CFD) where it is needed to compute corrections to the pressure field to ensure the incompressibility of the velocity field.
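For readers unfamiliar with the problem, the sketch below solves a one-dimensional Poisson equation, $-u''(x) = f(x)$ with $u = 0$ at both boundaries, using classical Jacobi iteration on a finite-difference grid. This is a textbook baseline chosen for illustration only; it is not the method of the paper, and the function name `solve_poisson_1d`, the grid size, and the iteration count are assumptions.

```python
def solve_poisson_1d(f_values, h, iterations=2000):
    """Jacobi iteration for -u'' = f on a uniform grid with u = 0 at both ends.

    f_values holds f at the interior grid points; h is the grid spacing.
    """
    n = len(f_values)
    u = [0.0] * (n + 2)  # interior points plus the two boundary points
    for _ in range(iterations):
        new_u = u[:]
        for i in range(1, n + 1):
            # Rearranged central difference: -(u[i-1] - 2u[i] + u[i+1]) / h^2 = f[i]
            new_u[i] = 0.5 * (u[i - 1] + u[i + 1] + h * h * f_values[i - 1])
        u = new_u
    return u


# f = 1 on [0, 1]; the exact solution is u(x) = x (1 - x) / 2.
h = 0.1
u = solve_poisson_1d([1.0] * 9, h)
print(round(u[5], 6))  # value at x = 0.5, close to 0.125
```

Iterative solvers of this kind need many sweeps on fine grids, which is part of why the pressure-correction step tends to be a costly part of incompressible flow simulations.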
no code implementations • 30 Aug 2019 • Anton Batliner, Stefan Steidl, Florian Eyben, Björn Schuller
In this article, we study laughter found in child-robot interaction where it had not been prompted intentionally.
no code implementations • 23 Jul 2019 • Jing Han, Zixing Zhang, Zhao Ren, Björn Schuller
Motivated by this, we propose a novel crossmodal emotion embedding framework called EmoBed, which aims to leverage the knowledge from other auxiliary modalities to improve the performance of an emotion recognition system at hand.
no code implementations • 10 Jul 2019 • Fabien Ringeval, Björn Schuller, Michel Valstar, Nicholas Cummins, Roddy Cowie, Leili Tavabi, Maximilian Schmitt, Sina Alisamir, Shahin Amiriparian, Eva-Maria Messner, Siyang Song, Shuo Liu, Ziping Zhao, Adria Mallol-Ragolta, Zhao Ren, Mohammad Soleymani, Maja Pantic
The Audio/Visual Emotion Challenge and Workshop (AVEC 2019) "State-of-Mind, Detecting Depression with AI, and Cross-cultural Affect Recognition" is the ninth competition event aimed at the comparison of multimedia processing and machine learning methods for automatic audiovisual health and emotion analysis, with all participants competing strictly under the same conditions.
no code implementations • 24 Jun 2019 • Shuo Liu, Gil Keren, Björn Schuller
We present a novel source separation model to decompose a single-channel speech signal into two speech segments belonging to two different speakers.
no code implementations • 15 Apr 2019 • Panagiotis Tzirakis, Athanasios Papaioannou, Alexander Lattas, Michail Tarasiou, Björn Schuller, Stefanos Zafeiriou
Synthesising 3D facial motion from speech is a crucial problem manifesting in a multitude of applications such as computer games and movies.
no code implementations • 13 Mar 2019 • Thomas Wiest, Nicholas Cummins, Alice Baird, Simone Hantke, Judith Dineley, Björn Schuller
Generative Adversarial Networks (GANs) have become exceedingly popular in a wide range of data-driven research fields, due in part to their success in image generation.
1 code implementation • 14 Feb 2019 • Vedhas Pandit, Björn Schuller
Despite its drawbacks, $MSE$ is one of the most popular performance metrics (and loss functions), along with, more recently, $\rho_c$ in many sequence prediction challenges.
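To make the two metrics concrete, here is a minimal sketch that computes both on a toy sequence. It assumes that $\rho_c$ denotes Lin's concordance correlation coefficient, $\rho_c = \frac{2\sigma_{xy}}{\sigma_x^2 + \sigma_y^2 + (\mu_x - \mu_y)^2}$, computed with population (biased) variances; the function names and the example values are hypothetical.

```python
from statistics import fmean


def mse(y_true, y_pred):
    """Mean squared error between two equally long sequences."""
    return fmean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def ccc(y_true, y_pred):
    """Concordance correlation coefficient (assumed meaning of rho_c)."""
    mean_t, mean_p = fmean(y_true), fmean(y_pred)
    var_t = fmean([(t - mean_t) ** 2 for t in y_true])
    var_p = fmean([(p - mean_p) ** 2 for p in y_pred])
    cov = fmean([(t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred)])
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)


gold = [1.0, 2.0, 3.0, 4.0]
shifted = [2.0, 3.0, 4.0, 5.0]  # same shape as gold, constant offset of 1

print(mse(gold, shifted))  # 1.0
print(ccc(gold, shifted))  # about 0.714, i.e. 5/7
print(ccc(gold, gold))     # 1.0
```

The shifted prediction is perfectly linearly correlated with the gold standard, yet its concordance is well below 1 because the coefficient also penalises the offset in the means.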
no code implementations • 26 Oct 2018 • Gil Keren, Jing Han, Björn Schuller
We address the problem of speech enhancement generalisation to unseen environments by performing two manipulations.
no code implementations • 21 Sep 2018 • Jing Han, Zixing Zhang, Nicholas Cummins, Björn Schuller
Over the past few years, adversarial training has become an extremely active research topic and has been successfully applied to various Artificial Intelligence (AI) domains.
1 code implementation • 3 May 2018 • Siyang Song, Shuimei Zhang, Björn Schuller, Linlin Shen, Michel Valstar
The performance of speaker-related systems usually degrades heavily in practical applications largely due to the presence of background noise.
no code implementations • 3 May 2018 • Andreas Triantafyllopoulos, Hesam Sagha, Florian Eyben, Björn Schuller
This paper describes audEERING's submissions as well as additional evaluations for the One-Minute-Gradual (OMG) emotion recognition challenge.
1 code implementation • 29 Apr 2018 • Dimitrios Kollias, Panagiotis Tzirakis, Mihalis A. Nicolaou, Athanasios Papaioannou, Guoying Zhao, Björn Schuller, Irene Kotsia, Stefanos Zafeiriou
Automatic understanding of human affect using visual signals is of great importance in everyday human-machine interactions.
1 code implementation • 26 Mar 2018 • Gil Keren, Nicholas Cummins, Björn Schuller
Despite their obvious aforementioned advantage in relation to accuracy, contemporary neural networks can, generally, be regarded as poorly calibrated and as such do not produce reliable output probability estimates.
1 code implementation • 7 Feb 2018 • Johannes Wagner, Tobias Baur, Yue Zhang, Michel F. Valstar, Björn Schuller, Elisabeth André
Scientific disciplines, such as Behavioural Psychology, Anthropology and recently Social Signal Processing are concerned with the systematic exploration of human behaviour.
no code implementations • 10 Jan 2018 • Gil Keren, Maximilian Schmitt, Thomas Kehrenberg, Björn Schuller
Neural network models that are not conditioned on class identities were shown to facilitate knowledge transfer between classes and to be well-suited for one-shot learning tasks.
no code implementations • ICLR 2018 • Gil Keren, Sivan Sabato, Björn Schuller
In contrast, there are known loss functions, as well as novel batch loss functions that we propose, which are aligned with this principle.
1 code implementation • 12 Dec 2017 • Michael Freitag, Shahin Amiriparian, Sergey Pugachevskiy, Nicholas Cummins, Björn Schuller
auDeep is a Python toolkit for deep unsupervised representation learning from acoustic data.
Sound • Audio and Speech Processing
no code implementations • 27 Jul 2017 • Zixing Zhang, Ding Liu, Jing Han, Kun Qian, Björn Schuller
Extensive evaluation on a large acoustic event database is performed, and the empirical results demonstrate that the learnt audio sequence representation yields a significant performance improvement over other state-of-the-art hand-crafted sequence features for AEC.
no code implementations • 30 May 2017 • Zixing Zhang, Jürgen Geiger, Jouni Pohjalainen, Amr El-Desoky Mousa, Wenyu Jin, Björn Schuller
Eliminating the negative effect of non-stationary environmental noise is a long-standing research topic for automatic speech recognition that still remains an important challenge.
Automatic Speech Recognition (ASR)
2 code implementations • 29 May 2017 • Gil Keren, Sivan Sabato, Björn Schuller
Our experiments show that indeed in almost all cases, losses that are aligned with the Principle of Logit Separation obtain at least 20% relative accuracy improvement in the SLC task compared to losses that are not aligned with it, and sometimes considerably more.
2 code implementations • 27 Apr 2017 • Panagiotis Tzirakis, George Trigeorgis, Mihalis A. Nicolaou, Björn Schuller, Stefanos Zafeiriou
The system is then trained in an end-to-end fashion where, by also taking advantage of the correlations of each of the streams, we manage to significantly outperform the traditional approaches based on auditory and visual handcrafted features for the prediction of spontaneous and natural emotions on the RECOLA database of the AVEC 2016 research challenge on emotion recognition.
no code implementations • CVPR 2017 • Robert Walecki, Ognjen Rudovic, Vladimir Pavlovic, Björn Schuller, Maja Pantic
The goal of this paper is to model these structures and estimate complex feature representations simultaneously by combining conditional random field (CRF) encoded AU dependencies with deep learning.
no code implementations • 23 Nov 2016 • Gil Keren, Sivan Sabato, Björn Schuller
We propose incorporating this idea of tunable sensitivity for hard examples in neural network learning, using a new generalization of the cross-entropy gradient step, which can be used in place of the gradient in any gradient-based training method.
3 code implementations • 18 Feb 2016 • Gil Keren, Björn Schuller
Traditional convolutional layers extract features from patches of data by applying a non-linearity on an affine function of the input.
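The description above can be sketched for the one-dimensional case: each output feature is a non-linearity applied to an affine function (weighted sum plus bias) of one input patch. The sketch assumes ReLU as the non-linearity, a single filter, stride 1, and no padding; the function name `conv1d_relu` and the example values are hypothetical, and this illustrates only the traditional layer, not the alternative the paper goes on to propose.

```python
def conv1d_relu(signal, kernel, bias=0.0):
    """One traditional convolutional filter: ReLU(w . patch + b) for each patch.

    As is usual in deep learning, the kernel is not flipped (cross-correlation).
    """
    k = len(kernel)
    outputs = []
    for start in range(len(signal) - k + 1):
        patch = signal[start:start + k]
        affine = sum(w * x for w, x in zip(kernel, patch)) + bias  # affine function of the patch
        outputs.append(max(0.0, affine))  # ReLU non-linearity (assumed choice)
    return outputs


features = conv1d_relu([1.0, 2.0, 0.0, -1.0], kernel=[1.0, -1.0], bias=0.5)
print(features)  # [0.0, 2.5, 1.5]
```

The same weights and bias are reused for every patch, which is what distinguishes a convolutional layer from a fully connected one.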
no code implementations • 22 Nov 2015 • Irman Abdić, Lex Fridman, Erik Marchi, Daniel E. Brown, William Angell, Bryan Reimer, Björn Schuller
We introduce a recurrent neural network architecture for automated road surface wetness detection from audio of tire-surface interaction.
no code implementations • 15 Dec 2014 • Felix Weninger, Björn Schuller, Florian Eyben, Martin Wöllmer, Gerhard Rigoll
Transcription of broadcast news is an interesting and challenging application for large-vocabulary continuous speech recognition (LVCSR).
no code implementations • 11 Jun 2014 • Jürgen T. Geiger, Maximilian Kneißl, Björn Schuller, Gerhard Rigoll
The goal of the system is to analyse sounds emitted by walking persons (mostly the step sounds) and identify those persons.
no code implementations • 24 Mar 2014 • Björn Schuller, Erik Marchi, Simon Baron-Cohen, Helen O'Reilly, Delia Pigat, Peter Robinson, Ian Daves
Individuals with Autism Spectrum Conditions (ASC) have marked difficulties using verbal and non-verbal communication for social interaction.
no code implementations • 22 Jul 2013 • Lucas Paletta, Laurent Itti, Björn Schuller, Fang Fang
This volume contains the papers accepted at the 6th International Symposium on Attention in Cognitive Systems (ISACS 2013), held in Beijing, August 5, 2013.