Search Results for author: Björn Schuller

Found 53 papers, 19 papers with code

Nkululeko: A Tool For Rapid Speaker Characteristics Detection

no code implementations LREC 2022 Felix Burkhardt, Johannes Wagner, Hagen Wierstorf, Florian Eyben, Björn Schuller

We present advancements with a software tool called Nkululeko that lets users perform (semi-)supervised machine learning experiments in the speaker characteristics domain.

Emotion Classification regression

The ACII 2022 Affective Vocal Bursts Workshop & Competition: Understanding a critically understudied modality of emotional expression

2 code implementations 7 Jul 2022 Alice Baird, Panagiotis Tzirakis, Jeffrey A. Brooks, Christopher B. Gregory, Björn Schuller, Anton Batliner, Dacher Keltner, Alan Cowen

The ACII Affective Vocal Bursts Workshop & Competition is focused on understanding multiple affective dimensions of vocal bursts: laughs, gasps, cries, screams, and many other non-linguistic vocalizations central to the expression of emotion and to human communication more generally.

Cultural Vocal Bursts Intensity Prediction

Data Augmentation for Dementia Detection in Spoken Language

1 code implementation 26 Jun 2022 Anna Hlédiková, Dominika Woszczyk, Alican Akman, Soteris Demetriou, Björn Schuller

In this work, we investigate data augmentation techniques for the task of AD detection and perform an empirical evaluation of the different approaches on two kinds of models for both the text and audio domains.

Data Augmentation

Accurate Emotion Strength Assessment for Seen and Unseen Speech Based on Data-Driven Deep Learning

1 code implementation 15 Jun 2022 Rui Liu, Berrak Sisman, Björn Schuller, Guanglai Gao, Haizhou Li

In this paper, we propose a data-driven deep learning model, i.e., StrengthNet, to improve the generalization of emotion strength assessment for seen and unseen speech.

Emotion Classification Multi-Task Learning +1

An Estimation of Online Video User Engagement from Features of Continuous Emotions

no code implementations 4 May 2021 Lukas Stappen, Alice Baird, Michelle Lienhart, Annalena Bätz, Björn Schuller

We investigate features extracted from these signals against various user engagement indicators including views, like/dislike ratio, as well as the sentiment of comments.

Time Series Analysis

An Enhanced Adversarial Network with Combined Latent Features for Spatio-Temporal Facial Affect Estimation in the Wild

1 code implementation 18 Feb 2021 Decky Aspandi, Federico Sukno, Björn Schuller, Xavier Binefa

This paper addresses these shortcomings by proposing a novel model that efficiently extracts both spatial and temporal features of the data by means of its enhanced temporal modelling based on latent features.

Personalized Federated Deep Learning for Pain Estimation From Face Images

1 code implementation 12 Jan 2021 Ognjen Rudovic, Nicolas Tobis, Sebastian Kaltwang, Björn Schuller, Daniel Rueckert, Jeffrey F. Cohn, Rosalind W. Picard

A potential approach to tackling this is Federated Learning (FL), which enables multiple parties to collaboratively learn a shared prediction model by using parameters of locally trained models while keeping raw training data locally.

Federated Learning

An Overview on Audio, Signal, Speech, & Language Processing for COVID-19

no code implementations 18 May 2020 Gauri Deshpande, Björn Schuller

Recently, there has been increased attention towards innovating, enhancing, building, and deploying applications of speech signal processing to provide assistance and relief to humankind from the Coronavirus (COVID-19) pandemic.

Computers and Society Sound Audio and Speech Processing

Adversarial-based neural networks for affect estimations in the wild

no code implementations 3 Feb 2020 Decky Aspandi, Adria Mallol-Ragolta, Björn Schuller, Xavier Binefa

However, the use of latent features, which is feasible through adversarial learning, has not yet been widely explored.

N-HANS: Introducing the Augsburg Neuro-Holistic Audio-eNhancement System

1 code implementation 16 Nov 2019 Shuo Liu, Gil Keren, Björn Schuller

N-HANS is a Python toolkit for in-the-wild audio enhancement, including speech, music, and general audio denoising, separation, and selective noise or source suppression.

Sound Audio and Speech Processing

Poisson CNN: Convolutional neural networks for the solution of the Poisson equation on a Cartesian mesh

1 code implementation 18 Oct 2019 Ali Girayhan Özbay, Arash Hamzehloo, Sylvain Laizet, Panagiotis Tzirakis, Georgios Rizos, Björn Schuller

The Poisson equation is commonly encountered in engineering, for instance in computational fluid dynamics (CFD) where it is needed to compute corrections to the pressure field to ensure the incompressibility of the velocity field.

On Laughter and Speech-Laugh, Based on Observations of Child-Robot Interaction

no code implementations 30 Aug 2019 Anton Batliner, Stefan Steidl, Florian Eyben, Björn Schuller

In this article, we study laughter found in child-robot interaction where it had not been prompted intentionally.

General Classification

EmoBed: Strengthening Monomodal Emotion Recognition via Training with Crossmodal Emotion Embeddings

no code implementations 23 Jul 2019 Jing Han, Zixing Zhang, Zhao Ren, Björn Schuller

Motivated by this, we propose a novel crossmodal emotion embedding framework called EmoBed, which aims to leverage the knowledge from other auxiliary modalities to improve the performance of an emotion recognition system at hand.

Emotion Classification Emotion Recognition

AVEC 2019 Workshop and Challenge: State-of-Mind, Detecting Depression with AI, and Cross-Cultural Affect Recognition

no code implementations 10 Jul 2019 Fabien Ringeval, Björn Schuller, Michel Valstar, Nicholas Cummins, Roddy Cowie, Leili Tavabi, Maximilian Schmitt, Sina Alisamir, Shahin Amiriparian, Eva-Maria Messner, Siyang Song, Shuo Liu, Ziping Zhao, Adria Mallol-Ragolta, Zhao Ren, Mohammad Soleymani, Maja Pantic

The Audio/Visual Emotion Challenge and Workshop (AVEC 2019) "State-of-Mind, Detecting Depression with AI, and Cross-cultural Affect Recognition" is the ninth competition event aimed at the comparison of multimedia processing and machine learning methods for automatic audiovisual health and emotion analysis, with all participants competing strictly under the same conditions.

Emotion Recognition

Single-Channel Speech Separation with Auxiliary Speaker Embeddings

no code implementations 24 Jun 2019 Shuo Liu, Gil Keren, Björn Schuller

We present a novel source separation model to decompose a single-channel speech signal into two speech segments belonging to two different speakers.

Speech Separation

Synthesising 3D Facial Motion from "In-the-Wild" Speech

no code implementations 15 Apr 2019 Panagiotis Tzirakis, Athanasios Papaioannou, Alexander Lattas, Michail Tarasiou, Björn Schuller, Stefanos Zafeiriou

Synthesising 3D facial motion from speech is a crucial problem manifesting in a multitude of applications such as computer games and movies.

Lip Reading Motion Synthesis

Voice command generation using Progressive Wavegans

no code implementations 13 Mar 2019 Thomas Wiest, Nicholas Cummins, Alice Baird, Simone Hantke, Judith Dineley, Björn Schuller

Generative Adversarial Networks (GANs) have become exceedingly popular in a wide range of data-driven research fields, due in part to their success in image generation.

Audio Generation Image Generation

The Many-to-Many Mapping Between the Concordance Correlation Coefficient and the Mean Square Error

1 code implementation 14 Feb 2019 Vedhas Pandit, Björn Schuller

Despite its drawbacks, $MSE$ is one of the most popular performance metrics (and loss functions), along with, more recently, $\rho_c$ in many sequence prediction challenges.

Sentiment Analysis Time Series Analysis
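
As a rough illustration of the two metrics named in the abstract above (not the paper's own derivation), the sketch below computes $MSE$ and the concordance correlation coefficient $\rho_c$ using the standard textbook definitions with population statistics. The function names `mse` and `ccc` and the toy sequences are assumptions made for this example. It shows one direction of the relationship: two predictions with the same $MSE$ can have different $\rho_c$ values.

```python
from statistics import fmean


def mse(x, y):
    """Mean square error between two equal-length sequences."""
    return fmean((a - b) ** 2 for a, b in zip(x, y))


def ccc(x, y):
    """Concordance correlation coefficient using population statistics.

    rho_c = 2 * cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)
    Undefined (division by zero) if both sequences are constant and equal.
    """
    mx, my = fmean(x), fmean(y)
    vx = fmean((a - mx) ** 2 for a in x)
    vy = fmean((b - my) ** 2 for b in y)
    cov = fmean((a - mx) * (b - my) for a, b in zip(x, y))
    return 2 * cov / (vx + vy + (mx - my) ** 2)


# Hypothetical toy data: a gold sequence and two predictions.
gold = [0.0, 1.0, 2.0, 3.0]
pred_shifted = [1.0, 2.0, 3.0, 4.0]   # constant offset of +1
pred_swapped = [1.0, 0.0, 3.0, 2.0]   # errors alternate between +1 and -1

for name, pred in [("shifted", pred_shifted), ("swapped", pred_swapped)]:
    print(name, "MSE =", mse(gold, pred), "CCC =", round(ccc(gold, pred), 4))
```

Both predictions have a squared error of 1 at every position, so their $MSE$ is identical, yet the offset prediction and the locally swapped prediction receive different $\rho_c$ scores.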

Scaling Speech Enhancement in Unseen Environments with Noise Embeddings

no code implementations 26 Oct 2018 Gil Keren, Jing Han, Björn Schuller

We address the problem of speech enhancement generalisation to unseen environments by performing two manipulations.

Speech Enhancement speech-recognition +1

Adversarial Training in Affective Computing and Sentiment Analysis: Recent Advances and Perspectives

no code implementations 21 Sep 2018 Jing Han, Zixing Zhang, Nicholas Cummins, Björn Schuller

Over the past few years, adversarial training has become an extremely active research topic and has been successfully applied to various Artificial Intelligence (AI) domains.

Sentiment Analysis

audEERING's approach to the One-Minute-Gradual Emotion Challenge

no code implementations 3 May 2018 Andreas Triantafyllopoulos, Hesam Sagha, Florian Eyben, Björn Schuller

This paper describes audEERING's submissions as well as additional evaluations for the One-Minute-Gradual (OMG) emotion recognition challenge.

Emotion Recognition

Calibrated Prediction Intervals for Neural Network Regressors

1 code implementation 26 Mar 2018 Gil Keren, Nicholas Cummins, Björn Schuller

Despite their obvious aforementioned advantage in relation to accuracy, contemporary neural networks can, generally, be regarded as poorly calibrated and as such do not produce reliable output probability estimates.

Prediction Intervals

Applying Cooperative Machine Learning to Speed Up the Annotation of Social Signals in Large Multi-modal Corpora

1 code implementation 7 Feb 2018 Johannes Wagner, Tobias Baur, Yue Zhang, Michel F. Valstar, Björn Schuller, Elisabeth André

Scientific disciplines, such as Behavioural Psychology, Anthropology and recently Social Signal Processing are concerned with the systematic exploration of human behaviour.

Weakly Supervised One-Shot Detection with Attention Similarity Networks

no code implementations 10 Jan 2018 Gil Keren, Maximilian Schmitt, Thomas Kehrenberg, Björn Schuller

Neural network models that are not conditioned on class identities were shown to facilitate knowledge transfer between classes and to be well-suited for one-shot learning tasks.

One-Shot Learning Transfer Learning

The Principle of Logit Separation

no code implementations ICLR 2018 Gil Keren, Sivan Sabato, Björn Schuller

In contrast, there are known loss functions, as well as novel batch loss functions that we propose, which are aligned with this principle.

Image Retrieval

auDeep: Unsupervised Learning of Representations from Audio with Deep Recurrent Neural Networks

1 code implementation 12 Dec 2017 Michael Freitag, Shahin Amiriparian, Sergey Pugachevskiy, Nicholas Cummins, Björn Schuller

auDeep is a Python toolkit for deep unsupervised representation learning from acoustic data.

Sound Audio and Speech Processing

Learning audio sequence representations for acoustic event classification

no code implementations 27 Jul 2017 Zixing Zhang, Ding Liu, Jing Han, Kun Qian, Björn Schuller

Extensive evaluation on a large acoustic event database is performed, and the empirical results demonstrate that the learnt audio sequence representation improves performance by a large margin compared with other state-of-the-art hand-crafted sequence features for AEC.

Classification General Classification

Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments

no code implementations 30 May 2017 Zixing Zhang, Jürgen Geiger, Jouni Pohjalainen, Amr El-Desoky Mousa, Wenyu Jin, Björn Schuller

Eliminating the negative effect of non-stationary environmental noise is a long-standing research topic for automatic speech recognition that still remains an important challenge.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Fast Single-Class Classification and the Principle of Logit Separation

2 code implementations 29 May 2017 Gil Keren, Sivan Sabato, Björn Schuller

Our experiments show that indeed in almost all cases, losses that are aligned with the Principle of Logit Separation obtain at least 20% relative accuracy improvement in the SLC task compared to losses that are not aligned with it, and sometimes considerably more.

Binary Classification Classification +2

End-to-End Multimodal Emotion Recognition using Deep Neural Networks

2 code implementations 27 Apr 2017 Panagiotis Tzirakis, George Trigeorgis, Mihalis A. Nicolaou, Björn Schuller, Stefanos Zafeiriou

The system is then trained in an end-to-end fashion where, by also taking advantage of the correlations of each of the streams, we manage to significantly outperform the traditional approaches based on auditory and visual handcrafted features for the prediction of spontaneous and natural emotions on the RECOLA database of the AVEC 2016 research challenge on emotion recognition.

Multimodal Emotion Recognition Retrieval

Deep Structured Learning for Facial Action Unit Intensity Estimation

no code implementations CVPR 2017 Robert Walecki, Ognjen Rudovic, Vladimir Pavlovic, Björn Schuller, Maja Pantic

The goal of this paper is to model these structures and estimate complex feature representations simultaneously by combining conditional random field (CRF) encoded AU dependencies with deep learning.

Tunable Sensitivity to Large Errors in Neural Network Training

no code implementations 23 Nov 2016 Gil Keren, Sivan Sabato, Björn Schuller

We propose incorporating this idea of tunable sensitivity for hard examples in neural network learning, using a new generalization of the cross-entropy gradient step, which can be used in place of the gradient in any gradient-based training method.

Convolutional RNN: an Enhanced Model for Extracting Features from Sequential Data

3 code implementations 18 Feb 2016 Gil Keren, Björn Schuller

Traditional convolutional layers extract features from patches of data by applying a non-linearity on an affine function of the input.

Audio Classification
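
The description of a traditional convolutional layer in the abstract above can be sketched in a few lines. This is a minimal, dependency-free illustration of "a non-linearity applied to an affine function of each patch" for a single 1D filter; the function name `conv1d_layer`, the default `tanh` non-linearity, and the toy inputs are assumptions for this example, and it is not the Convolutional RNN model that the paper proposes.

```python
import math


def conv1d_layer(signal, weights, bias, nonlinearity=math.tanh):
    """Apply one 1D convolutional filter (stride 1, no padding).

    For every patch of len(weights) consecutive samples, compute the
    affine function w . patch + b, then pass it through the non-linearity.
    """
    k = len(weights)
    outputs = []
    for start in range(len(signal) - k + 1):
        patch = signal[start:start + k]
        affine = sum(w * x for w, x in zip(weights, patch)) + bias
        outputs.append(nonlinearity(affine))
    return outputs


def relu(z):
    """Rectified linear unit, used here as an alternative non-linearity."""
    return max(0.0, z)


# Hypothetical toy input: a rising ramp and a first-difference filter.
features = conv1d_layer([1.0, 2.0, 3.0, 4.0], [-1.0, 1.0], bias=0.5,
                        nonlinearity=relu)
print(features)
```

Each output depends only on one local patch, which is the property the paper contrasts with its recurrent feature extractor.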

Detecting Road Surface Wetness from Audio: A Deep Learning Approach

no code implementations 22 Nov 2015 Irman Abdić, Lex Fridman, Erik Marchi, Daniel E. Brown, William Angell, Bryan Reimer, Björn Schuller

We introduce a recurrent neural network architecture for automated road surface wetness detection from audio of tire-surface interaction.

General Classification

A Broadcast News Corpus for Evaluation and Tuning of German LVCSR Systems

no code implementations 15 Dec 2014 Felix Weninger, Björn Schuller, Florian Eyben, Martin Wöllmer, Gerhard Rigoll

Transcription of broadcast news is an interesting and challenging application for large-vocabulary continuous speech recognition (LVCSR).

speech-recognition Speech Recognition

Acoustic Gait-based Person Identification using Hidden Markov Models

no code implementations 11 Jun 2014 Jürgen T. Geiger, Maximilian Kneißl, Björn Schuller, Gerhard Rigoll

The goal of the system is to analyse sounds emitted by walking persons (mostly the step sounds) and identify those persons.

Gait Recognition Person Identification

6th International Symposium on Attention in Cognitive Systems 2013

no code implementations 22 Jul 2013 Lucas Paletta, Laurent Itti, Björn Schuller, Fang Fang

This volume contains the papers accepted at the 6th International Symposium on Attention in Cognitive Systems (ISACS 2013), held in Beijing, August 5, 2013.
