Search Results for author: Björn W. Schuller

Found 88 papers, 27 papers with code

openXBOW - Introducing the Passau Open-Source Crossmodal Bag-of-Words Toolkit

1 code implementation • 22 May 2016 • Maximilian Schmitt, Björn W. Schuller

We introduce openXBOW, an open-source toolkit for the generation of bag-of-words (BoW) representations from multimodal input.

Document Classification Emotion Recognition +2
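openXBOW itself is a Java command-line tool; the bag-of-words idea it implements can be sketched conceptually: quantise frame-level (acoustic or textual) features against a learned codebook and represent each instance as a histogram of codeword assignments. The Python snippet below illustrates only that concept, not openXBOW's actual interface; all names and dimensions in it are made up.

```python
# Conceptual bag-of-audio-words sketch (not openXBOW's API).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Hypothetical frame-level acoustic features: (n_frames, n_dims) per utterance.
utterances = [rng.normal(size=(200, 13)) for _ in range(10)]

# 1) Learn a codebook over all frames.
codebook = KMeans(n_clusters=50, n_init=10, random_state=0)
codebook.fit(np.vstack(utterances))

# 2) Represent each utterance as a normalised histogram of codeword assignments.
def bow(frames):
    counts = np.bincount(codebook.predict(frames), minlength=codebook.n_clusters)
    return counts / counts.sum()

X = np.stack([bow(u) for u in utterances])  # (n_utterances, 50) BoW features
```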

SVTS: Scalable Video-to-Speech Synthesis

2 code implementations • 4 May 2022 • Rodrigo Mira, Alexandros Haliassos, Stavros Petridis, Björn W. Schuller, Maja Pantic

Video-to-speech synthesis (also known as lip-to-speech) refers to the translation of silent lip movements into the corresponding audio.

Speech Synthesis

The MuSe 2021 Multimodal Sentiment Analysis Challenge: Sentiment, Emotion, Physiological-Emotion, and Stress

1 code implementation • 14 Apr 2021 • Lukas Stappen, Alice Baird, Lukas Christ, Lea Schumann, Benjamin Sertolli, Eva-Maria Messner, Erik Cambria, Guoying Zhao, Björn W. Schuller

Multimodal Sentiment Analysis (MuSe) 2021 is a challenge focusing on the tasks of sentiment and emotion, as well as physiological-emotion and emotion-based stress recognition through more comprehensively integrating the audio-visual, language, and biological signal modalities.

Emotion Recognition Multimodal Sentiment Analysis

The MuSe 2022 Multimodal Sentiment Analysis Challenge: Humor, Emotional Reactions, and Stress

1 code implementation • 23 Jun 2022 • Lukas Christ, Shahin Amiriparian, Alice Baird, Panagiotis Tzirakis, Alexander Kathan, Niklas Müller, Lukas Stappen, Eva-Maria Meßner, Andreas König, Alan Cowen, Erik Cambria, Björn W. Schuller

For this year's challenge, we feature three datasets: (i) the Passau Spontaneous Football Coach Humor (Passau-SFCH) dataset that contains audio-visual recordings of German football coaches, labelled for the presence of humour; (ii) the Hume-Reaction dataset, in which reactions of individuals to emotional stimuli have been annotated with respect to seven emotional expression intensities; and (iii) the Ulm-Trier Social Stress Test (Ulm-TSST) dataset, comprising audio-visual data labelled with continuous emotion values (arousal and valence) of people in stressful dispositions.

Emotion Recognition Humor Detection +1

Speech Emotion Recognition using Semantic Information

1 code implementation • 4 Mar 2021 • Panagiotis Tzirakis, Anh Nguyen, Stefanos Zafeiriou, Björn W. Schuller

In this paper, we propose a novel framework that can capture both the semantic and the paralinguistic information in the signal.

Speech Emotion Recognition Sound Audio and Speech Processing

audb -- Sharing and Versioning of Audio and Annotation Data in Python

1 code implementation • 1 Mar 2023 • Hagen Wierstorf, Johannes Wagner, Florian Eyben, Felix Burkhardt, Björn W. Schuller

Driven by the need for larger and more diverse datasets to pre-train and fine-tune increasingly complex machine learning models, the number of datasets is rapidly growing.

Management
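For orientation, a minimal usage sketch of the audb loading workflow is shown below. It assumes the audb.load entry point with a pinned version; the database name and version string are placeholders rather than values taken from the paper.

```python
# Minimal audb usage sketch; dataset name and version are placeholders (assumptions).
import audb

# Load a specific, versioned release of a database (cached locally on first load).
db = audb.load(
    "emodb",          # example database name, not prescribed by the paper
    version="1.4.1",  # pinning a version is the point of audb's versioning
    verbose=False,
)
print(db.tables)      # annotation tables shipped with the database
print(len(db.files))  # media files referenced by the annotations
```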

MuSe 2020 -- The First International Multimodal Sentiment Analysis in Real-life Media Challenge and Workshop

1 code implementation • 30 Apr 2020 • Lukas Stappen, Alice Baird, Georgios Rizos, Panagiotis Tzirakis, Xinchen Du, Felix Hafner, Lea Schumann, Adria Mallol-Ragolta, Björn W. Schuller, Iulia Lefter, Erik Cambria, Ioannis Kompatsiaris

Multimodal Sentiment Analysis in Real-life Media (MuSe) 2020 is a Challenge-based Workshop focusing on the tasks of sentiment recognition, as well as emotion-target engagement and trustworthiness detection by means of more comprehensively integrating the audio-visual and language modalities.

Emotion Recognition Multimodal Sentiment Analysis

DeepSpectrumLite: A Power-Efficient Transfer Learning Framework for Embedded Speech and Audio Processing from Decentralised Data

1 code implementation • 23 Apr 2021 • Shahin Amiriparian, Tobias Hübner, Maurice Gerczuk, Sandra Ottl, Björn W. Schuller

By obtaining state-of-the-art results on a set of paralinguistics tasks, we demonstrate the suitability of the proposed transfer learning approach for embedded audio signal processing, even when data is scarce.

Audio Signal Processing Transfer Learning

End-2-End COVID-19 Detection from Breath & Cough Audio

1 code implementation • 7 Jan 2021 • Harry Coppock, Alexander Gaskell, Panagiotis Tzirakis, Alice Baird, Lyn Jones, Björn W. Schuller

Our main contributions are as follows: (I) We demonstrate the first attempt to diagnose COVID-19 using end-to-end deep learning from a crowd-sourced dataset of audio samples, achieving an ROC-AUC of 0.846; (II) Our model, the COVID-19 Identification ResNet (CIdeR), has potential for rapid scalability, minimal cost and improving performance as more data becomes available.
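The reported figure is the area under the ROC curve for the binary COVID-19 decision. The short sketch below shows how such a score is computed; the labels and scores are synthetic stand-ins, not CIdeR outputs.

```python
# Sketch of how a ROC-AUC figure is computed for binary COVID-19 detection.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=500)                               # 1 = COVID-19 positive
y_score = np.clip(y_true * 0.6 + rng.normal(0.2, 0.25, 500), 0, 1)  # hypothetical probabilities

print("ROC-AUC:", round(roc_auc_score(y_true, y_score), 3))
```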

An Improved StarGAN for Emotional Voice Conversion: Enhancing Voice Quality and Data Augmentation

1 code implementation • 18 Jul 2021 • Xiangheng He, Junjie Chen, Georgios Rizos, Björn W. Schuller

Emotional Voice Conversion (EVC) aims to convert the emotional style of a source speech signal to a target style while preserving its content and speaker identity information.

Data Augmentation Generative Adversarial Network +2

A Novel Fusion of Attention and Sequence to Sequence Autoencoders to Predict Sleepiness From Speech

1 code implementation • 15 May 2020 • Shahin Amiriparian, Pawel Winokurow, Vincent Karas, Sandra Ottl, Maurice Gerczuk, Björn W. Schuller

On the development partition of the data, we achieve Spearman's correlation coefficients of .324, .283, and .320 with the targets on the Karolinska Sleepiness Scale by utilising attention and non-attention autoencoders, and the fusion of both autoencoders' representations, respectively.

Machine Translation Representation Learning
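The reported scores are Spearman rank correlations between continuous predictions and the ordinal Karolinska Sleepiness Scale labels. The sketch below shows how such a coefficient is computed; the arrays are dummy data, not the paper's predictions.

```python
# Sketch: Spearman's rank correlation between model outputs and KSS labels.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
kss_targets = rng.integers(1, 10, size=100)          # Karolinska Sleepiness Scale (1-9)
predictions = kss_targets + rng.normal(0, 2, 100)    # hypothetical regressor outputs

rho, p_value = spearmanr(predictions, kss_targets)
print(f"Spearman's rho = {rho:.3f} (p = {p_value:.3g})")
```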

GraphTMT: Unsupervised Graph-based Topic Modeling from Video Transcripts

1 code implementation • 4 May 2021 • Lukas Stappen, Jason Thies, Gerhard Hagerer, Björn W. Schuller, Georg Groh

To unfold the tremendous amount of multimedia data uploaded daily to social media platforms, effective topic modeling techniques are needed.

Clustering Topic Models +1

Example-based Explanations with Adversarial Attacks for Respiratory Sound Analysis

1 code implementation • 30 Mar 2022 • Yi Chang, Zhao Ren, Thanh Tam Nguyen, Wolfgang Nejdl, Björn W. Schuller

Respiratory sound classification is an important tool for remote screening of respiratory-related diseases such as pneumonia, asthma, and COVID-19.

Sound Classification

Towards Multimodal Prediction of Spontaneous Humour: A Novel Dataset and First Results

1 code implementation • 28 Sep 2022 • Lukas Christ, Shahin Amiriparian, Alexander Kathan, Niklas Müller, Andreas König, Björn W. Schuller

Our findings suggest that for the automatic analysis of humour and its sentiment, facial expressions are most promising, while humour direction can be best modelled via text-based features.

Exploring speaker enrolment for few-shot personalisation in emotional vocalisation prediction

1 code implementation • 14 Jun 2022 • Andreas Triantafyllopoulos, Meishu Song, Zijiang Yang, Xin Jing, Björn W. Schuller

In this work, we explore a novel few-shot personalisation architecture for emotional vocalisation prediction.

Knowledge Transfer For On-Device Speech Emotion Recognition with Neural Structured Learning

1 code implementation • 26 Oct 2022 • Yi Chang, Zhao Ren, Thanh Tam Nguyen, Kun Qian, Björn W. Schuller

Our experiments demonstrate that training a lightweight SER model on the target dataset with speech samples and graphs can not only produce small SER models, but also enhance the model performance compared to models with speech samples only and those using classic transfer learning strategies.

Speech Emotion Recognition Transfer Learning

Synthia's Melody: A Benchmark Framework for Unsupervised Domain Adaptation in Audio

1 code implementation • 26 Sep 2023 • Chia-Hsin Lin, Charles Jones, Björn W. Schuller, Harry Coppock

Despite significant advancements in deep learning for vision and natural language, unsupervised domain adaptation in audio remains relatively unexplored.

Attribute Selection bias +1

Automatic Emotion Modelling in Written Stories

1 code implementation • 21 Dec 2022 • Lukas Christ, Shahin Amiriparian, Manuel Milling, Ilhan Aslan, Björn W. Schuller

Telling stories is an integral part of human communication which can evoke emotions and influence the affective states of the audience.

Pre-training in Deep Reinforcement Learning for Automatic Speech Recognition

no code implementations • 24 Oct 2019 • Thejan Rajapakshe, Rajib Rana, Siddique Latif, Sara Khalifa, Björn W. Schuller

Deep reinforcement learning (deep RL) is a combination of deep learning with reinforcement learning principles to create efficient methods that can learn by interacting with their environment.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Deep Representation Learning in Speech Processing: Challenges, Recent Advances, and Future Trends

no code implementations • 2 Jan 2020 • Siddique Latif, Rajib Rana, Sara Khalifa, Raja Jurdak, Junaid Qadir, Björn W. Schuller

Research on speech processing has traditionally considered the task of designing hand-engineered acoustic features (feature engineering) as a separate distinct problem from the task of designing efficient machine learning (ML) models to make prediction and classification decisions.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

COVID-19 and Computer Audition: An Overview on What Speech & Sound Analysis Could Contribute in the SARS-CoV-2 Corona Crisis

no code implementations • 24 Mar 2020 • Björn W. Schuller, Dagmar M. Schuller, Kun Qian, Juan Liu, Huaiyuan Zheng, Xiao Li

We come to the conclusion that CA appears ready for implementation of (pre-)diagnosis and monitoring tools, and more generally provides rich and significant, yet so far untapped potential in the fight against COVID-19 spread.

An Early Study on Intelligent Analysis of Speech under COVID-19: Severity, Sleep Quality, Fatigue, and Anxiety

no code implementations • 30 Apr 2020 • Jing Han, Kun Qian, Meishu Song, Zijiang Yang, Zhao Ren, Shuo Liu, Juan Liu, Huaiyuan Zheng, Wei Ji, Tomoya Koike, Xiao Li, Zixing Zhang, Yoshiharu Yamamoto, Björn W. Schuller

In particular, by analysing speech recordings from these patients, we construct audio-only-based models to automatically categorise the health state of patients from four aspects, including the severity of illness, sleep quality, fatigue, and anxiety.

Sleep Quality

deepSELF: An Open Source Deep Self End-to-End Learning Framework

no code implementations • 11 May 2020 • Tomoya Koike, Kun Qian, Björn W. Schuller, Yoshiharu Yamamoto

To the best of our knowledge, it is the first public toolkit assembling a series of state-of-the-art deep learning technologies.

Image Generation

On Deep Speech Packet Loss Concealment: A Mini-Survey

no code implementations • 15 May 2020 • Mostafa M. Mohamed, Mina A. Nessiem, Björn W. Schuller

In this mini-survey, we review all the literature we have found to date that attempts to solve packet loss in speech using deep learning methods.

Packet Loss Concealment

ConcealNet: An End-to-end Neural Network for Packet Loss Concealment in Deep Speech Emotion Recognition

no code implementations • 15 May 2020 • Mostafa M. Mohamed, Björn W. Schuller

Additionally, extending this with an end-to-end emotion prediction neural network provides a network that performs SER from audio with lost frames, end-to-end.

Packet Loss Concealment Speech Emotion Recognition

High-Fidelity Audio Generation and Representation Learning with Guided Adversarial Autoencoder

no code implementations • 1 Jun 2020 • Kazi Nazmul Haque, Rajib Rana, Björn W. Schuller

Hence, with extensive experimental results, we demonstrate that by harnessing the power of high-fidelity audio generation, the proposed GAAE model can learn powerful representations from an unlabelled dataset while leveraging only a small percentage of labelled data as supervision/guidance.

Audio Generation Representation Learning +1

Domain Adaptation with Joint Learning for Generic, Optical Car Part Recognition and Detection Systems (Go-CaRD)

no code implementations • 15 Jun 2020 • Lukas Stappen, Xinchen Du, Vincent Karas, Stefan Müller, Björn W. Schuller

Systems for the automatic recognition and detection of automotive parts are crucial in several emerging research areas in the development of intelligent vehicles.

Benchmarking Domain Adaptation +1

MeDaS: An open-source platform as service to help break the walls between medicine and informatics

no code implementations • 12 Jul 2020 • Liang Zhang, Johann Li, Ping Li, Xiaoyuan Lu, Peiyi Shen, Guangming Zhu, Syed Afaq Shah, Mohammed Bennamoun, Kun Qian, Björn W. Schuller

To the best of our knowledge, MeDaS is the first open-source platform providing a collaborative and interactive service that lets researchers from a medical background easily use DL-related toolkits, and at the same time helps scientists or engineers from the information sciences understand the medical side.

Audio, Speech, Language, & Signal Processing for COVID-19: A Comprehensive Overview

no code implementations • 29 Nov 2020 • Gauri Deshpande, Björn W. Schuller

This drives the research focus towards identifying the markers of COVID-19 in speech and other human generated audio signals.

The voice of COVID-19: Acoustic correlates of infection

no code implementations • 17 Dec 2020 • Katrin D. Bartl-Pokorny, Florian B. Pokorny, Anton Batliner, Shahin Amiriparian, Anastasia Semertzidou, Florian Eyben, Elena Kramer, Florian Schmidt, Rainer Schönweiler, Markus Wehler, Björn W. Schuller

Group differences in the front vowels /i:/ and /e:/ are additionally reflected in the variation of the fundamental frequency and the harmonics-to-noise ratio; group differences in the back vowels /o:/ and /u:/ are reflected in statistics of the Mel-frequency cepstral coefficients and the spectral slope.

Detecting COVID-19 from Breathing and Coughing Sounds using Deep Neural Networks

no code implementations • 29 Dec 2020 • Björn W. Schuller, Harry Coppock, Alexander Gaskell

The COVID-19 pandemic has affected the world unevenly; while industrial economies have been able to produce the tests necessary to track the spread of the virus and mostly avoided complete lockdowns, developing countries have faced issues with testing capacity.

Bayesian Optimisation

The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19 Cough, COVID-19 Speech, Escalation & Primates

no code implementations • 24 Feb 2021 • Björn W. Schuller, Anton Batliner, Christian Bergler, Cecilia Mascolo, Jing Han, Iulia Lefter, Heysem Kaya, Shahin Amiriparian, Alice Baird, Lukas Stappen, Sandra Ottl, Maurice Gerczuk, Panagiotis Tzirakis, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Leon J. M. Rothkrantz, Joeri Zwerts, Jelle Treep, Casper Kaandorp

The INTERSPEECH 2021 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the COVID-19 Cough and COVID-19 Speech Sub-Challenges, a binary classification on COVID-19 infection has to be made based on coughing sounds and speech; in the Escalation Sub-Challenge, a three-way assessment of the level of escalation in a dialogue is featured; and in the Primates Sub-Challenge, four species vs. background need to be classified.

Binary Classification Representation Learning

End-to-End Video-To-Speech Synthesis using Generative Adversarial Networks

no code implementations • 27 Apr 2021 • Rodrigo Mira, Konstantinos Vougioukas, Pingchuan Ma, Stavros Petridis, Björn W. Schuller, Maja Pantic

In this work, we propose a new end-to-end video-to-speech model based on Generative Adversarial Networks (GANs) which translates spoken video to waveform end-to-end without using any intermediate representation or separate waveform synthesis algorithm.

Lip Reading Speech Synthesis

LiRA: Learning Visual Speech Representations from Audio through Self-supervision

no code implementations • 16 Jun 2021 • Pingchuan Ma, Rodrigo Mira, Stavros Petridis, Björn W. Schuller, Maja Pantic

The large amount of audiovisual content being shared online today has drawn substantial attention to the prospect of audiovisual self-supervised learning.

Lip Reading Self-Supervised Learning +1

A Physiologically-Adapted Gold Standard for Arousal during Stress

no code implementations • 27 Jul 2021 • Alice Baird, Lukas Stappen, Lukas Christ, Lea Schumann, Eva-Maria Meßner, Björn W. Schuller

We utilise a Long Short-Term Memory, Recurrent Neural Network to explore the benefit of fusing these physiological signals with arousal as the target, learning from various audio, video, and textual based features.
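A minimal sketch of an LSTM-based sequence regressor over fused multimodal features with a continuous arousal target is given below; the feature dimensions, layer sizes, and training step are assumptions for illustration, not the configuration used in the paper.

```python
# Minimal PyTorch sketch: LSTM regression over fused frame-level features
# (audio + video + text + physiology concatenated) with arousal as target.
import torch
import torch.nn as nn

class ArousalLSTM(nn.Module):
    def __init__(self, feat_dim=128, hidden=64):  # dimensions are placeholders
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                      # x: (batch, time, feat_dim)
        out, _ = self.lstm(x)
        return self.head(out).squeeze(-1)      # (batch, time) arousal trace

model = ArousalLSTM()
fused = torch.randn(4, 250, 128)               # 4 dummy sequences, 250 time steps
loss = nn.MSELoss()(model(fused), torch.zeros(4, 250))
loss.backward()                                # one illustrative training step
```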

Evaluating the COVID-19 Identification ResNet (CIdeR) on the INTERSPEECH COVID-19 from Audio Challenges

no code implementations • 30 Jul 2021 • Alican Akman, Harry Coppock, Alexander Gaskell, Panagiotis Tzirakis, Lyn Jones, Björn W. Schuller

We report on cross-running the recent COVID-19 Identification ResNet (CIdeR) on the two Interspeech 2021 COVID-19 diagnosis from cough and speech audio challenges: ComParE and DiCOVA.

COVID-19 Diagnosis

Fairness and underspecification in acoustic scene classification: The case for disaggregated evaluations

no code implementations • 4 Oct 2021 • Andreas Triantafyllopoulos, Manuel Milling, Konstantinos Drossos, Björn W. Schuller

Although these factors play a well-understood role in the performance of ASC models, most works report single evaluation metrics taking into account all different strata of a particular dataset.

Acoustic Scene Classification Fairness +1
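The disaggregated evaluation argued for here amounts to reporting the metric per stratum (for example, per recording device or city) instead of a single aggregate figure. The small sketch below illustrates the idea on synthetic data; the stratum names are invented.

```python
# Sketch of a disaggregated evaluation: accuracy per stratum vs. the aggregate.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "stratum": rng.choice(["device_a", "device_b", "device_c"], size=600),
    "y_true": rng.integers(0, 10, size=600),   # 10 hypothetical acoustic scenes
})
df["y_pred"] = np.where(rng.random(600) < 0.6, df["y_true"], rng.integers(0, 10, size=600))

correct = df["y_true"] == df["y_pred"]
print(correct.groupby(df["stratum"]).mean())   # accuracy broken down by stratum
print(correct.mean())                          # the single aggregate number that hides gaps
```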

Multistage linguistic conditioning of convolutional layers for speech emotion recognition

no code implementations • 13 Oct 2021 • Andreas Triantafyllopoulos, Uwe Reichel, Shuo Liu, Stephan Huber, Florian Eyben, Björn W. Schuller

In this contribution, we investigate the effectiveness of deep fusion of text and audio features for categorical and dimensional speech emotion recognition (SER).

Speech Emotion Recognition

EIHW-MTG DiCOVA 2021 Challenge System Report

no code implementations • 13 Oct 2021 • Adria Mallol-Ragolta, Helena Cuesta, Emilia Gómez, Björn W. Schuller

This paper aims to automatically detect COVID-19 patients by analysing the acoustic information embedded in coughs.

Facial Emotion Recognition using Deep Residual Networks in Real-World Environments

no code implementations • 4 Nov 2021 • Panagiotis Tzirakis, Dénes Boros, Elnar Hajiyev, Björn W. Schuller

To show the favourable properties of our pre-trained model on modelling facial affect, we use the RECOLA database, and compare with the current state-of-the-art approach.

Facial Emotion Recognition

Normalise for Fairness: A Simple Normalisation Technique for Fairness in Regression Machine Learning Problems

no code implementations • 2 Feb 2022 • Mostafa M. Mohamed, Björn W. Schuller

We present a theoretical analysis of the method, in addition to an empirical comparison against two standard methods for fairness, namely data balancing and adversarial training.

Binary Classification Decision Making +2

Robust Federated Learning Against Adversarial Attacks for Speech Emotion Recognition

no code implementations • 9 Mar 2022 • Yi Chang, Sofiane Laridi, Zhao Ren, Gregory Palmer, Björn W. Schuller, Marco Fisichella

The proposed framework consists of i) federated learning for data privacy, and ii) adversarial training at the training stage and randomisation at the testing stage for model robustness.

Federated Learning Speech Emotion Recognition
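For context, the aggregation step that typically underpins federated learning is federated averaging (FedAvg). The sketch below shows that step in isolation with random stand-in client weights; it is not the paper's full framework, which additionally uses adversarial training and test-time randomisation.

```python
# Minimal sketch of federated averaging (FedAvg), the standard aggregation step.
import numpy as np

def fed_avg(client_weights, client_sizes):
    """Weighted average of client model parameters by local dataset size."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

clients = [np.random.default_rng(i).normal(size=10) for i in range(3)]  # 3 local models
sizes = [120, 80, 200]                                                  # local sample counts
global_weights = fed_avg(clients, sizes)
print(global_weights)
```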

Climate Change & Computer Audition: A Call to Action and Overview on Audio Intelligence to Help Save the Planet

no code implementations • 10 Mar 2022 • Björn W. Schuller, Alican Akman, Yi Chang, Harry Coppock, Alexander Gebhard, Alexander Kathan, Esther Rituerto-González, Andreas Triantafyllopoulos, Florian B. Pokorny

We categorise potential computer audition applications according to the five elements of earth, water, air, fire, and aether, proposed by the ancient Greeks in their five element theory; this categorisation serves as a framework to discuss computer audition in relation to different ecological aspects.

Continuous-Time Audiovisual Fusion with Recurrence vs. Attention for In-The-Wild Affect Recognition

no code implementations • 24 Mar 2022 • Vincent Karas, Mani Kumar Tellamekala, Adria Mallol-Ragolta, Michel Valstar, Björn W. Schuller

To clearly understand the performance differences between recurrent and attention models in audiovisual affect recognition, we present a comprehensive evaluation of fusion models based on LSTM-RNNs, self-attention and cross-modal attention, trained for valence and arousal estimation.

Arousal Estimation Multimodal Emotion Recognition

An Overview & Analysis of Sequence-to-Sequence Emotional Voice Conversion

no code implementations • 29 Mar 2022 • Zijiang Yang, Xin Jing, Andreas Triantafyllopoulos, Meishu Song, Ilhan Aslan, Björn W. Schuller

Emotional voice conversion (EVC) focuses on converting a speech utterance from a source to a target emotion; it can thus be a key enabling technology for human-computer interaction applications and beyond.

Voice Conversion

A Temporal-oriented Broadcast ResNet for COVID-19 Detection

no code implementations • 31 Mar 2022 • Xin Jing, Shuo Liu, Emilia Parada-Cabaleiro, Andreas Triantafyllopoulos, Meishu Song, Zijiang Yang, Björn W. Schuller

Detecting COVID-19 from audio signals, such as breathing and coughing, can be used as a fast and efficient pre-testing method to reduce the virus transmission.

Computational Efficiency

Probing Speech Emotion Recognition Transformers for Linguistic Knowledge

no code implementations • 1 Apr 2022 • Andreas Triantafyllopoulos, Johannes Wagner, Hagen Wierstorf, Maximilian Schmitt, Uwe Reichel, Florian Eyben, Felix Burkhardt, Björn W. Schuller

Large, pre-trained neural networks consisting of self-attention layers (transformers) have recently achieved state-of-the-art results on several speech emotion recognition (SER) datasets.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Journaling Data for Daily PHQ-2 Depression Prediction and Forecasting

no code implementations • 6 May 2022 • Alexander Kathan, Andreas Triantafyllopoulos, Xiangheng He, Manuel Milling, Tianhao Yan, Srividya Tirunellai Rajamani, Ludwig Küster, Mathias Harrer, Elena Heber, Inga Grossmann, David D. Ebert, Björn W. Schuller

Digital health applications are becoming increasingly important for assessing and monitoring the wellbeing of people suffering from mental health conditions like depression.

Fatigue Prediction in Outdoor Running Conditions using Audio Data

no code implementations • 9 May 2022 • Andreas Triantafyllopoulos, Sandra Ottl, Alexander Gebhard, Esther Rituerto-González, Mirko Jaumann, Steffen Hüttner, Valerie Dieter, Patrick Schneeweiß, Inga Krauß, Maurice Gerczuk, Shahin Amiriparian, Björn W. Schuller

Although running is a common leisure activity and a core training regimen for many athletes, between 29% and 79% of runners sustain an overuse injury each year.

The ACM Multimedia 2022 Computational Paralinguistics Challenge: Vocalisations, Stuttering, Activity, & Mosquitoes

no code implementations • 13 May 2022 • Björn W. Schuller, Anton Batliner, Shahin Amiriparian, Christian Bergler, Maurice Gerczuk, Natalie Holz, Pauline Larrouy-Maestri, Sebastian P. Bayerl, Korbinian Riedhammer, Adria Mallol-Ragolta, Maria Pateraki, Harry Coppock, Ivan Kiskin, Marianne Sinka, Stephen Roberts

The ACM Multimedia 2022 Computational Paralinguistics Challenge addresses four different problems for the first time in a research competition under well-defined conditions: In the Vocalisations and Stuttering Sub-Challenges, a classification on human non-verbal vocalisations and speech has to be made; the Activity Sub-Challenge aims at beyond-audio human activity recognition from smartwatch sensor data; and in the Mosquitoes Sub-Challenge, mosquitoes need to be detected.

Human Activity Recognition

COVYT: Introducing the Coronavirus YouTube and TikTok speech dataset featuring the same speakers with and without infection

no code implementations • 20 Jun 2022 • Andreas Triantafyllopoulos, Anastasia Semertzidou, Meishu Song, Florian B. Pokorny, Björn W. Schuller

As compared to other existing COVID-19 sound datasets, the unique feature of the COVYT dataset is that it comprises both COVID-19 positive and negative samples from all 65 speakers.

Are 3D Face Shapes Expressive Enough for Recognising Continuous Emotions and Action Unit Intensities?

no code implementations • 3 Jul 2022 • Mani Kumar Tellamekala, Ömer Sümer, Björn W. Schuller, Elisabeth André, Timo Giesbrecht, Michel Valstar

We also study how 3D face shapes performed on AU intensity estimation on BP4D and DISFA datasets, and report that 3D face features were on par with 2D appearance features in AUs 4, 6, 10, 12, and 25, but not the entire set of AUs.

3D Face Alignment Arousal Estimation +1

Computational Charisma -- A Brick by Brick Blueprint for Building Charismatic Artificial Intelligence

no code implementations • 31 Dec 2022 • Björn W. Schuller, Shahin Amiriparian, Anton Batliner, Alexander Gebhard, Maurice Gerczuk, Vincent Karas, Alexander Kathan, Lennart Seizer, Johanna Löchner

We then name exemplary use cases of computational charismatic skills before switching to ethical aspects and concluding this overview and perspective on building charisma-enabled AI.

A Comprehensive Survey on Heart Sound Analysis in the Deep Learning Era

no code implementations • 23 Jan 2023 • Zhao Ren, Yi Chang, Thanh Tam Nguyen, Yang Tan, Kun Qian, Björn W. Schuller

Deep learning has been successfully applied to heart sound analysis in the past years.

Will Affective Computing Emerge from Foundation Models and General AI? A First Evaluation on ChatGPT

no code implementations • 3 Mar 2023 • Mostafa M. Amin, Erik Cambria, Björn W. Schuller

We utilise three baselines, a robust language model (RoBERTa-base), a legacy word model with pretrained embeddings (Word2Vec), and a simple bag-of-words baseline (BoW).

Language Modelling Sentiment Analysis +2
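The bag-of-words baseline mentioned here is the standard counts-plus-linear-classifier setup. A minimal sketch on a toy sentiment corpus is shown below; all texts and labels are invented for illustration and are not the datasets used in the evaluation.

```python
# Sketch of a simple bag-of-words (BoW) sentiment baseline: counts + linear classifier.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = ["great, I love it", "terrible and boring", "really enjoyable", "worst ever"]
labels = [1, 0, 1, 0]  # 1 = positive sentiment (toy labels)

baseline = make_pipeline(CountVectorizer(), LogisticRegression())
baseline.fit(texts, labels)
print(baseline.predict(["I love how enjoyable it is"]))
```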

The ACM Multimedia 2023 Computational Paralinguistics Challenge: Emotion Share & Requests

no code implementations • 28 Apr 2023 • Björn W. Schuller, Anton Batliner, Shahin Amiriparian, Alexander Barnhill, Maurice Gerczuk, Andreas Triantafyllopoulos, Alice Baird, Panagiotis Tzirakis, Chris Gagne, Alan S. Cowen, Nikola Lackovic, Marie-José Caraty, Claude Montacié

The ACM Multimedia 2023 Computational Paralinguistics Challenge addresses two different problems for the first time in a research competition under well-defined conditions: In the Emotion Share Sub-Challenge, a regression on speech has to be made; and in the Requests Sub-Challenge, requests and complaints need to be detected.

regression

Integrating Generative Artificial Intelligence in Intelligent Vehicle Systems

no code implementations • 15 May 2023 • Lukas Stappen, Jeremy Dillmann, Serena Striegel, Hans-Jörg Vögel, Nicolas Flores-Herr, Björn W. Schuller

This paper aims to serve as a comprehensive guide for researchers and practitioners, offering insights into the current state, potential applications, and future research directions for generative artificial intelligence and foundation models within the context of intelligent vehicles.

Ethics

Can ChatGPT's Responses Boost Traditional Natural Language Processing?

1 code implementation • 6 Jul 2023 • Mostafa M. Amin, Erik Cambria, Björn W. Schuller

In this work, we extend this by exploring if ChatGPT has novel knowledge that would enhance existing specialised models when they are fused together.

Language Modelling Sentiment Analysis

A Wide Evaluation of ChatGPT on Affective Computing Tasks

no code implementations • 26 Aug 2023 • Mostafa M. Amin, Rui Mao, Erik Cambria, Björn W. Schuller

In this work, we widely study the capabilities of the ChatGPT models, namely GPT-4 and GPT-3.5, on 13 affective computing problems, namely aspect extraction, aspect polarity classification, opinion extraction, sentiment analysis, sentiment intensity ranking, emotions intensity ranking, suicide tendency detection, toxicity detection, well-being assessment, engagement measurement, personality assessment, sarcasm detection, and subjectivity detection.

Aspect Extraction Sarcasm Detection +1

Testing Speech Emotion Recognition Machine Learning Models

no code implementations • 11 Dec 2023 • Anna Derington, Hagen Wierstorf, Ali Özkil, Florian Eyben, Felix Burkhardt, Björn W. Schuller

Machine learning models for speech emotion recognition (SER) can be trained for different tasks and are usually evaluated on the basis of a few available datasets per task.

Fairness Speech Emotion Recognition

STAA-Net: A Sparse and Transferable Adversarial Attack for Speech Emotion Recognition

no code implementations • 2 Feb 2024 • Yi Chang, Zhao Ren, Zixing Zhang, Xin Jing, Kun Qian, Xi Shao, Bin Hu, Tanja Schultz, Björn W. Schuller

Speech contains rich information on the emotions of humans, and Speech Emotion Recognition (SER) has been an important topic in the area of human-computer interaction.

Adversarial Attack Speech Emotion Recognition

On Prompt Sensitivity of ChatGPT in Affective Computing

no code implementations • 20 Mar 2024 • Mostafa M. Amin, Björn W. Schuller

Recent studies have demonstrated the emerging capabilities of foundation models like ChatGPT in several fields, including affective computing.

Prompt Engineering Sarcasm Detection +2

Enhancing Suicide Risk Assessment: A Speech-Based Automated Approach in Emergency Medicine

no code implementations • 18 Apr 2024 • Shahin Amiriparian, Maurice Gerczuk, Justina Lutz, Wolfgang Strube, Irina Papazova, Alkomiet Hasan, Alexander Kathan, Björn W. Schuller

The metadata integration yields a balanced accuracy of 94.4%, marking an absolute improvement of 28.2%, demonstrating the efficacy of our proposed approaches for automatic suicide risk assessment in emergency medicine.

Binary Classification
