Speech Emotion Recognition
100 papers with code • 14 benchmarks • 18 datasets
Speech Emotion Recognition is a task of speech processing and computational paralinguistics that aims to recognize and categorize the emotions expressed in spoken language. The goal is to determine the emotional state of a speaker, such as happiness, anger, sadness, or frustration, from their speech patterns, such as prosody, pitch, and rhythm.
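As an illustration of the prosodic cues mentioned above (pitch, energy), the sketch below estimates a frame's fundamental frequency via autocorrelation and its mean energy. This is a minimal, assumption-laden example: the sampling rate, pitch search range, and the synthetic test tone are illustrative choices, not taken from any of the papers listed here.

```python
import numpy as np

def frame_pitch(frame, sr, fmin=75.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) of one voiced frame
    by locating the autocorrelation peak in a plausible pitch range."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)   # lag bounds for the search
    lag = lo + np.argmax(ac[lo:hi])
    return sr / lag

# A synthetic 200 Hz tone stands in for a voiced speech frame.
sr = 16000
t = np.arange(sr // 10) / sr                  # 100 ms of signal
tone = np.sin(2 * np.pi * 200 * t)

f0 = frame_pitch(tone, sr)                    # pitch feature
energy = float(np.mean(tone ** 2))            # a simple energy feature
```

Real SER systems extract many such frame-level descriptors (pitch contours, energy, MFCCs) and feed their statistics or sequences to a classifier.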
For multimodal emotion recognition, please upload your results to the Multimodal Emotion Recognition on IEMOCAP benchmark.
Libraries
Use these libraries to find Speech Emotion Recognition models and implementations
Subtasks
Most implemented papers
Visualization and Interpretation of Latent Spaces for Controlling Expressive Speech Synthesis through Audio Analysis
The field of Text-to-Speech has seen huge improvements in recent years, benefiting from deep learning techniques.
Attention-Augmented End-to-End Multi-Task Learning for Emotion Prediction from Speech
Despite the increasing research interest in end-to-end learning systems for speech emotion recognition, conventional systems either suffer from overfitting, due in part to limited training data, or do not explicitly consider the different contributions of automatically learnt representations to a specific task.
An Interaction-aware Attention Network for Speech Emotion Recognition in Spoken Dialogs
In this work, we propose an interaction-aware attention network (IAAN) that incorporates contextual information into the learned vocal representation through a novel attention mechanism.
Learning Alignment for Multimodal Emotion Recognition from Speech
Further, although emotion recognition can benefit from audio-textual multimodal information, it is not trivial to build a system that learns from multiple modalities.
Speech Emotion Recognition Using Speech Feature and Word Embedding
Text features can be combined with speech features to improve emotion recognition accuracy, and both features can be obtained from speech.
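The combination described above is often realized as early fusion: the speech feature vector and a text-derived vector (e.g. averaged word embeddings of the transcript) are concatenated before classification. The dimensions and random features below are purely hypothetical placeholders, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pre-computed features for one utterance (shapes are illustrative).
speech_feat = rng.random(40)    # e.g. utterance-level acoustic statistics
text_feat = rng.random(300)     # e.g. averaged word embeddings of the transcript

# Early fusion: concatenate into one vector for a downstream classifier.
fused = np.concatenate([speech_feat, text_feat])
```

Late fusion, by contrast, would train separate classifiers per modality and combine their posterior scores.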
Attentive Modality Hopping Mechanism for Speech Emotion Recognition
In this work, we explore the impact of visual modality in addition to speech and text for improving the accuracy of the emotion detection system.
Non-linear Neurons with Human-like Apical Dendrite Activations
In order to classify linearly non-separable data, neurons are typically organized into multi-layer neural networks that are equipped with at least one hidden layer.
Speech emotion recognition with deep convolutional neural networks
Speech emotion recognition (or classification) is one of the most challenging topics in data science.
Evaluation of Error and Correlation-Based Loss Functions For Multitask Learning Dimensional Speech Emotion Recognition
The choice of a loss function is a critical part in machine learning.
On The Differences Between Song and Speech Emotion Recognition: Effect of Feature Sets, Feature Types, and Classifiers
In this paper, we evaluate different feature sets, feature types, and classifiers on both song and speech emotion recognition.