Dimensional emotion recognition using visual and textual cues

3 May 2018  ·  Pedro M. Ferreira, Diogo Pernes, Kelwin Fernandes, Ana Rebelo, Jaime S. Cardoso ·

This paper addresses the problem of automatic emotion recognition in the scope of the One-Minute Gradual-Emotional Behavior challenge (OMG-Emotion challenge). The underlying objective of the challenge is the automatic estimation of emotion expressions in the two-dimensional emotion representation space (i.e., arousal and valence). The adopted methodology is a weighted ensemble of several models from both video and text modalities. For video-based recognition, two different types of visual cues (i.e., face and facial landmarks) were considered to feed a multi-input deep neural network. Regarding the text modality, a sequential model based on a simple recurrent architecture was implemented. In addition, we also introduce a model based on high-level features in order to embed domain knowledge in the learning process. Experimental results on the OMG-Emotion validation set demonstrate the effectiveness of the implemented ensemble model as it clearly outperforms the current baseline methods.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here