1 code implementation • 23 Nov 2022 • Hugo Carneiro, Cornelius Weber, Stefan Wermter
Finally, we devise a model for emotion recognition in conversations trained on the realigned MELD-FAIR videos, which outperforms state-of-the-art models for ERC based on vision alone.
Automatic Speech Recognition (ASR)
no code implementations • 2 Nov 2021 • Di Fu, Fares Abawi, Hugo Carneiro, Matthias Kerzel, Ziwei Chen, Erik Strahl, Xun Liu, Stefan Wermter
Our saliency prediction model was trained to detect social cues, predict audio-visual saliency, and attend selectively for the robot study.
no code implementations • 1 Sep 2021 • Hugo Carneiro, Cornelius Weber, Stefan Wermter
The strong relation between face and voice can aid active speaker detection systems when faces are visible, even in difficult settings such as when a speaker's face is unclear or when several people appear in the same scene.