no code implementations • 15 Feb 2018 • Caroline Etienne, Guillaume Fidanza, Andrei Petrovskii, Laurence Devillers, Benoit Schmauch
Following the latest advances in audio analysis, we use an architecture involving both convolutional layers, which extract high-level features from raw spectrograms, and recurrent layers, which aggregate long-term temporal dependencies.
Ranked #6 on Speech Emotion Recognition on IEMOCAP (UA metric)
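The convolutional-plus-recurrent design described above can be sketched in PyTorch as follows. This is a minimal illustrative model, not the authors' configuration: the layer counts, channel widths, and the 4-class output are assumptions, and `ConvRecurrentSER` is a hypothetical name. Convolutional blocks extract local time-frequency features from the spectrogram, and a bidirectional LSTM aggregates them over the time axis before classification.

```python
import torch
import torch.nn as nn

class ConvRecurrentSER(nn.Module):
    """Hypothetical CNN + BiLSTM sketch for speech emotion recognition.

    Input: a batch of log-mel spectrograms shaped (batch, 1, n_mels, time).
    Conv layers extract local features; a bidirectional LSTM aggregates
    them over time; a linear head predicts the emotion class.
    Sizes are illustrative, not taken from the paper.
    """

    def __init__(self, n_mels=64, n_classes=4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d((2, 2)),  # halve frequency and time resolution
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d((2, 2)),
        )
        feat = 32 * (n_mels // 4)  # channels * pooled frequency bins
        self.rnn = nn.LSTM(feat, 64, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * 64, n_classes)

    def forward(self, spec):
        x = self.conv(spec)                   # (B, C, F', T')
        x = x.permute(0, 3, 1, 2).flatten(2)  # (B, T', C * F') time-major
        _, (h, _) = self.rnn(x)               # final hidden states
        h = torch.cat([h[-2], h[-1]], dim=1)  # concat forward/backward states
        return self.fc(h)                     # (B, n_classes) logits

model = ConvRecurrentSER()
logits = model(torch.randn(2, 1, 64, 100))  # 2 spectrograms, 100 frames
print(logits.shape)  # torch.Size([2, 4])
```

The convolution stack reduces the frequency axis before the LSTM, so the recurrent layer sees a compact per-frame feature vector rather than raw spectrogram bins; this keeps the sequence model small while preserving temporal structure.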