Hierarchical Pre-training for Sequence Labelling in Spoken Dialog

Sequence labelling tasks like Dialog Act and Emotion/Sentiment identification are a key component of spoken dialog systems. In this work, we propose a new approach to learn generic representations adapted to spoken dialog, which we evaluate on a new benchmark we call Sequence labellIng evaLuatIon benChmark fOr spoken laNguagE benchmark (\texttt{SILICONE})... \texttt{SILICONE} is model-agnostic and contains 10 different datasets of various sizes. We obtain our representations with a hierarchical encoder based on transformer architectures, for which we extend two well-known pre-training objectives. Pre-training is performed on OpenSubtitles, a large corpus of spoken dialog containing over $2.3$ billion tokens. We demonstrate that hierarchical encoders achieve competitive results with consistently fewer parameters than state-of-the-art models, and we show their importance for both pre-training and fine-tuning.
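The two-level data flow of a hierarchical encoder can be illustrated with a minimal, standard-library sketch. This is not the authors' model: mean pooling stands in for the transformer layers, and `embed_token`, `DIM`, and the 0.5/0.5 context mix are hypothetical choices made only so the example runs. What it shows is the structure: tokens are first encoded into one vector per utterance, and the utterance vectors are then contextualised over the whole dialog, giving one state per utterance that a label classifier could consume.

```python
from typing import List

Vector = List[float]
DIM = 4  # hypothetical embedding size, chosen only for illustration


def embed_token(token: str) -> Vector:
    """Toy deterministic embedding (an assumption, not a trained embedding)."""
    total = sum(ord(c) for c in token)
    return [((total * (i + 1)) % 97) / 97.0 for i in range(DIM)]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a non-empty list of DIM-sized vectors into one vector."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(DIM)]


def encode_utterance(tokens: List[str]) -> Vector:
    """Level 1 (word level): tokens of one utterance -> one utterance vector.
    Mean pooling is a stand-in for a word-level transformer encoder."""
    return mean_pool([embed_token(t) for t in tokens])


def encode_dialog(utterance_vectors: List[Vector]) -> List[Vector]:
    """Level 2 (dialog level): mix each utterance vector with dialog context.
    The fixed 0.5/0.5 mix is a stand-in for a dialog-level transformer."""
    context = mean_pool(utterance_vectors)
    return [
        [0.5 * u + 0.5 * c for u, c in zip(vec, context)]
        for vec in utterance_vectors
    ]


def encode(dialog: List[List[str]]) -> List[Vector]:
    """Full hierarchy: one contextualised state per utterance."""
    return encode_dialog([encode_utterance(tokens) for tokens in dialog])


dialog = [["hello", "there"], ["how", "are", "you"], ["fine", "thanks"]]
states = encode(dialog)
print(len(states), "utterance states of size", len(states[0]))
```

Because the second level sees every utterance, the state of the first utterance changes when the rest of the dialog changes, which is the property sequence labelling over dialogs relies on.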

Findings of the Association for Computational Linguistics 2020
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
| --- | --- | --- | --- | --- | --- |
| Emotion Recognition in Conversation | DailyDialog | Pretrained Hierarchical Transformer | Micro-F1 | 60.14 | # 2 |
| Dialogue Act Classification | ICSI Meeting Recorder Dialog Act (MRDA) corpus | Pretrained Hierarchical Transformer | Accuracy | 92.4 | # 1 |
| Emotion Recognition in Conversation | IEMOCAP | Pretrained Hierarchical Transformer | Weighted-F1 | 65.30 | # 6 |
| Emotion Recognition in Conversation | MELD | Pretrained Hierarchical Transformer | Weighted-F1 | 61.90 | # 6 |
| Emotion Recognition in Conversation | SEMAINE | Pretrained Hierarchical Transformer | MAE (Valence) | 0.16 | # 2 |
| Emotion Recognition in Conversation | SEMAINE | Pretrained Hierarchical Transformer | MAE (Arousal) | 0.16 | # 2 |
| Emotion Recognition in Conversation | SEMAINE | Pretrained Hierarchical Transformer | MAE (Expectancy) | 0.16 | # 1 |
| Emotion Recognition in Conversation | SEMAINE | Pretrained Hierarchical Transformer | MAE (Power) | 7.70 | # 2 |
| Text Classification | SILICONE Benchmark | Pretrained Hierarchical Transformer | 1:1 Accuracy | 71.25 | # 1 |
| Dialogue Act Classification | Switchboard corpus | Pretrained Hierarchical Transformer | Accuracy | 79.2 | # 6 |
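The results above are reported with several different metrics (Micro-F1, Weighted-F1, MAE). As a reading aid, here is a minimal sketch of their textbook definitions in plain Python. The labels and numbers are toy values invented for illustration, and the exact evaluation protocol used for each dataset (for example, which classes are included in the average) is not stated here, so this should not be read as the evaluation code behind the reported scores.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def counts(y_true: Sequence[str], y_pred: Sequence[str], label: str) -> Tuple[int, int, int]:
    """True positives, false positives, false negatives for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    return tp, fp, fn


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(y_true: Sequence[str], y_pred: Sequence[str], labels: List[str]) -> float:
    """Pool TP/FP/FN over the given labels, then compute a single F1."""
    totals = [counts(y_true, y_pred, label) for label in labels]
    return f1(*(sum(col) for col in zip(*totals)))


def weighted_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Per-class F1 averaged with weights equal to each class's support."""
    support = Counter(y_true)
    return sum(
        n * f1(*counts(y_true, y_pred, label)) for label, n in support.items()
    ) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error for continuous targets."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels, invented for illustration only.
gold = ["joy", "joy", "joy", "anger"]
pred = ["joy", "joy", "anger", "anger"]
micro = micro_f1(gold, pred, ["joy", "anger"])
weighted = weighted_f1(gold, pred)
error = mae([0.1, 0.5], [0.2, 0.3])
print(round(micro, 4), round(weighted, 4), round(error, 4))
```

Micro-F1 pools all decisions before computing F1, so frequent classes dominate it; Weighted-F1 computes F1 per class first and weights by class frequency; MAE applies to continuous targets and is lower-is-better, unlike the other two.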
