Dialogue Act Sequence Labeling using Hierarchical encoder with CRF

13 Sep 2017  ·  Harshit Kumar, Arvind Agarwal, Riddhiman Dasgupta, Sachindra Joshi, Arun Kumar

Dialogue act recognition associates dialogue acts (i.e., semantic labels) with utterances in a conversation. Associating semantic labels with utterances can be treated as a sequence labeling problem. In this work, we build a hierarchical recurrent neural network using a bidirectional LSTM as the base unit and a conditional random field (CRF) as the top layer to classify each utterance into its corresponding dialogue act. The hierarchical network learns representations at multiple levels, i.e., word level, utterance level, and conversation level. The conversation-level representations are input to the CRF layer, which takes into account not only all previous utterances but also their dialogue acts, thus modeling the dependencies among both labels and utterances, an important consideration in natural dialogue. We validate our approach on two benchmark data sets, Switchboard and Meeting Recorder Dialogue Act, and show improvements over the state-of-the-art methods of $2.2\%$ and $4.1\%$ absolute points, respectively. It is worth noting that the inter-annotator agreement on the Switchboard data set is $84\%$, and our method achieves an accuracy of about $79\%$ despite being trained on noisy data.
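
To illustrate why a CRF top layer can change predictions relative to labeling each utterance independently, here is a minimal sketch of Viterbi decoding for a linear-chain CRF. It is not the authors' implementation: the function name `viterbi_decode`, the three label names, and all scores are hypothetical. The emission scores stand in for per-utterance label scores that would, in the described model, be derived from the conversation-level representations.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label sequence of a linear-chain CRF.

    emissions[t][j]   -- score of label j for utterance t
    transitions[i][j] -- score of label i being followed by label j
    """
    if not emissions:
        return []
    num_labels = len(emissions[0])
    score = list(emissions[0])   # best score of any path ending in each label
    backpointers = []            # backpointers[t-1][j]: best predecessor of j at step t
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_labels):
            best_prev = max(range(num_labels),
                            key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transitions[best_prev][j]
                             + emissions[t][j])
        score = new_score
        backpointers.append(pointers)
    # Follow the backpointers from the best final label to recover the path.
    path = [max(range(num_labels), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical labels and scores, chosen only for illustration.
labels = ["question", "answer", "statement"]
emissions = [
    [2.0, 0.0, 0.5],   # utterance 0: clearly a question
    [0.0, 1.0, 1.2],   # utterance 1: ambiguous, slightly favors "statement"
    [0.1, 0.2, 1.5],   # utterance 2: clearly a statement
]
# Assumed transition scores: a question tends to be followed by an answer.
transitions = [
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
no_transitions = [[0.0] * 3 for _ in range(3)]

with_crf = viterbi_decode(emissions, transitions)
independent = viterbi_decode(emissions, no_transitions)
print([labels[i] for i in with_crf])
print([labels[i] for i in independent])
```

With the assumed transition scores, the ambiguous second utterance is labeled "answer" because it follows a question; with all transition scores set to zero, decoding reduces to a per-utterance argmax and the same utterance is labeled "statement".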

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Dialogue Act Classification | ICSI Meeting Recorder Dialog Act (MRDA) corpus | Bi-LSTM-CRF | Accuracy | 90.9 | # 6 |
| Dialogue Act Classification | Switchboard corpus | Bi-LSTM-CRF | Accuracy | 79.2 | # 7 |