Data Augmentation by Data Noising for Open-vocabulary Slots in Spoken Language Understanding

NAACL 2019  ·  Hwa-Yeon Kim, Yoon-Hyung Roh, Young-Kil Kim

One of the main challenges in Spoken Language Understanding (SLU) is dealing with 'open-vocabulary' slots. Neural network-based SLU models have recently been proposed, but it is still difficult to recognize slots containing unknown words, or 'open-vocabulary' slots, because of the high cost of creating a manually tagged SLU dataset. This paper proposes data noising, which reflects the characteristics of 'open-vocabulary' slots, for data augmentation. We applied it to an attention-based bi-directional recurrent neural network (Liu and Lane, 2016) and experimented with three datasets: Airline Travel Information System (ATIS), Snips, and MIT-Restaurant. We achieved performance improvements of up to 0.57% and 3.25 in intent prediction (accuracy) and slot filling (F1-score), respectively. Our method is advantageous because it requires no additional memory and can be applied during the model's training process.
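The abstract does not spell out the noising scheme, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes BIO-style slot tags and replaces tokens inside designated open-vocabulary slots with random vocabulary words while leaving the tags untouched. The function name `noise_open_vocab_slots`, the parameters `open_slots`, `vocab`, and `p`, and the example slot names are all hypothetical.

```python
import random

def noise_open_vocab_slots(tokens, tags, open_slots, vocab, p=0.5, rng=random):
    """Return a noised copy of `tokens`; the slot tags are left unchanged.

    Assumption (not from the paper): each token inside an open-vocabulary
    slot is independently replaced, with probability `p`, by a word drawn
    uniformly from `vocab`.
    """
    noised = []
    for token, tag in zip(tokens, tags):
        # BIO tags look like "B-track" / "I-track" / "O".
        slot = tag[2:] if tag[:2] in ("B-", "I-") else None
        if slot in open_slots and rng.random() < p:
            noised.append(rng.choice(vocab))
        else:
            noised.append(token)
    return noised

# Hypothetical utterance with two slots; only "track" is treated as open-vocabulary.
tokens = ["play", "hey", "jude", "by", "the", "beatles"]
tags = ["O", "B-track", "I-track", "O", "B-artist", "I-artist"]
vocab = ["blue", "river", "night", "road"]

rng = random.Random(0)  # seeded only to make the demo repeatable
print(noise_open_vocab_slots(tokens, tags, {"track"}, vocab, p=0.5, rng=rng))
```

Because the noised sentence is generated on the fly each time an example is drawn, nothing extra has to be stored, which is in the spirit of the abstract's claim that the method needs no additional memory and can run alongside training.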
