Semi-supervised Intent Discovery with Contrastive Learning

User intent discovery is a key step in developing a Natural Language Understanding (NLU) module at the core of any modern Conversational AI system. Typically, human experts review a representative sample of user input data to discover new intents, which is subjective, costly, and error-prone. In this work, we aim to assist the NLU developers by presenting a novel method for discovering new intents at scale given a corpus of utterances. Our method utilizes supervised contrastive learning to leverage information from a domain-relevant, already labeled dataset and identifies new intents in the corpus at hand using unsupervised K-means clustering. Our method outperforms the state-of-the-art by a large margin up to 2% and 13% on two benchmark datasets, measured by clustering accuracy. Furthermore, we apply our method on a large dataset from the travel domain to demonstrate its effectiveness on a real-world use case.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods