Sentence Similarity Learning by Lexical Decomposition and Composition

COLING 2016 · Zhiguo Wang, Haitao Mi, Abraham Ittycheriah

Most conventional sentence similarity methods focus only on the similar parts of two input sentences and simply ignore the dissimilar parts, which often carry important semantic cues about the sentences. In this work, we propose a model that takes both similarities and dissimilarities into account by decomposing and composing lexical semantics over sentences. The model represents each word as a vector and calculates a semantic matching vector for each word based on all words in the other sentence. Each word vector is then decomposed into a similar component and a dissimilar component based on its semantic matching vector. After this, a two-channel CNN is employed to capture features by composing the similar and dissimilar components. Finally, a similarity score is estimated over the composed feature vectors. Experimental results show that our model achieves state-of-the-art performance on the answer sentence selection task and a comparable result on the paraphrase identification task.
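As a rough illustration of the matching and decomposition steps described above, here is a minimal NumPy sketch (not the authors' code). It assumes cosine similarity, "max" matching against the other sentence, and a linear split of each word vector into similar and dissimilar components; the two resulting matrices would then serve as the two input channels of the CNN composition step.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two word vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def semantic_matching_vectors(S, T):
    """For each word vector in S, pick the most similar word vector in T
    ("max" matching). Returns the matching vectors and their similarities."""
    sims = np.array([[cosine(s, t) for t in T] for s in S])  # shape (len(S), len(T))
    best = sims.argmax(axis=1)
    return T[best], sims.max(axis=1)

def linear_decompose(S, alphas):
    """Linear decomposition (one of several possible variants): the similar
    component is alpha * s_i and the dissimilar component is (1 - alpha) * s_i,
    where alpha is the similarity to the matching vector."""
    alphas = alphas[:, None]
    return alphas * S, (1.0 - alphas) * S

# Toy usage with random stand-in embeddings (dimension 4) for two short sentences.
rng = np.random.default_rng(0)
S = rng.normal(size=(3, 4))   # sentence 1: 3 words
T = rng.normal(size=(5, 4))   # sentence 2: 5 words

S_hat, alphas = semantic_matching_vectors(S, T)
S_plus, S_minus = linear_decompose(S, alphas)
# S_plus and S_minus would feed the two channels of the composition CNN.
print(S_plus.shape, S_minus.shape)  # (3, 4) (3, 4)
```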


Datasets

WikiQA
Results from the Paper


Task               | Dataset | Model | Metric | Value  | Global Rank
Question Answering | WikiQA  | LDC   | MAP    | 0.7058 | #13
Question Answering | WikiQA  | LDC   | MRR    | 0.7226 | #13
