MITRE at SemEval-2017 Task 1: Simple Semantic Similarity

SEMEVAL 2017 · John Henderson, Elizabeth Merkhofer, Laura Strickhart, Guido Zarrella ·

This paper describes MITRE{'}s participation in the Semantic Textual Similarity task (SemEval-2017 Task 1), which evaluated machine learning approaches to the identification of similar meaning among text snippets in English, Arabic, Spanish, and Turkish. We detail the techniques we explored ranging from simple bag-of-ngrams classifiers to neural architectures with varied attention and alignment mechanisms. Linear regression is used to tie the systems together into an ensemble submitted for evaluation. The resulting system is capable of matching human similarity ratings of image captions with correlations of 0.73 to 0.83 in monolingual settings and 0.68 to 0.78 in cross-lingual conditions, demonstrating the power of relatively simple approaches.

PDF Abstract