LIM-LIG at SemEval-2017 Task1: Enhancing the Semantic Similarity for Arabic Sentences with Vectors Weighting

SEMEVAL 2017 · El Moatez Billah Nagoudi, J{\'e}r{\'e}my Ferrero, Didier Schwab ·

This article describes our proposed system named LIM-LIG. This system is designed for SemEval 2017 Task1: Semantic Textual Similarity (Track1). LIM-LIG proposes an innovative enhancement to word embedding-based model devoted to measure the semantic similarity in Arabic sentences. The main idea is to exploit the word representations as vectors in a multidimensional space to capture the semantic and syntactic properties of words. IDF weighting and Part-of-Speech tagging are applied on the examined sentences to support the identification of words that are highly descriptive in each sentence. LIM-LIG system achieves a Pearson{'}s correlation of 0.74633, ranking 2nd among all participants in the Arabic monolingual pairs STS task organized within the SemEval 2017 evaluation campaign

PDF Abstract