BL.Research at SemEval-2022 Task 8: Using various Semantic Information to evaluate document-level Semantic Textual Similarity

SemEval (NAACL) 2022 · Sebastien Dufour, Mohamed Mehdi Kandi, Karim Boutamine, Camille Gosse, Mokhtar Boumedyen Billami, Christophe Bortolaso, Youssef Miloudi

This paper presents our system for document-level semantic textual similarity (STS) evaluation at SemEval-2022 Task 8: “Multilingual News Article Similarity”. The semantic information comes from several semantic models, ranging from key-term and named-entity extraction to document classification and similarity computed over automatic summaries of the documents. All of this semantic information is then used as features for a supervised system that estimates the degree of similarity of a pair of documents. We obtained a Pearson correlation score of 0.706, compared with the best score of 0.818 among the teams that participated in this task.
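
To make the feature and evaluation setup concrete, the following is a minimal sketch, not the authors' implementation. It assumes that one feature could be a simple set overlap between the key terms (or named entities) of two documents, and it shows how a Pearson correlation between predicted and gold similarity scores is computed. The function names `jaccard` and `pearson` and all data values are hypothetical and chosen only for illustration.

```python
from math import sqrt


def jaccard(a, b):
    """Set overlap between two collections of strings.

    Assumption: used here as a stand-in for one possible pairwise feature
    (e.g. shared key terms or named entities); the abstract does not
    specify the actual feature definitions.
    """
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 0.0  # two empty sets: treat as no evidence of overlap
    return len(set_a & set_b) / len(set_a | set_b)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical named entities extracted from two news articles.
entities_doc1 = ["Paris", "UNESCO", "France"]
entities_doc2 = ["France", "Paris", "Louvre"]
entity_overlap = jaccard(entities_doc1, entities_doc2)

# Hypothetical system predictions and gold similarity annotations.
predicted = [1.2, 3.4, 2.0, 3.9]
gold = [1.0, 3.0, 2.5, 4.0]
score = pearson(predicted, gold)
```

In a full system, several such features per document pair would be passed to a supervised model, and `pearson` would be applied to that model's predictions on held-out pairs.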
