no code implementations • LREC 2022 • Dimitrios Roussis, Vassilis Papavassiliou, Sokratis Sofianopoulos, Prokopis Prokopidis, Stelios Piperidis
This paper presents a collection of parallel corpora generated by exploiting the COVID-19 related dataset of metadata created with the Europe Media Monitor (EMM) / Medical Information System (MediSys) processing chain of news articles.
no code implementations • WS 2018 • Vassilis Papavassiliou, Sokratis Sofianopoulos, Prokopis Prokopidis, Stelios Piperidis
We also discuss alternative methods for ranking the sentence pairs of the most appropriate clusters with the aim of generating the two datasets (of 10 and 100 million words as required in the task) that were evaluated.