no code implementations • LChange (ACL) 2022 • Iiro Rastas, Yann Ciarán Ryan, Iiro Tiihonen, Mohammadreza Qaraei, Liina Repo, Rohit Babbar, Eetu Mäkelä, Mikko Tolonen, Filip Ginter
In this paper, we describe a BERT model trained on the Eighteenth Century Collections Online (ECCO) dataset of digitized documents.
Optical Character Recognition (OCR)
no code implementations • 17 Aug 2021 • Jenna Kanerva, Filip Ginter, Li-Hsin Chang, Iiro Rastas, Valtteri Skantsi, Jemina Kilpeläinen, Hanna-Mari Kupari, Aurora Piirto, Jenna Saarni, Maija Sevón, Otto Tarkka
This document describes the annotation guidelines used to construct the Turku Paraphrase Corpus.
no code implementations • 23 Apr 2021 • Li-Hsin Chang, Iiro Rastas, Sampo Pyysalo, Filip Ginter
Essays, as a form of assessment, test student knowledge at a deeper level than short-answer and multiple-choice questions.
1 code implementation • NoDaLiDa 2021 • Jenna Kanerva, Filip Ginter, Li-Hsin Chang, Iiro Rastas, Valtteri Skantsi, Jemina Kilpeläinen, Hanna-Mari Kupari, Jenna Saarni, Maija Sevón, Otto Tarkka
Out of all paraphrase pairs in our corpus, 98% are manually classified as paraphrases at least in their given context, if not in all contexts.