Search Results for author: Li-Hsin Chang

Found 10 papers, 2 papers with code

Fine-grained Named Entity Annotation for Finnish

no code implementations NoDaLiDa 2021 Jouni Luoma, Li-Hsin Chang, Filip Ginter, Sampo Pyysalo

We introduce a corpus with fine-grained named entity annotation for Finnish, following the OntoNotes guidelines to create a resource that is cross-lingually compatible with existing annotations for other languages.

NER

Towards Automatic Short Answer Assessment for Finnish as a Paraphrase Retrieval Task

no code implementations NAACL (BEA) 2022 Li-Hsin Chang, Jenna Kanerva, Filip Ginter

Automatic grouping of textual answers has the potential of allowing batch grading, but is challenging because the answers, especially longer essays, have many claims.

Paraphrase Identification Retrieval +2

TallVocabL2Fi: A Tall Dataset of 15 Finnish L2 Learners’ Vocabulary

no code implementations LREC 2022 Frankie Robertson, Li-Hsin Chang, Sini Söyrinki

Previous work concerning measurement of second language learners has tended to focus on the knowledge of small numbers of words, often geared towards measuring vocabulary size.

Towards better structured and less noisy Web data: Oscar with Register annotations

no code implementations COLING (WNUT) 2022 Veronika Laippala, Anna Salmela, Samuel Rönnqvist, Alham Fikri Aji, Li-Hsin Chang, Asma Dhifallah, Larissa Goulart, Henna Kortelainen, Marc Pàmies, Deise Prina Dutra, Valtteri Skantsi, Lintang Sutawika, Sampo Pyysalo

Web-crawled datasets are known to be noisy, as they feature a wide range of language use covering both user-generated and professionally edited content as well as noise originating from the crawling process.

Semantic Search as Extractive Paraphrase Span Detection

1 code implementation 9 Dec 2021 Jenna Kanerva, Hanna Kitti, Li-Hsin Chang, Teemu Vahtola, Mathias Creutz, Filip Ginter

In this paper, we approach the problem of semantic search by framing the search task as paraphrase span detection, i.e., given a segment of text as a query phrase, the task is to identify its paraphrase in a given document, the same modelling setup as typically used in extractive question answering.

Extractive Question-Answering Question Answering +5

Quantitative Evaluation of Alternative Translations in a Corpus of Highly Dissimilar Finnish Paraphrases

no code implementations MoTra (NoDaLiDa) 2021 Li-Hsin Chang, Sampo Pyysalo, Jenna Kanerva, Filip Ginter

In this paper, we present a quantitative evaluation of differences between alternative translations in a large, recently released Finnish paraphrase corpus, focusing in particular on non-trivial variation in translation.

Translation

Deep learning for sentence clustering in essay grading support

no code implementations 23 Apr 2021 Li-Hsin Chang, Iiro Rastas, Sampo Pyysalo, Filip Ginter

Essays as a form of assessment test student knowledge on a deeper level than short answer and multiple-choice questions.

Clustering Deep Learning +2

Finnish Paraphrase Corpus

1 code implementation NoDaLiDa 2021 Jenna Kanerva, Filip Ginter, Li-Hsin Chang, Iiro Rastas, Valtteri Skantsi, Jemina Kilpeläinen, Hanna-Mari Kupari, Jenna Saarni, Maija Sevón, Otto Tarkka

Out of all paraphrase pairs in our corpus, 98% are manually classified as paraphrases at least in their given context, if not in all contexts.

Towards Fully Bilingual Deep Language Modeling

no code implementations 22 Oct 2020 Li-Hsin Chang, Sampo Pyysalo, Jenna Kanerva, Filip Ginter

Language models based on deep neural networks have facilitated great advances in natural language processing and understanding tasks in recent years.

Cross-Lingual Transfer Language Modelling
