no code implementations • NAACL (BEA) 2022 • Judit Casademont Moner, Elena Volodina
We report on our work in progress on generating a synthetic error dataset for Swedish by replicating errors observed in an authentic error-annotated dataset.
no code implementations • WS (NoDaLiDa) 2019 • David Alfter, Therese Lindström Tiedemann, Elena Volodina
This article is a report from an ongoing project aiming at analyzing lexical and grammatical competences of Swedish as a second language (L2).
no code implementations • NoDaLiDa 2021 • Elena Volodina, Yousuf Ali Mohammed, Therese Lindström Tiedemann
The paper introduces a new resource, CoDeRooMor, for studying the morphology of modern Swedish word formation.
no code implementations • 30 Aug 2023 • Elena Volodina, Simon Dobnik, Therese Lindström Tiedemann, Xuan-Son Vu
Accessibility of research data is critical for advances in many research fields, but textual data often cannot be shared due to the personal and sensitive information it contains, e.g., names or political opinions.
no code implementations • 17 Jun 2022 • David Alfter, Therese Lindström Tiedemann, Elena Volodina
In this study we investigate to which degree experts and non-experts agree on questions of difficulty in a crowdsourcing experiment.
no code implementations • 14 May 2021 • Elena Volodina, Yousuf Ali Mohammed, Julia Klezl
We present DaLAJ 1.0, a Dataset for Linguistic Acceptability Judgments for Swedish, comprising 9 596 sentences in its first version; and the initial experiment using it for the binary classification task.
no code implementations • COLING 2020 • Elena Volodina, Yousuf Ali Mohammed, Sandra Derbring, Arild Matsson, Beata Megyesi
The process includes three steps: identification of personal information in an unstructured text, labeling for a category, and pseudonymization.
no code implementations • WS 2018 • Beáta Megyesi, Lena Granstedt, Sofia Johansson, Julia Prentice, Dan Rosén, Carl-Johan Schenström, Gunlög Sundberg, Mats Wirén, Elena Volodina
no code implementations • WS 2018 • Ildikó Pilán, Elena Volodina
We present the results of our investigations aiming at identifying the most informative linguistic complexity features for classifying language learning levels in three different datasets.
no code implementations • COLING 2018 • Ildikó Pilán, Elena Volodina
The presence of misspellings and other errors or non-standard word forms poses a considerable challenge for NLP systems.
no code implementations • WS 2018 • David Alfter, Elena Volodina
In this paper we present work-in-progress where we investigate the usefulness of previously created word lists to the task of single-word lexical complexity analysis and prediction of the complexity level for learners of Swedish as a second language.
no code implementations • 12 Jun 2017 • Ildikó Pilán, Elena Volodina, Lars Borin
We present a framework and its implementation relying on Natural Language Processing methods, which aims at the identification of exercise item candidates from corpora.
no code implementations • WS 2016 • Ildikó Pilán, David Alfter, Elena Volodina
We bring together knowledge from two different types of language learning data, texts learners read and texts they write, to improve linguistic complexity classification in the latter.
no code implementations • COLING 2016 • Ildikó Pilán, Elena Volodina, Torsten Zesch
The lack of a sufficient amount of data tailored for a task is a well-recognized problem for many statistical NLP methods.
no code implementations • LREC 2016 • Thomas François, Elena Volodina, Ildikó Pilán, Anaïs Tack
The paper introduces SVALex, a lexical resource primarily aimed at learners and teachers of Swedish as a foreign and second language that describes the distribution of 15,681 words and expressions across the Common European Framework of Reference (CEFR).
1 code implementation • LREC 2016 • Elena Volodina, Ildikó Pilán, Ingegerd Enström, Lorena Llozhi, Peter Lundkvist, Gunlög Sundberg, Monica Sandell
Inter-rater agreement is presented on the basis of the SW1203 subcorpus.
no code implementations • 29 Mar 2016 • Ildikó Pilán, Sowmya Vajjala, Elena Volodina
Corpora and web texts can become a rich language learning resource if we have a means of assessing whether they are linguistically appropriate for learners at a given proficiency level.
no code implementations • LREC 2014 • Ildikó Pilán, Elena Volodina
In this article we present the first experiences of reusing the Swedish FrameNet (SweFN) as a resource for training semantic roles.
no code implementations • LREC 2014 • Elena Volodina, Ildikó Pilán, Lars Borin, Therese Lindström Tiedemann
We present Lärka, the language learning platform of Språkbanken (the Swedish Language Bank).
no code implementations • LREC 2012 • Elena Volodina, Sofie Johansson Kokkinakis
We provide a short description of the KELLY project; examine the methodological approach and mention some details on the compiling of the corpus from which the list has been derived.