no code implementations • EMNLP (LaTeCH-CLfL) 2021 • Maria Kunilovskaya, Ekaterina Lapshinova-Koltunski, Ruslan Mitkov
We expect that literary translations from typologically distant languages exhibit more translationese, and that the fingerprints of individual source languages (and their families) are traceable in translations.
no code implementations • RANLP 2021 • Maria Kunilovskaya, Alistair Plum
This paper focuses on data cleaning as part of a preprocessing procedure applied to text data retrieved from the web.
no code implementations • RANLP 2021 • Maria Kunilovskaya, Ekaterina Lapshinova-Koltunski, Ruslan Mitkov
The texts are represented with frequency-based features that capture structural and lexical properties of language.
no code implementations • LREC 2020 • Maria Kunilovskaya, Ekaterina Lapshinova-Koltunski
This research employs genre-comparable data from a number of parallel and comparable corpora to explore the specificity of translations from English into German and Russian produced by students and professional translators.
no code implementations • RANLP 2019 • Maria Kunilovskaya, Ekaterina Lapshinova-Koltunski
We use a range of morpho-syntactic features inspired by research in register studies (e.g. Biber, 1995; Neumann, 2013) and translation studies (e.g. Ilisei et al., 2010; Zanettin, 2013; Kunilovskaya and Kutuzov, 2018) to reveal the association between translationese and human translation quality.
no code implementations • RANLP 2019 • Maria Kunilovskaya, Serge Sharoff
We exploit a text-external approach, based on a set of Functional Text Dimensions, to model text functions, so that each text can be represented as a vector in a multidimensional space of text functions.
1 code implementation • 19 Jan 2018 • Andrey Kutuzov, Maria Kunilovskaya
Aside from the already known fact that the RNC is generally a better training corpus than web corpora, we enumerate and explain fine differences in how the models handle the semantic similarity task, which parts of the evaluation set are difficult for particular models, and why.