Search Results for author: Andrey Kutuzov

Found 38 papers, 10 papers with code

Multilingual ELMo and the Effects of Corpus Sampling

no code implementations NoDaLiDa 2021 Vinit Ravishankar, Andrey Kutuzov, Lilja Øvrelid, Erik Velldal

Multilingual pretrained language models are rapidly gaining popularity in NLP systems for non-English languages.

Pretrained Language Models

Do Not Fire the Linguist: Grammatical Profiles Help Language Models Detect Semantic Change

no code implementations LChange (ACL) 2022 Mario Giulianelli, Andrey Kutuzov, Lidia Pivovarova

In this work, we explore whether large pre-trained contextualised language models, a common tool for lexical semantic change detection, are sensitive to such morphosyntactic changes.

Change Detection · Language Modelling

Large-Scale Contextualised Language Modelling for Norwegian

2 code implementations NoDaLiDa 2021 Andrey Kutuzov, Jeremy Barnes, Erik Velldal, Lilja Øvrelid, Stephan Oepen

We present the ongoing NorLM initiative to support the creation and use of very large contextualised language models for Norwegian (and in principle other Nordic languages), including a ready-to-use software environment, as well as an experience report for data preparation and training.

Language Modelling

Representing ELMo embeddings as two-dimensional text online

no code implementations EACL 2021 Andrey Kutuzov, Elizaveta Kuzmenko

We describe a new addition to the WebVectors toolkit which is used to serve word embedding models over the Web.

RuSemShift: a dataset of historical lexical semantic change in Russian

no code implementations COLING 2020 Julia Rodina, Andrey Kutuzov

We present RuSemShift, a large-scale manually annotated test set for the task of semantic change modeling in Russian for two long-term time period pairs: from the pre-Soviet through the Soviet times and from the Soviet through the post-Soviet times.

ELMo and BERT in semantic change detection for Russian

no code implementations 7 Oct 2020 Julia Rodina, Yuliya Trofimova, Andrey Kutuzov, Ekaterina Artemova

We study the effectiveness of contextualized embeddings for the task of diachronic semantic change detection for Russian language data.

Change Detection

Word Sense Disambiguation for 158 Languages using Word Embeddings Only

no code implementations LREC 2020 Varvara Logacheva, Denis Teslenko, Artem Shelmanov, Steffen Remus, Dmitry Ustalov, Andrey Kutuzov, Ekaterina Artemova, Chris Biemann, Simone Paolo Ponzetto, Alexander Panchenko

We use this method to induce a collection of sense inventories for 158 languages on the basis of the original pre-trained fastText word embeddings by Grave et al. (2018), enabling WSD in these languages.

Word Embeddings · Word Sense Disambiguation

ÚFAL-Oslo at MRP 2019: Garage Sale Semantic Parsing

no code implementations CONLL 2019 Kira Droganova, Andrey Kutuzov, Nikita Mediankin, Daniel Zeman

This paper describes the ÚFAL-Oslo system submission to the shared task on Cross-Framework Meaning Representation Parsing (MRP, Oepen et al. 2019).

Semantic Parsing

One-to-X analogical reasoning on word embeddings: a case for diachronic armed conflict prediction from news texts

1 code implementation WS 2019 Andrey Kutuzov, Erik Velldal, Lilja Øvrelid

We extend the well-known word analogy task to a one-to-X formulation, including one-to-none cases, when no correct answer exists.

Word Embeddings
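The one-to-X formulation above generalises the standard vector-offset analogy query. A minimal sketch in plain NumPy (the toy vectors, vocabulary, and similarity threshold are illustrative assumptions, not the paper's actual setup): thresholding the cosine similarity lets a query return several answers, or none at all.

```python
import numpy as np

# Toy embedding table; in practice these would come from a trained model
# (all vectors here are made up for illustration).
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "man":   np.array([0.5, 0.9, 0.0]),
    "woman": np.array([0.5, 0.1, 0.0]),
    "queen": np.array([0.9, 0.0, 0.1]),
}

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def analogy(a, b, a2, threshold=0.0, topn=5):
    """Candidates x such that a : b :: a2 : x, via the offset b - a + a2.
    With a similarity threshold, zero answers ('one-to-none') or several
    answers ('one-to-X') are possible, instead of always exactly one."""
    target = emb[b] - emb[a] + emb[a2]
    scored = [(w, cosine(target, v)) for w, v in emb.items()
              if w not in (a, b, a2)]
    scored.sort(key=lambda p: p[1], reverse=True)
    return [(w, s) for w, s in scored[:topn] if s >= threshold]

print(analogy("man", "king", "woman"))
```

Raising the threshold (or querying an unrelated triple) empties the candidate list, which is exactly the one-to-none case.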

Diachronic word embeddings and semantic shifts: a survey

no code implementations COLING 2018 Andrey Kutuzov, Lilja Øvrelid, Terrence Szymanski, Erik Velldal

Recent years have witnessed a surge of publications aimed at tracing temporal changes in lexical semantics using distributional methods, particularly prediction-based word embedding models.

Diachronic Word Embeddings · Natural Language Processing +1

Russian word sense induction by clustering averaged word embeddings

1 code implementation 6 May 2018 Andrey Kutuzov

The paper reports our participation in the shared task on word sense induction and disambiguation for the Russian language (RUSSE-2018).

Word Embeddings · Word Sense Induction
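The averaging-and-clustering idea can be sketched in a few lines. Everything below (the toy word vectors, the contexts of "bank", the tiny 2-means routine with deterministic initialisation) is an illustrative assumption, not the RUSSE-2018 system itself:

```python
import numpy as np

# Toy static word vectors (made up for illustration, not a real model).
vec = {
    "river": np.array([1.0, 0.0]),
    "water": np.array([0.9, 0.1]),
    "money": np.array([0.0, 1.0]),
    "loan":  np.array([0.1, 0.9]),
}

# Each occurrence of the ambiguous word "bank" is represented by the
# average of its context-word vectors.
contexts = [
    ["river", "water"],
    ["money", "loan"],
    ["water", "river"],
    ["loan", "money"],
]
X = np.array([np.mean([vec[w] for w in ctx], axis=0) for ctx in contexts])

def kmeans(X, k=2, iters=10):
    """Minimal k-means; deterministic init (first k points) for this toy."""
    centers = X[:k].copy()
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels

labels = kmeans(X)
print(labels)  # occurrences 0 and 2 share a sense, 1 and 3 the other
```

Each resulting cluster is treated as one induced sense of the target word.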

Size vs. Structure in Training Corpora for Word Embedding Models: Araneum Russicum Maximum and Russian National Corpus

1 code implementation 19 Jan 2018 Andrey Kutuzov, Maria Kunilovskaya

Aside from the already known fact that the RNC is generally a better training corpus than web corpora, we enumerate and explain fine-grained differences in how the models handle the semantic similarity task, which parts of the evaluation set are difficult for particular models, and why.

Semantic Similarity · Semantic Textual Similarity

Tracing armed conflicts with diachronic word embedding models

no code implementations WS 2017 Andrey Kutuzov, Erik Velldal, Lilja Øvrelid

Recent studies have shown that word embedding models can be used to trace time-related (diachronic) semantic shifts in particular words.

Word Embeddings
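One common way to quantify such a shift, sketched here with made-up numbers, is the cosine distance between a word's vectors in two time-specific embedding models that have been aligned to a shared space (e.g. via orthogonal Procrustes). The vectors and the alignment assumption below are purely illustrative:

```python
import numpy as np

# Toy vectors for the same word in two time-period models, assumed to be
# already aligned to a common space (illustrative values only).
v_1990 = np.array([0.9, 0.1, 0.0])
v_2010 = np.array([0.2, 0.9, 0.1])

def cosine_change(u, v):
    """Semantic change score: cosine distance between a word's vectors
    from two diachronic models (0 = stable, up to 2 = maximal shift)."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return 1.0 - float(cos)

print(cosine_change(v_1990, v_2010))
```

Ranking a vocabulary by this score surfaces the words whose usage changed most between the two periods.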

Redefining Context Windows for Word Embedding Models: An Experimental Study

no code implementations WS 2017 Pierre Lison, Andrey Kutuzov

Distributional semantic models learn vector representations of words through the contexts they occur in.
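For concreteness, the baseline notion of "context" that such experiments vary is the linear, symmetric window around each token. A minimal sketch (the function name and window size are just for illustration; dependency-based or weighted windows redefine which pairs the model trains on):

```python
def context_windows(tokens, window=2):
    """Return (target, context_words) pairs using a symmetric linear
    window of the given size, truncated at sentence boundaries."""
    pairs = []
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        pairs.append((target, left + right))
    return pairs

sent = "distributional models learn from context".split()
for target, ctx in context_windows(sent, window=2):
    print(target, ctx)
```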

Building Web-Interfaces for Vector Semantic Models with the WebVectors Toolkit

no code implementations EACL 2017 Andrey Kutuzov, Elizaveta Kuzmenko

In this demo we present WebVectors, a free and open-source toolkit that helps deploy web services demonstrating and visualizing distributional semantic models (widely known as word embeddings).

Machine Translation · Named Entity Recognition +2

Exploration of register-dependent lexical semantics using word embeddings

1 code implementation WS 2016 Andrey Kutuzov, Elizaveta Kuzmenko, Anna Marakasova

We present an approach to detecting differences in lexical semantics across English language registers, using word embedding models from the distributional semantics paradigm.

General Classification · Word Embeddings

Redefining part-of-speech classes with distributional semantic models

no code implementations CONLL 2016 Andrey Kutuzov, Erik Velldal, Lilja Øvrelid

This paper studies how word embeddings trained on the British National Corpus interact with part of speech boundaries.

POS · TAG +1

Neural Embedding Language Models in Semantic Clustering of Web Search Results

no code implementations LREC 2016 Andrey Kutuzov, Elizaveta Kuzmenko

In this paper, we present a new approach to semantic clustering of the results of ambiguous search queries.

Clustering Comparable Corpora of Russian and Ukrainian Academic Texts: Word Embeddings and Semantic Fingerprints

no code implementations 18 Apr 2016 Andrey Kutuzov, Mikhail Kopotev, Tatyana Sviridenko, Lyubov Ivanova

We present our experience in applying distributional semantics (neural word embeddings) to the problem of representing and clustering documents in a bilingual comparable corpus.

Translation · Word Embeddings

Texts in, meaning out: neural language models in semantic similarity task for Russian

no code implementations 30 Apr 2015 Andrey Kutuzov, Igor Andreev

Distributed vector representations for natural language vocabulary get a lot of attention in contemporary computational linguistics.

Semantic Similarity · Semantic Textual Similarity
