Search Results for author: Nikolay Arefyev

Found 27 papers, 7 papers with code

NB-MLM: Efficient Domain Adaptation of Masked Language Models for Sentiment Analysis

1 code implementation EMNLP 2021 Nikolay Arefyev, Dmitrii Kharchev, Artem Shelmanov

While Masked Language Models (MLM) are pre-trained on massive datasets, the additional training with the MLM objective on domain or task-specific data before fine-tuning for the final task is known to improve the final performance.

Domain Adaptation Sentiment Analysis

GlossReader at LSCDiscovery: Train to Select a Proper Gloss in English – Discover Lexical Semantic Change in Spanish

no code implementations LChange (ACL) 2022 Maxim Rachinskiy, Nikolay Arefyev

In order to conclude if there are any differences between senses of a particular word in two corpora, a human annotator or a system shall analyze many examples containing this word from both corpora.

Change Detection XLM-R

Enriching Word Usage Graphs with Cluster Definitions

1 code implementation 26 Mar 2024 Mariia Fedorova, Andrey Kutuzov, Nikolay Arefyev, Dominik Schlechtweg

We present a dataset of word usage graphs (WUGs), where the existing WUGs for multiple languages are enriched with cluster labels functioning as sense definitions.

A New Massive Multilingual Dataset for High-Performance Language Technologies

no code implementations 20 Mar 2024 Ona de Gibert, Graeme Nail, Nikolay Arefyev, Marta Bañón, Jelmer Van der Linde, Shaoxiong Ji, Jaume Zaragoza-Bernabeu, Mikko Aulamo, Gema Ramírez-Sánchez, Andrey Kutuzov, Sampo Pyysalo, Stephan Oepen, Jörg Tiedemann

We present the HPLT (High Performance Language Technologies) language resources, a new massive multilingual dataset including both monolingual and bilingual corpora extracted from CommonCrawl and previously unused web crawls from the Internet Archive.

Language Modelling Machine Translation +2

BOS at LSCDiscovery: Lexical Substitution for Interpretable Lexical Semantic Change Detection

no code implementations 7 Jun 2022 Artem Kudisov, Nikolay Arefyev

We propose a solution for the LSCDiscovery shared task on Lexical Semantic Change Detection in Spanish.

Change Detection

Always Keep your Target in Mind: Studying Semantics and Improving Performance of Neural Lexical Substitution

1 code implementation COLING 2020 Nikolay Arefyev, Boris Sheludko, Alexander Podolskiy, Alexander Panchenko

Lexical substitution, i.e., generation of plausible words that can replace a particular target word in a given context, is an extremely powerful technology that can be used as a backbone of various NLP applications, including word sense induction and disambiguation, lexical relation extraction, data augmentation, etc.

Data Augmentation Relation Extraction +1

The Document Vectors Using Cosine Similarity Revisited

1 code implementation insights (ACL) 2022 Zhang Bingyu, Nikolay Arefyev

The results show that while RoBERTa has a clear advantage for larger training sets, the DV-ngrams-cosine performs better than RoBERTa when the labelled training set is very small (10 or 20 documents).

Sentiment Analysis

GlossReader at SemEval-2021 Task 2: Reading Definitions Improves Contextualized Word Embeddings

no code implementations SEMEVAL 2021 Maxim Rachinskiy, Nikolay Arefyev

To verify this hypothesis, we developed a solution for the Multilingual and Cross-lingual Word-in-Context (MCL-WiC) task that does not use any of the shared task data or other WiC data for training.

Task 2 Word Embeddings +3

LIORI at SemEval-2021 Task 8: Ask Transformer for measurements

no code implementations SEMEVAL 2021 Adis Davletov, Denis Gordeev, Nikolay Arefyev, Emil Davletov

This work describes our approach to the subtasks of SemEval-2021 Task 8: MeasEval: Counts and Measurements, which took the official first place in the competition.

Multi-Task Learning Question Answering

Combining Neural Language Models for WordSense Induction

no code implementations 23 Jun 2020 Nikolay Arefyev, Boris Sheludko, Tatiana Aleksashina

Word sense induction (WSI) is the problem of grouping occurrences of an ambiguous word according to the expressed sense of this word.

Word Sense Induction

A Comparative Study of Lexical Substitution Approaches based on Neural Language Models

no code implementations 29 May 2020 Nikolay Arefyev, Boris Sheludko, Alexander Podolskiy, Alexander Panchenko

Lexical substitution in context is an extremely powerful technology that can be used as a backbone of various NLP applications, such as word sense induction, lexical relation extraction, data augmentation, etc.

Data Augmentation Relation Extraction +1

Cross-lingual Named Entity List Search via Transliteration

no code implementations LREC 2020 Aleksandr Khakhmovich, Svetlana Pavlova, Kira Kirillova, Nikolay Arefyev, Ekaterina Savilova

Out-of-vocabulary words are still a challenge in cross-lingual Natural Language Processing tasks, for which transliteration from source to target language or script is one of the solutions.

Transliteration

Combining Lexical Substitutes in Neural Word Sense Induction

no code implementations RANLP 2019 Nikolay Arefyev, Boris Sheludko, Alexander Panchenko

Word Sense Induction (WSI) is the task of grouping occurrences of an ambiguous word according to their meaning.

Clustering Word Sense Induction

HHMM at SemEval-2019 Task 2: Unsupervised Frame Induction using Contextualized Word Embeddings

1 code implementation SEMEVAL 2019 Saba Anwar, Dmitry Ustalov, Nikolay Arefyev, Simone Paolo Ponzetto, Chris Biemann, Alexander Panchenko

We present our system for semantic frame induction that showed the best performance in Subtask B.1 and finished as the runner-up in Subtask A of the SemEval 2019 Task 2 on unsupervised semantic frame induction (QasemiZadeh et al., 2019).

Clustering Task 2 +1

How much does a word weigh? Weighting word embeddings for word sense induction

no code implementations 23 May 2018 Nikolay Arefyev, Pavel Ermolaev, Alexander Panchenko

The paper describes our participation in the first shared task on word sense induction and disambiguation for the Russian language RUSSE'2018 (Panchenko et al., 2018).

Clustering Machine Translation +3
